Project Overview
The migration process included ingesting, transforming, and storing data. The workflow was divided into three main layers: Data Ingestion Layer (DIL), Data Transformation Layer (DTL), and Business Intelligence Layer (BIL).
Key Highlights:
- Technology Stack: GCP Data Fusion, Dataprep, BigQuery, Cloud Functions, Power BI, Python.
- Optimization: Photon-enabled jobs reduced runtimes; SQL-based transformations replaced complex Wrangler transformations.
- Automation: Inbound triggers and hierarchical run plans automated the data pipelines, so they ran without manual intervention.
- Validation: Comprehensive checks (row counts, column formats, sums) were automated with dynamic views and stored procedures in BigQuery.
- Results: Reduced cluster runtimes, delivered daily reports in Power BI, and built a scalable architecture with potential for advanced analytics integration.
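
To illustrate the kind of validation listed above (row counts, column formats, sums), here is a minimal Python sketch. It is not the project's implementation, which used dynamic views and stored procedures in BigQuery; it only shows the same three checks on small in-memory extracts. The function name validate_migration, the columns order_id and amount, and the format pattern are all hypothetical.

```python
import re
from decimal import Decimal


def validate_migration(source_rows, target_rows, sum_columns=(), format_rules=None):
    """Compare a source extract with a target extract.

    Returns a list of failure messages; an empty list means every check passed.
    All names and rules here are illustrative assumptions, not the project's own.
    """
    failures = []
    format_rules = format_rules or {}

    # Check 1: row counts must match between source and target.
    if len(source_rows) != len(target_rows):
        failures.append(
            f"row count: source={len(source_rows)} target={len(target_rows)}"
        )

    # Check 2: column sums must match. Decimal avoids float rounding noise.
    for col in sum_columns:
        src_total = sum(Decimal(str(row[col])) for row in source_rows)
        tgt_total = sum(Decimal(str(row[col])) for row in target_rows)
        if src_total != tgt_total:
            failures.append(f"sum({col}): source={src_total} target={tgt_total}")

    # Check 3: every target value must fully match the expected format.
    for col, pattern in format_rules.items():
        regex = re.compile(pattern)
        bad = sum(1 for row in target_rows if not regex.fullmatch(str(row[col])))
        if bad:
            failures.append(f"format({col}): {bad} row(s) do not match {pattern}")

    return failures


# Hypothetical sample data: the second target row has a malformed id
# and a slightly different amount.
source = [
    {"order_id": "A-001", "amount": 10.50},
    {"order_id": "A-002", "amount": 4.25},
]
target = [
    {"order_id": "A-001", "amount": 10.50},
    {"order_id": "A002", "amount": 4.20},
]

failures = validate_migration(
    source,
    target,
    sum_columns=["amount"],
    format_rules={"order_id": r"[A-Z]-\d{3}"},
)
for message in failures:
    print(message)
```

Returning a list of messages instead of raising on the first mismatch lets one run report every failed check, which suits a daily validation report better than a fail-fast approach.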