Develop a Modern Azure Data Platform: Design, build, and optimize end-to-end data solutions with Azure Data Factory, Azure Data Lake Storage, Databricks, and Azure Synapse Analytics.
Create Data Pipelines: Design and maintain scalable ETL/ELT processes with Spark and its Python API, PySpark, focusing on data quality and reliability.
Real-Time Data Capture: Implement Change Data Capture pipelines that pick up source-system changes in real time and handle Slowly Changing Dimensions to keep historical data accurate.
Integrate New Technologies: Stay updated on the latest trends in Cloud Data Engineering, Big Data, and Analytics, integrating relevant tools for better performance, cost-efficiency, and scalability.
Efficient Reporting with Power BI: Collaborate with BI teams to optimally structure large data models and develop impactful dashboards.
Data Governance and Compliance: Ensure compliance with data protection regulations and internal security policies, establishing best practices for metadata management, data lineage, and access controls.
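To illustrate the Slowly Changing Dimension handling mentioned above: a Type 2 dimension closes the current row when an attribute changes and opens a new one, so history is preserved. The sketch below shows the core logic in plain Python; in practice this would be a PySpark/Delta merge, and the field names (`key`, `value`, `valid_from`, `valid_to`) are illustrative, not a prescribed schema.

```python
from datetime import date

def apply_scd2(dimension, incoming, today=None):
    """Merge incoming records into a Type 2 dimension (illustrative sketch).

    dimension: list of dicts with 'key', 'value', 'valid_from', 'valid_to'
               (valid_to=None marks the current row).
    incoming:  list of dicts with 'key' and 'value'.
    """
    today = today or date.today()
    current = {row["key"]: row for row in dimension if row["valid_to"] is None}
    result = list(dimension)
    for rec in incoming:
        old = current.get(rec["key"])
        if old is None:
            # Unknown key: insert a fresh current row.
            result.append({"key": rec["key"], "value": rec["value"],
                           "valid_from": today, "valid_to": None})
        elif old["value"] != rec["value"]:
            # Changed attribute: close the old row, open a new current row.
            old["valid_to"] = today
            result.append({"key": rec["key"], "value": rec["value"],
                           "valid_from": today, "valid_to": None})
        # Unchanged records are left untouched.
    return result
```

A production pipeline would express the same close-and-insert pattern as a `MERGE` on a Delta table rather than looping over rows.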
What You Bring:
Azure Expertise: Several years of practical experience with Azure Data Factory, Azure Data Lake Storage, Databricks, and Azure Synapse Analytics.
PySpark and Spark: Comprehensive knowledge of developing distributed, high-performance data processing solutions.
ETL Processes: Deep understanding of ETL/ELT concepts and SCD/CDC methods.
SQL Skills: Proficient in complex SQL queries and their optimization.
Data Formats: Practical experience with Parquet or Avro.
Power BI: Experience working with large data models and creating Power BI visualizations.
Analytical Skills: Detail-oriented, solution-focused mindset with strong problem-solving abilities.
Language Skills: Fluency in English is required; knowledge of German is an advantage.