Design, build, and maintain scalable ETL (Extract, Transform, Load) pipelines from various sources, including structured and unstructured databases, using Airbyte and Airflow.
Optimize PostgreSQL and Google BigQuery to enhance data processing performance and reliability.
Maintain data flows from MongoDB to Google BigQuery.
Develop dashboards and reports in Apache Superset, working with other departments to support data visualization and decision-making.
Ensure data quality, security, and accessibility by collaborating with cross-functional teams.
Requirements:
Bachelor’s or Master’s degree in Computer Science, Data Engineering, Mathematics, Physics, or a related field. Equivalent experience may be considered in place of a degree.
Minimum 2 years of experience in data engineering or related fields.
Core Tech Stack:
PostgreSQL and Google BigQuery for database management.
Airbyte and Airflow for data pipeline automation.
Apache Superset for visualization and reporting.
Git-based version control.
Cloud Proficiency: Experience with Google Cloud Platform (GCP) for data processing and storage.
Programming Skills:
Advanced SQL for efficient data transformation.
Working knowledge of Python for scripting and automation.
Basic JavaScript for BigQuery Dataform functions where applicable.
Strong problem-solving skills with the ability to work independently.
Fluency in spoken English for effective collaboration across teams.