AVP-Data Integration Engineering
Location: ID
Level: Managerial
Employment Status: Permanent
Department: Group Digital Engineering and Transformation
Role Purpose
Role & Responsibilities: Data Warehouse and Business Intelligence Engineering
To lead, oversee, and guide Data Integration, ETL, and Data Pipeline Engineering activities for end-to-end business solutions, ensuring high-performance, scalable, and reliable data movement across on-premise, cloud, and hybrid architectures using batch, API, streaming, or microservices approaches. The role is critical in automating, optimizing, and modernizing data integration workflows while ensuring data quality, governance, and observability.
Strategic Leadership & Governance
- Enterprise Data Integration Strategy: Drive end-to-end data pipeline architecture across batch, real-time streaming, API-based, and cloud-native integrations.
- Multi-Cloud & Hybrid Data Architecture: Design scalable, flexible, and fault-tolerant data integration strategies spanning on-prem, Hadoop, and GCP (BigQuery, Cloud Storage, Pub/Sub, Dataflow, Dataproc).
- Vendor & Stakeholder Management: Collaborate with Data Engineers, BI Developers, Cloud Engineers, and Vendor Partners to ensure SLA compliance and optimal data flow management.
Big Data, Hadoop & NoSQL Integration
- Hadoop Ecosystem Mastery: Deep expertise in HDFS, Hive, Spark, Impala, HBase, Kafka, Oozie, and Sqoop.
- Optimized Data Processing: Implement distributed computing models for massive-scale ETL & analytics workloads.
- Data Lake & Lakehouse Optimization: Architect data ingestion pipelines for structured, semi-structured, and unstructured data into Delta Lake, Iceberg, or BigQuery (an illustrative ingestion sketch follows this list).
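As an illustration of the ingestion work this covers, below is a minimal PySpark sketch that lands raw JSON events as date-partitioned Parquet files in a lake zone. The bucket paths, field names, and job name are hypothetical placeholders, not part of the posting's actual environment.

```python
from pyspark.sql import SparkSession
from pyspark.sql.functions import to_date, col

spark = SparkSession.builder.appName("raw-event-ingestion").getOrCreate()

# Read semi-structured JSON dropped by upstream systems (path is a placeholder).
events = spark.read.json("gs://example-landing-zone/events/2024-06-01/")

# Derive a partition column and write columnar files into the curated lake zone.
(events
    .withColumn("event_date", to_date(col("event_ts")))
    .write
    .mode("append")
    .partitionBy("event_date")
    .parquet("gs://example-data-lake/curated/events/"))

spark.stop()
```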
API-Based Data Integration
- Microservices & API Integration: Develop high-performance API-based ETL solutions using REST, gRPC, GraphQL, and WebSockets for real-time data exchange (see the extraction sketch after this list).
- HBase & NoSQL API Integration: Enable low-latency API access to HBase, Cassandra, and DynamoDB for high-throughput operational analytics.
- Data Federation & Virtualization: Implement Federated Queries and Data Virtualization for seamless cross-platform data access.
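To illustrate the API-based extraction pattern referenced above, here is a minimal sketch that pulls paginated records from a REST endpoint for downstream loading. The endpoint URL, credential, and pagination parameters are assumptions made for the example only.

```python
import requests

BASE_URL = "https://api.example.com/v1/usage-records"  # hypothetical endpoint
HEADERS = {"Authorization": "Bearer <token>"}          # placeholder credential

def fetch_all(page_size: int = 500) -> list[dict]:
    """Walk a paginated REST endpoint and return all records."""
    records, page = [], 1
    while True:
        resp = requests.get(
            BASE_URL,
            headers=HEADERS,
            params={"page": page, "page_size": page_size},
            timeout=30,
        )
        resp.raise_for_status()
        batch = resp.json().get("results", [])
        if not batch:
            break
        records.extend(batch)
        page += 1
    return records

if __name__ == "__main__":
    rows = fetch_all()
    print(f"Fetched {len(rows)} records for downstream loading")
```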
Real-Time Streaming & Event-Driven Architecture
- Enterprise Streaming Pipelines: Design & optimize Kafka, Flink, Spark Streaming, and Pub/Sub for real-time data ingestion and transformation (a streaming sketch follows this list).
- Event-Driven ETL Pipelines: Enable Change Data Capture (CDC) and event-based data processing for real-time decision-making.
- Kafka Integration: Develop high-throughput, scalable Kafka pipelines with Kafka Connect, Schema Registry, and KSQL.
- HBase Streaming: Leverage HBase + Kafka for low-latency, high-volume event ingestion & querying.
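A minimal Spark Structured Streaming sketch of the Kafka ingestion pattern described above. The broker address, topic name, payload schema, and sink paths are assumptions, and the job assumes the Spark Kafka connector package is available on the cluster.

```python
from pyspark.sql import SparkSession
from pyspark.sql.functions import from_json, col
from pyspark.sql.types import StructType, StructField, StringType, LongType

spark = SparkSession.builder.appName("cdr-event-stream").getOrCreate()

# Assumed shape of the event payload (placeholder fields).
schema = StructType([
    StructField("msisdn", StringType()),
    StructField("event_type", StringType()),
    StructField("bytes_used", LongType()),
    StructField("event_ts", StringType()),
])

# Consume the Kafka topic (broker and topic names are placeholders).
raw = (spark.readStream.format("kafka")
       .option("kafka.bootstrap.servers", "broker-1:9092")
       .option("subscribe", "cdr-events")
       .option("startingOffsets", "latest")
       .load())

# Parse the JSON value and write micro-batches to a lake path with checkpointing.
parsed = (raw.selectExpr("CAST(value AS STRING) AS json")
             .select(from_json(col("json"), schema).alias("e"))
             .select("e.*"))

(parsed.writeStream
       .format("parquet")
       .option("path", "gs://example-data-lake/streaming/cdr/")
       .option("checkpointLocation", "gs://example-data-lake/checkpoints/cdr/")
       .trigger(processingTime="1 minute")
       .start()
       .awaitTermination())
```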
Cloud Data Engineering & GCP Capabilities
- BigQuery Optimization: Leverage partitioning, clustering, and materialized views for cost-effective and high-speed queries (see the table-definition sketch after this list).
- ETL & Orchestration: Develop robust ETL/ELT pipelines using Cloud Data Fusion, Apache Beam, Dataflow, and Airflow.
- Hybrid Cloud & On-Prem Integration: Seamlessly integrate Hadoop-based Big Data systems with GCP, on-premises databases, and legacy BI tools.
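As a concrete example of the partitioning and clustering mentioned in the first bullet, the sketch below creates a day-partitioned, clustered BigQuery table with the Python client. The project, dataset, and column names are hypothetical.

```python
from google.cloud import bigquery

client = bigquery.Client(project="example-project")  # placeholder project

schema = [
    bigquery.SchemaField("event_ts", "TIMESTAMP"),
    bigquery.SchemaField("msisdn", "STRING"),
    bigquery.SchemaField("cell_id", "STRING"),
    bigquery.SchemaField("bytes_used", "INTEGER"),
]

table = bigquery.Table("example-project.analytics.cdr_events", schema=schema)

# Partition by event day and cluster by the columns most often filtered on,
# so queries prune partitions and scan fewer bytes.
table.time_partitioning = bigquery.TimePartitioning(
    type_=bigquery.TimePartitioningType.DAY,
    field="event_ts",
)
table.clustering_fields = ["msisdn", "cell_id"]

client.create_table(table, exists_ok=True)
```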
BI DevOps, Automation & Innovation
- BI DevOps & Continuous Delivery: Implement CI/CD pipelines to accelerate BI feature releases, ETL deployments, and dashboard updates.
- Data Observability & Quality Monitoring: Ensure end-to-end monitoring of data pipelines, anomaly detection, and real-time alerting (a freshness-check sketch follows this list).
- AI/ML Integration for BI: Apply predictive analytics and AI-driven insights to enhance business intelligence and reporting.
- Bottleneck Identification & Resolution: Proactively identify and eliminate performance issues in Hadoop clusters, ETL pipelines, and BI reporting layers.
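A minimal sketch of the data-observability idea: a freshness check that queries the latest load timestamp in a warehouse table and alerts when it lags beyond a threshold. The table name, SLA threshold, and alerting hook (plain logging here) are assumptions; a production setup would typically sit behind a dedicated observability tool.

```python
import logging
from datetime import datetime, timedelta, timezone

from google.cloud import bigquery

FRESHNESS_SLA = timedelta(hours=2)                    # assumed SLA
TABLE = "example-project.analytics.cdr_events"        # placeholder table

def check_freshness(client: bigquery.Client) -> None:
    """Alert (here: log an error) when the newest row is older than the SLA."""
    query = f"SELECT MAX(event_ts) AS latest FROM `{TABLE}`"
    row = next(iter(client.query(query).result()))
    if row.latest is None:
        logging.error("No rows found in %s", TABLE)
        return
    lag = datetime.now(timezone.utc) - row.latest
    if lag > FRESHNESS_SLA:
        logging.error("Freshness SLA breached on %s: lag=%s", TABLE, lag)
    else:
        logging.info("Freshness OK on %s: lag=%s", TABLE, lag)

if __name__ == "__main__":
    check_freshness(bigquery.Client(project="example-project"))
```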
Minimum Requirements
Qualification:
Minimum of a university degree (S1), preferably in Information Technology, Computer Science, Electrical Engineering, Telecommunications, or Mathematics/Statistics.
Experience:
At least 5 years of experience across the full life cycle of Data Integration, Microservices, and Data Warehouse solutions. Experience in the telecommunications industry is preferred. Experience managing a team is an advantage.
Skills:
- Very good analytical thinking and problem-solving skills for effectively identifying business problems, understanding stakeholders' needs, and assessing and formulating solutions.
- Very good communication skills in Indonesian and English.
- Very good skills in technical writing and reporting.
- Very good presentation and persuasion skills.
- Very good collaboration skills with many stakeholders.
- Very good knowledge in Data Warehousing, Big Data and BI architecture, technology, design, development and operation.
- Good knowledge of telecommunication business in general.
- Experience and knowledge in processing CDRs from Telco systems, e.g., Charging and Billing, GGSN, MSC, VLR, SMSC, etc.
- Experience handling Data Integration project teams and developer teams for a minimum of 5 years.
- Experience working with near real-time data, huge data volumes, and unstructured data processing.
- Familiar and hands-on with the following technology stack:
- Programming: Python, Java, Scala, Go, Shell script, SQL (PL/pgSQL, T-SQL, BigQuery SQL), or other relevant scripting.
- Data Pipeline Orchestration: Apache Airflow, Cloud Composer, NiFi, etc.
- Big Data & Streaming: Kafka, Flink, Spark Streaming, HBase, Hive, Impala, Presto.
- Cloud Data Engineering: GCP (BigQuery, Dataflow, Dataproc, Pub/Sub, Cloud Storage).
- Monitoring & Observability: ELK Stack (Elasticsearch, Logstash, Kibana), Datadog, Prometheus, Grafana.
- Microservices & API Integration: REST, gRPC, GraphQL, WebSockets, OpenTelemetry.
- Data Governance & Quality: Great Expectations, dbt, Dataform, Monte Carlo.
- BI DevOps & Automation: Terraform, Kubernetes, GitOps, Cloud Build.
- Good knowledge of IT infrastructure, covering servers, storage, databases, backup systems, desktops, local/wide area networks, data centers, and disaster recovery.
- Good knowledge in Agile Development (Scrum), Business Process Framework (eTOM), Application Framework (TAM) and Information Framework (SID).