We are an international technology services company founded in 1983 and currently have over 2,000 employees in 5 countries: France, Spain, Romania, Portugal, and Luxembourg!
What are we looking for?
A Data Engineer to join a stable international project, based in Madrid.
Responsibilities:
Contribute to production support and maintenance of the DataHub: correct incidents and anomalies, resolve data-related issues, and implement functional and technical changes to keep production processes stable and the DataHub available to users with minimal disruption during the agreed uptime windows.
Modify existing code according to business requirements and continuously improve its performance and maintainability.
Ensure the performance and security of the data infrastructure and follow data engineering best practices.
Model data and develop efficient pipelines that enrich and transform large volumes of data with complex business rules, automate pipeline execution, and optimize data ingestion. Design and implement scalable, secure data processing pipelines using Scala, Spark, Hadoop, and cloud object storage (COS); a minimal sketch follows this list.
Integrate data from multiple sources and formats into the raw layer of the DataHub.
Implement data transformation and quality checks to ensure data consistency and accuracy, using languages such as Scala and SQL and tools such as Spark for transformation and enrichment operations.
Configure CI/CD pipelines to automate deployments, unit testing, and development management.
Write and run unit and validation tests to ensure the accuracy and integrity of the developed code (see the test sketch after this list).
Implement various orchestrators and scheduling processes to automate data pipeline execution (Airflow as a Service).
Migrate existing Hadoop infrastructure to cloud infrastructure on Kubernetes Engine, COS, Spark as a Service, and Airflow as a Service.
Write technical documentation (specifications, operational documents) to ensure knowledge capitalization.
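To give a flavour of the enrichment and transformation work described above, here is a minimal Spark/Scala sketch. The bucket paths, column names, and the currency-conversion business rule are illustrative assumptions, not details taken from the project.

```scala
import org.apache.spark.sql.{SparkSession, functions => F}

object EnrichTransactions {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("datahub-enrichment-sketch")
      .getOrCreate()

    // Hypothetical raw-layer inputs on object storage (paths and schemas are assumptions).
    val rawTransactions = spark.read.parquet("s3a://datahub-raw/transactions/")
    val referenceRates  = spark.read.parquet("s3a://datahub-ref/exchange_rates/")

    // Example business rule: convert amounts to EUR and flag large transactions.
    val enriched = rawTransactions
      .join(referenceRates, Seq("currency"), "left")
      .withColumn("amount_eur", F.col("amount") * F.col("eur_rate"))
      .withColumn("is_large", F.col("amount_eur") > 10000)

    // Write to a curated layer, partitioned for downstream consumers.
    enriched.write
      .mode("overwrite")
      .partitionBy("business_date")
      .parquet("s3a://datahub-curated/transactions_enriched/")

    spark.stop()
  }
}
```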
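And a minimal sketch of the kind of unit test mentioned above, using ScalaTest with a local SparkSession; the convertToEur function and the column names are hypothetical, chosen only to illustrate testing a Spark transformation.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession, functions => F}
import org.scalatest.funsuite.AnyFunSuite

// Unit-test sketch for a Spark transformation; ScalaTest and convertToEur
// are illustrative assumptions, not taken from the posting.
class EnrichmentSpec extends AnyFunSuite {

  private val spark = SparkSession.builder()
    .master("local[2]")
    .appName("enrichment-test")
    .getOrCreate()

  import spark.implicits._

  // Hypothetical transformation under test.
  def convertToEur(df: DataFrame): DataFrame =
    df.withColumn("amount_eur", F.col("amount") * F.col("eur_rate"))

  test("amount_eur is computed from amount and rate") {
    val input  = Seq((100.0, 1.5), (50.0, 0.5)).toDF("amount", "eur_rate")
    val result = convertToEur(input).select("amount_eur").as[Double].collect()

    assert(result.sorted.sameElements(Array(25.0, 150.0)))
  }
}
```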
Requirements:
Spark with Scala
CI/CD (GitLab, Jenkins…)
HDFS and structured databases (SQL)
Apache Airflow
Stream processing (Kafka, event streams…)
Object storage / COS (S3)
Shell scripting
Kubernetes
Elasticsearch and Kibana
HVault
Dremio as a data virtualization tool
Dataiku
Spoken English (B2 level)
Nice to have:
Elasticsearch and Kibana
Stream processing (Kafka, event streams…)
Vault
Dataiku
Work Model:
Hybrid.
Flexible hours, Monday to Friday.
We offer:
Continuous training
Career plan tailored to employee preferences
Progression within the company
Flexible working hours
Hybrid work model
Language training (English, French, Spanish).
Salary: €45,000
Would you like to join our team?
If you have experience in data and are looking to grow technically and professionally, don't hesitate to apply for this position. Contact us!