Senior Data Engineer

Aculocity, LLC

Durbanville

ZAR 200 000 - 300 000

Job description

Aculocity Data Team: Senior Data Engineer, Job description

Premise:

The Senior Data Engineer is a lead resource in building and developing Aculocity's data capabilities.

As Senior Data Engineer, you will play a pivotal role in our agile Data Engineering team, leading the design, development, and optimization of complex data infrastructure and pipelines. Your expertise will be crucial in delivering high-impact reporting, analytics, and machine learning solutions that drive business success. You will leverage your extensive experience to not only build and maintain critical data systems but also to mentor and guide more junior engineers, ensuring the seamless translation of intricate business requirements into robust and scalable data solutions.

Job Title: Senior Data Engineer

Reports To: Data Engineering Manager

Data Engineer Job Responsibilities:

Serve as the subject matter expert for data and systems.
Develop and maintain scalable data pipelines and build out new API integrations to support continuing increases in data volume and complexity.
Collaborate with analytics and business teams to improve data models that feed business intelligence tools, increasing data accessibility and fostering data-driven decision making across the organization.
Implement processes and systems to monitor data quality, ensuring production data is always accurate and available for key stakeholders and business processes that depend on it.
Develop end-to-end ML pipelines encompassing the ML lifecycle from data ingestion, data transformation, model training, model validation, model serving, and model evaluation over time.
Collaborate closely with AI scientists to accelerate deployments of ML algorithms to production.
Set up CI/CD/CT pipelines and a model repository for ML algorithms.
Deploy models as a service.
Contribute to the engineering wiki and document work.
Perform the data analysis required to troubleshoot data-related issues and assist in their resolution.
Work closely with cross-functional teams of frontend and backend engineers, product managers, and analysts to enhance data models and support advanced BI and analytics.
Define company data assets (data models) and the ETL jobs that populate them.
Design data integrations and data quality framework.
Design and evaluate open-source and vendor tools for data lineage.
Work closely with all business units and engineering teams to develop a strategy for the long-term data platform architecture.
Mentor junior data engineers, lead code reviews, and promote best practices and skill development.

Data Engineer Qualifications / Skills:

Knowledge of best practices and IT operations in an always-up, always-available service.
Ability to establish and promote best practices for data pipeline and model development.
Experience with or knowledge of Agile Software Development methodologies.
Excellent problem-solving and troubleshooting skills, creativity, and attention to detail.
Process-oriented with great documentation skills.
Excellent oral and written communication skills with a keen sense of customer service.
Eagerness to learn and upskill to stay at the forefront of technology offerings in the market.
Understanding of ML algorithms, and experience creating and executing efficient MLOps pipelines and tuning ML models.
Team player mindset with an enthusiasm for collaboration.

Education, Experience, and Licensing Requirements:

Must Have:

BSc or MSc degree in Computer Science or a related technical field.
5+ years of Python or R development experience.
5+ years of MS SQL experience (PostgreSQL experience is a plus).
5+ years of experience with Warehouse Architecture, schema design and dimensional data modeling.
5+ years of experience in Data Analytics and Business Intelligence tools such as Power BI.
Ability to manage and communicate data warehouse plans to internal clients.
Experience designing, building, and maintaining data processing systems on multiple platforms, both cloud (Azure, AWS; MS Fabric is a plus) and on-premises (MS SQL Server, SSIS).
Experience in ML Model deployment, ML frameworks and libraries.
Solid experience with Apache Spark.
Experience debugging and reasoning about production issues is desirable.
Experience presenting demos to, and training, technical, non-technical, and analytical resources.

Advantages to have:

Experience with data streaming, e.g. Kafka and/or AWS Kinesis.
IoT device and systems integration.