Experience: 5-8 Years
Summary: Experienced Informatica PowerCenter (PC) and Intelligent Data Management Cloud (IDMC) developer with 5+ years in data engineering, skilled in designing end-to-end data pipelines with PC/IDMC for migration and transformation across cloud platforms. The ideal candidate will have in-depth knowledge of Informatica PowerCenter and IDMC, a strong foundation in data warehousing concepts, and proficiency in Snowflake and SQL. A strong team player with agile experience, delivering timely, high-impact data solutions.
Technical Skills
Tools: Informatica Cloud Data Integration, Informatica PowerCenter
Data Warehousing: Snowflake, Data Lake
Programming: SQL, Python, Shell Scripting
Data Management: Storage management, quality monitoring, governance
Roles and Responsibilities
Design, develop, and optimize ETL workflows using Informatica PowerCenter (PC) and Informatica IDMC.
Manage data ingestion processes from diverse data sources such as Salesforce, Oracle databases, PostgreSQL, and MySQL.
Implement and maintain ETL processes and data pipelines to ensure efficient data extraction, transformation, and loading.
Utilize Snowflake as the data warehouse solution for managing large volumes of structured and unstructured data.
Maintain and optimize ETL jobs for performance and reliability, ensuring timely data availability for business users.
Support data migration, data integration, and data consolidation efforts.
Write and maintain basic Python scripts for data processing and automation tasks (a sketch follows this list).
Utilize Unix shell commands for data-related tasks and system management.
Troubleshoot and resolve ETL-related issues, ensuring data integrity and availability.
Ensure adherence to best practices for data governance and security.
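To make the Python and shell scripting items above concrete, here is a minimal sketch of the kind of cleanup step such a script might perform. The file names, column names, and source date format are assumptions for illustration, not details from an actual project; in practice this logic would often be chained or scheduled by shell wrappers.

```python
# Minimal data-cleanup sketch; file names, column names, and the source date
# format are hypothetical.
import csv
from datetime import datetime

def clean_extract(src_path: str, dst_path: str) -> int:
    """Deduplicate rows on customer_id and normalize signup_date to ISO format."""
    seen = set()
    rows_written = 0
    with open(src_path, newline="") as src, open(dst_path, "w", newline="") as dst:
        reader = csv.DictReader(src)
        writer = csv.DictWriter(dst, fieldnames=reader.fieldnames)
        writer.writeheader()
        for row in reader:
            key = row["customer_id"]
            if key in seen:
                continue                      # drop duplicate business keys
            seen.add(key)
            # Assume the source delivers dates as MM/DD/YYYY; standardize to YYYY-MM-DD.
            row["signup_date"] = datetime.strptime(
                row["signup_date"], "%m/%d/%Y"
            ).strftime("%Y-%m-%d")
            writer.writerow(row)
            rows_written += 1
    return rows_written

if __name__ == "__main__":
    print(clean_extract("customers_raw.csv", "customers_clean.csv"), "rows written")
```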
Professional Experience
Informatica Developer
Developed ELT processes using PC/IDMC to integrate data into Snowflake.
Implemented storage management for Azure Blob Storage and Snowflake, enhancing data security (illustrated in the sketch below).
Used basic Python and shell scripting for data handling and automation tasks.
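A hedged sketch of the Azure Blob staging step referenced above, using the azure-storage-blob SDK. The container name, blob path, and environment variable are placeholders; credentials are assumed to come from the environment rather than being hard-coded, in line with the data-security point.

```python
# Sketch of staging a cleaned extract to Azure Blob Storage before a Snowflake
# load; container name, blob path, and env var are hypothetical.
import os
from azure.storage.blob import BlobServiceClient

def stage_to_blob(local_path: str, blob_name: str, container: str = "etl-staging") -> None:
    # Connection string is read from the environment, never committed to code.
    service = BlobServiceClient.from_connection_string(
        os.environ["AZURE_STORAGE_CONNECTION_STRING"]
    )
    container_client = service.get_container_client(container)
    with open(local_path, "rb") as data:
        container_client.upload_blob(name=blob_name, data=data, overwrite=True)

if __name__ == "__main__":
    stage_to_blob("customers_clean.csv", "customers/2024-01-15/customers_clean.csv")
```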
Desired Candidate Profile
ETL Process Design and Development:
ETL Pipeline Creation: Designing and developing ETL (Extract, Transform, Load) workflows using Informatica PowerCenter, Informatica Cloud Data Integration, or Informatica Intelligent Cloud Services (IICS).
Data Transformation: Creating transformations to cleanse, aggregate, and manipulate data as it moves through the ETL pipeline to ensure data quality and consistency.
Data Loading: Ensuring that the transformed data is loaded into appropriate destinations such as data warehouses, data lakes, or databases.
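As one possible illustration of the loading step (and the push-down ELT style mentioned in the experience section), the sketch below uses the Snowflake Python connector's PUT + COPY INTO pattern followed by an in-database aggregate. Connection details, the table stage, and the table names (CUSTOMERS_STG, DAILY_SIGNUPS) are placeholders, not a prescribed design.

```python
# Load sketch: stage a local CSV to the table's internal stage, bulk-copy it in,
# then push an aggregation down into Snowflake. All object names are placeholders.
import os
import snowflake.connector

def load_to_snowflake(local_csv: str) -> None:
    conn = snowflake.connector.connect(
        account=os.environ["SNOWFLAKE_ACCOUNT"],
        user=os.environ["SNOWFLAKE_USER"],
        password=os.environ["SNOWFLAKE_PASSWORD"],
        warehouse="ETL_WH",
        database="ANALYTICS",
        schema="STAGING",
    )
    try:
        cur = conn.cursor()
        # Upload the file to the table stage, then bulk-copy it into the table.
        cur.execute(f"PUT file://{os.path.abspath(local_csv)} @%CUSTOMERS_STG AUTO_COMPRESS=TRUE")
        cur.execute(
            "COPY INTO CUSTOMERS_STG FROM @%CUSTOMERS_STG "
            "FILE_FORMAT = (TYPE = CSV FIELD_OPTIONALLY_ENCLOSED_BY = '\"' SKIP_HEADER = 1) "
            "ON_ERROR = 'ABORT_STATEMENT'"
        )
        # Example push-down transformation: aggregate inside Snowflake (ELT style).
        cur.execute(
            "INSERT INTO DAILY_SIGNUPS "
            "SELECT signup_date, COUNT(*) FROM CUSTOMERS_STG GROUP BY signup_date"
        )
        conn.commit()
    finally:
        conn.close()

if __name__ == "__main__":
    load_to_snowflake("customers_clean.csv")
```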
Data Integration:
Source System Integration: Integrating various data sources, such as relational databases, flat files, APIs, cloud platforms, and on-premise systems.
Data Migration: Moving data from legacy systems to modern data architectures using Informatica’s integration capabilities.
Real-time Data Integration: Implementing real-time data integration using Informatica’s features such as Change Data Capture (CDC) or Informatica Data Streaming for near-instantaneous updates.
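Informatica's CDC and streaming features are configured in the tool itself; the sketch below only illustrates the underlying idea with a simple last-modified watermark against a PostgreSQL source (psycopg2). Table, column, and connection details are hypothetical.

```python
# Simplified watermark-based incremental extract; an illustration of the idea
# behind change capture, not Informatica's CDC feature itself.
import json
import psycopg2

WATERMARK_FILE = "last_extracted_at.json"   # hypothetical local state store

def read_watermark() -> str:
    try:
        with open(WATERMARK_FILE) as f:
            return json.load(f)["last_extracted_at"]
    except FileNotFoundError:
        return "1970-01-01 00:00:00"         # first run: take everything

def extract_changes() -> list:
    watermark = read_watermark()
    conn = psycopg2.connect(host="source-db", dbname="crm", user="etl", password="...")
    try:
        with conn.cursor() as cur:
            cur.execute(
                "SELECT id, email, updated_at FROM customers "
                "WHERE updated_at > %s ORDER BY updated_at",
                (watermark,),
            )
            rows = cur.fetchall()
        if rows:
            # Persist the newest timestamp so the next run only picks up changes.
            new_watermark = rows[-1][2].isoformat(sep=" ")
            with open(WATERMARK_FILE, "w") as f:
                json.dump({"last_extracted_at": new_watermark}, f)
        return rows
    finally:
        conn.close()
```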
Data Quality and Cleansing:
Data Profiling: Analyzing and profiling source data to identify quality issues, missing values, and inconsistencies that need to be addressed before loading into the destination system.
Data Transformation and Cleansing: Ensuring that data is transformed and cleaned according to business rules, such as removing duplicates, standardizing formats, and applying business logic.
Data Validation: Validating that data in the destination is accurate and consistent with the source data and business requirements.
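A lightweight profiling and reconciliation sketch along these lines, using pandas; the column checks and the row-count comparison are illustrative and would normally be driven by the business rules mentioned above.

```python
# Profiling and reconciliation sketch; file and column names are illustrative.
import pandas as pd

def profile(df: pd.DataFrame) -> pd.DataFrame:
    """Per-column null and distinct counts, a first pass at spotting quality issues."""
    return pd.DataFrame({
        "nulls": df.isna().sum(),
        "distinct": df.nunique(),
        "pct_null": (df.isna().mean() * 100).round(2),
    })

def reconcile(source_count: int, target_count: int, tolerance: int = 0) -> None:
    """Fail loudly if the loaded row count drifts from the source row count."""
    if abs(source_count - target_count) > tolerance:
        raise ValueError(f"Row count mismatch: source={source_count}, target={target_count}")

if __name__ == "__main__":
    extract = pd.read_csv("customers_clean.csv")
    print(profile(extract))
    reconcile(source_count=len(extract), target_count=len(extract))  # placeholder target count
```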
Performance Optimization:
Job and Query Optimization: Optimizing ETL jobs and the queries they generate for performance, ensuring that processes run efficiently and within the required time windows.
Parallel Processing: Implementing parallel processing in Informatica to improve performance during large-scale data transfers and transformations.
Memory Management: Ensuring proper memory usage and optimizing workflows to minimize bottlenecks in the ETL process.
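In PowerCenter/IDMC, parallelism is normally achieved through session partitioning and pushdown optimization; the sketch below shows the general partition-and-parallelize pattern in plain Python purely as an illustration, with hypothetical month partitions and a stubbed per-partition load.

```python
# Generic parallel-load pattern; in PowerCenter/IDMC the equivalent is session
# partitioning, so treat this only as an illustration of the idea.
from concurrent.futures import ThreadPoolExecutor, as_completed

PARTITIONS = ["2024-01", "2024-02", "2024-03", "2024-04"]  # hypothetical month partitions

def load_partition(partition: str) -> int:
    # Placeholder for a per-partition extract/load; returns rows processed.
    print(f"loading partition {partition}")
    return 0

def load_all(max_workers: int = 4) -> int:
    total = 0
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(load_partition, p): p for p in PARTITIONS}
        for future in as_completed(futures):
            total += future.result()          # re-raises any per-partition failure
    return total

if __name__ == "__main__":
    print("rows processed:", load_all())
```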
Data Architecture and Design:
Data Modeling: Designing and developing logical and physical data models for efficient data storage, ensuring optimal schema design in data warehouses or data lakes.
Data Lineage: Implementing data lineage tracking in the ETL pipelines to ensure transparency and traceability of data flow from sources to destinations.
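Lineage is usually captured by the platform's own metadata and catalog tooling; as a minimal stand-in, the sketch below appends one lineage record per pipeline run to a local audit file. All field values and the file path are illustrative.

```python
# Minimal lineage-record sketch: each run appends one record describing where
# the data came from, what touched it, and where it landed.
import json
from dataclasses import dataclass, asdict
from datetime import datetime, timezone

@dataclass
class LineageRecord:
    source: str
    transformation: str
    target: str
    run_at: str

def record_lineage(source: str, transformation: str, target: str,
                   path: str = "lineage_log.jsonl") -> None:
    rec = LineageRecord(source, transformation, target,
                        datetime.now(timezone.utc).isoformat())
    with open(path, "a") as f:
        f.write(json.dumps(asdict(rec)) + "\n")

if __name__ == "__main__":
    record_lineage("postgres://crm.customers", "dedupe + date standardization",
                   "snowflake://ANALYTICS.STAGING.CUSTOMERS_STG")
```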
Collaboration with Stakeholders:
Working with Data Scientists and Analysts: Collaborating with data scientists, business analysts, and other stakeholders to ensure that the right data is extracted and transformed to support analytical use cases.
Requirements Gathering: Gathering requirements from business users and technical teams to ensure the ETL process meets their needs and supports various reporting or analytics initiatives.
Troubleshooting and Support:
Debugging ETL Jobs: Identifying and resolving issues with ETL jobs, such as data mismatches, failures, or performance problems.
Error Handling and Logging: Implementing robust error handling mechanisms in ETL processes to ensure smooth operations and quick recovery from failures.
System Monitoring: Monitoring the performance and health of ETL jobs and making adjustments as necessary.
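A sketch of the error-handling, logging, and retry wrapper a scheduled ETL step might use, with a non-zero exit code so external monitoring can alert on failure; the step name and retry settings are assumptions for illustration.

```python
# Error handling and logging sketch: structured logging, bounded retries,
# and a non-zero exit code so the scheduler's monitoring can flag failures.
import logging
import sys
import time

logging.basicConfig(level=logging.INFO,
                    format="%(asctime)s %(levelname)s %(name)s: %(message)s")
log = logging.getLogger("etl.customers")

def run_with_retry(step, retries: int = 3, backoff_seconds: int = 60):
    for attempt in range(1, retries + 1):
        try:
            return step()
        except Exception:
            log.exception("attempt %d/%d failed", attempt, retries)
            if attempt == retries:
                raise
            time.sleep(backoff_seconds)

def load_customers():
    log.info("load started")
    # ... extract / transform / load would run here ...
    log.info("load finished")

if __name__ == "__main__":
    try:
        run_with_retry(load_customers)
    except Exception:
        sys.exit(1)   # non-zero exit lets the scheduler flag the failure
```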