We are seeking a highly motivated and skilled Cloud SRE to join our team. The ideal candidate will have experience designing, implementing, and maintaining Workload Automation infrastructure for enterprise-level organizations. The Cloud SRE will be responsible for the reliability, performance, monitoring, security, and overall management of the Workload Automation platform using On-Premise tools and Azure Batch. You will work closely with application and data operations teams to design, implement, and maintain the company’s file transfer platform, which facilitates seamless and secure data exchange between internal and external systems. Application support teams use these integration services within our Integration Platform to develop applications that support batch processing and file transfers. The availability of these integration services is key to our critical business functions. The role reports to the Integration Platform Manager.
Additionally, this individual will be expected to upskill in other components of the Azure Integration Services as well as contribute to any planned modernization from On-Premise to Cloud.
Primary Responsibilities
Support and maintain business-critical applications.
Design and maintain our file transfer services to ensure timely transfer of data between internal and external sources.
Design and maintain batch processing services to ensure timely and accurate processing of critical batch jobs.
Provision, configure and manage Azure VMs and Azure Batch pools.
Collaborate with Cloud Engineering and other cross-functional teams to implement scalable, highly automated solutions and develop strategies for optimal file transfer processes.
Forecast resource needs based on historical data and future projects.
Design and implement secure, efficient, and reliable file transfer solutions using various technologies such as SFTP, FTPS, AS2, etc.
Develop automation for managing file transfers, including error handling and monitoring, using scripting and automation tools. Maintain alerting and notification systems so that failures are detected and the right teams are notified, with service level objectives aligned with business goals.
Develop and maintain CI/CD pipelines for file transfer infrastructure.
Troubleshoot complex issues related to jobs and transfers by analyzing logs, network traffic, and application configurations.
Set up security protocols and encryption measures to protect sensitive data in transit.
Proactively identify areas of improvement in the existing infrastructure and provide recommendations to enhance performance, security, or reliability.
Work with application support teams to ensure application and batch processes are compliant and running efficiently.
Be part of the technical transformation team that will deliver the migration from On-Prem to Cloud services, including taking part in design stages, testing, and operations.
Create proper documentation and other artifacts as required to assist in the knowledge transfer to other cloud operations team members.
Conduct regular audits on the file transfer platform to identify potential vulnerabilities or areas for improvement.
Stay up to date with industry advancements in data transmission technologies and incorporate them into the Integration Platform as needed.
Qualifications
Candidate must have:
Strong experience as a Platform/Systems Engineer with at least 1 year of hands-on experience with file transfer solutions.
Experience with Azure infrastructure, Azure Storage, and Azure Batch.
Strong understanding of TCP/IP networking concepts and protocols like FTPS, SSH/SFTP, TLS.
Experience with logging and monitoring tools to ensure services are available and operating as expected.
Experience working in a highly automated environment and/or in financial services.
Experience supporting batch environments that run 24×7.
Strong knowledge of Microsoft Windows Servers.
Experience supporting file transfer environments that run 24×7.
Strong experience in building fault-tolerant, scalable, and secure systems.
Ability to learn new methods and technologies.
Strong oral and written communication skills.
Production systems support experience.
Troubleshooting skills and proven ability to carry out detailed root cause analysis.
A continuous learning mindset.
Beneficial
Experience with Containerization.
Experience with scripting languages (Bash, Python, or PowerShell) and automation tools (Azure Automation, Azure DevOps).
Experience with Progress MOVEit technologies.
Experience with IBM Workload Scheduler (IWS).
Knowledge or awareness of C# and .NET.
Venquis is acting as an Employment Agency in relation to this vacancy.