Your Title: Site Reliability Engineer (SRE) / DevOps
Location: Mexicali, Mexico
We are seeking a skilled and motivated Site Reliability Engineer to join our team in Trimble’s Core Cloud Platform. The ideal candidate will have a strong background in cloud platforms, infrastructure as code, and automation via programming/scripting languages. You will embed with a product delivery team to drive the reliability, scalability, and security of the team’s services and infrastructure.
The Core Cloud Platform group builds the foundational common services used by dozens of Trimble products and millions of users. The services we provide include identity/authentication, batch data processing, API management, and enterprise data systems.
What You Will Do
Develop and maintain infrastructure as code (IaC) using Terraform to ensure reliable and scalable cloud environments.
Perform code deployments and manage CI/CD pipelines using Jenkins, GitHub, and related tooling.
Automate routine tasks and workflows to increase operational efficiency and reduce manual intervention.
Evaluate system designs and architectures for reliability, performance, security, and efficiency.
Lead incident response efforts, conduct root cause analysis, and implement long-term solutions for complex issues.
Develop and maintain documentation, including but not limited to architecture diagrams, service descriptions, build and deploy documentation, and operations runbooks.
Continuously improve documentation for systems and services, contributing to a knowledge-sharing culture within the team.
Embed within a product delivery team to provide expertise on cloud systems design, software deployment practices, infrastructure as code, and system observability.
Participate in the on-call rotation for incident escalations.
What Skills & Experience You Should Bring
Solid understanding of SRE principles, including service reliability, observability, and risk management.
Proficiency in implementing system observability and alerting using tools like Datadog and Sumo Logic.
Experience building cloud infrastructure using infrastructure as code tools (Terraform or CloudFormation).
Expertise in building and running systems on a public cloud platform (AWS and/or Azure).
Experience with a programming language (such as Python, Java, Go, or JavaScript) and a scripting language (Bash or PowerShell).
Experience with the management of Linux and/or Windows servers.
Effective verbal and written communication skills, with the ability to collaborate well with team members.
Education and Experience