Site Reliability Engineer (SRE) / DevOps

Trimble

Mexicali

MXN 400,000 - 600,000

Job Description

Locations: Mexico - Mexicali
Time type: Full time
Posted: 5 days ago
Job requisition ID: R45648

Your Title: Site Reliability Engineer (SRE) / DevOps

Location: Mexicali, Mexico

We are seeking a skilled and motivated Site Reliability Engineer to join our team in Trimble’s Core Cloud Platform. The ideal candidate will have a strong background in cloud platforms, infrastructure as code, and automation via programming/scripting languages. You will embed with a product delivery team to drive the reliability, scalability, and security of the team’s services and infrastructure.

The Core Cloud Platform group builds the foundational common services used by dozens of Trimble products and millions of users. The services we provide include identity/authentication, batch data processing, API management, and enterprise data systems.

What You Will Do

Develop and maintain infrastructure as code (IaC) using Terraform to ensure reliable and scalable cloud environments.
Perform code deployments and manage CI/CD pipelines using Jenkins, GitHub, and related tooling.
Automate routine tasks and workflows to increase operational efficiency and reduce manual intervention.
Evaluate system designs and architectures for reliability, performance, security, and efficiency.
Lead incident response efforts, conduct root cause analysis, and implement long-term solutions for complex issues.
Develop and maintain documentation, including but not limited to architecture diagrams, service descriptions, build and deploy documentation, and operations runbooks.
Continuously improve documentation for systems and services, contributing to a knowledge-sharing culture within the team.
Embed within a product delivery team to provide expertise on cloud systems design, software deployment practices, infrastructure as code, and system observability.
Participate in on-call rotation for incident escalations.

What Skills & Experience You Should Bring

Solid understanding of SRE principles, including service reliability, observability, and risk management.
Proficiency in implementing system observability and alerting using tools like Datadog and Sumo Logic.
Experience building cloud infrastructure using infrastructure as code tools (Terraform or CloudFormation).
Expertise in building and running systems on a public cloud platform (AWS and/or Azure).
Experience with a programming language (such as Python, Java, Go, or JavaScript) and a scripting language (Bash or PowerShell).
Experience with the management of Linux and/or Windows servers.
Effective verbal and written communication skills, with the ability to collaborate well with team members.

Education and Experience

Bachelor's degree in Computer Science, Information Technology, or a related field, or equivalent experience.
3+ years of relevant experience in Site Reliability Engineering or a similar role.