Senior Site Reliability Engineer

ZipRecruiter

Manchester

On-site

GBP 50,000 - 90,000

4 days ago

Job summary

An established industry player is seeking a Site Reliability Engineer to help shape the future of DevOps. In this pivotal role, you will ensure the availability and performance of critical infrastructure running in AWS. You will collaborate with various teams to implement and maintain scalable systems while adhering to best practices. This position offers a unique opportunity to work with cutting-edge technologies and contribute to a culture of continuous improvement. Join a diverse and inclusive team that values integrity and innovation, and play a key role in driving the success of the organization.

Benefits

Health care coverage

Generous time off

Access to learning resources

Retirement planning

Family-friendly benefits

Retail discounts

Referral incentive awards

Qualifications

  • 5+ years of experience as a Site Reliability Engineer or equivalent role.
  • Proficient in AWS and observability tools like Splunk OpenTelemetry.
  • Strong troubleshooting skills and familiarity with Infrastructure as Code.

Responsibilities

  • Design and maintain observability solutions for system health.
  • Collaborate with teams to enhance system resilience and reduce downtime.
  • Analyze performance metrics to improve system responsiveness.

Skills

Application and infrastructure observability

Troubleshooting and problem-solving

Infrastructure as Code

CI/CD pipelines

Monitoring large-scale distributed systems

Scripting and automation

Agile methodology

Site Reliability Engineering

Education

Bachelor's degree in Computer Science

Tools

AWS

Terraform

Splunk OpenTelemetry

Docker

Kubernetes

PowerShell

Bash

Python

.NET C#

Job description

This job is with S&P Global, an inclusive employer and a member of myGwork – the largest global platform for the LGBTQ+ business community. Please do not contact the recruiter directly.

About the Role:

Grade Level (for internal use): 11

The Team: This position offers significant potential to help shape the future direction of DevOps and your own career. The team is responsible for ensuring the availability, latency, performance, efficiency, and stability of our critical infrastructure, all of which runs in AWS. You will collaborate closely with multiple stakeholders including development teams to implement and maintain reliable and scalable systems while adhering to industry best practices and security standards.

Responsibilities and Impact:

  1. Design, implement, and maintain observability solutions to track system health and performance.
  2. Analyze observability data to identify and troubleshoot potential issues proactively.
  3. Develop and implement alerts and notifications for critical events.
  4. Collaborate with development teams to enhance system resilience and reduce downtime.
  5. Analyze and optimize performance metrics to resolve latency bottlenecks and improve system responsiveness.
  6. Develop and maintain metrics dashboards to track key performance indicators (KPIs).
  7. Design and implement automated deployment and rollback procedures to mitigate risks.
  8. Analyze root causes of incidents and implement preventive measures to minimize recurrence.

What We're Looking For:

Basic Required Qualifications:

  1. Bachelor's degree in Computer Science, Information Technology, or a related field.
  2. 5+ years of experience as a Site Reliability Engineer or in an equivalent role.
  3. Proficient in application and infrastructure observability, including Splunk OpenTelemetry.
  4. Experienced in production environments running in AWS.
  5. Comfortable with Infrastructure as Code, particularly Terraform.
  6. Comfortable with CI/CD pipelines such as GitHub Actions, Azure DevOps.
  7. Excellent troubleshooting and problem-solving skills with a knack for identifying and resolving complex technical issues.
  8. Familiarity with working in an Agile environment.
  9. Solid understanding of Site Reliability Engineering principles.
  10. Ability to build and maintain a system and culture that supports and implements SLOs.
  11. Familiar with Docker & Kubernetes, specifically EKS & ECS.
  12. Familiar with programming in a language such as Python or C# (.NET).

Additional Qualifications:

  1. Proven experience in monitoring, analyzing, and optimizing the performance of large-scale distributed systems in a cloud environment.
  2. Proven experience with Windows or Linux production environments, including managing servers, operating systems, and network configurations within the cloud.
  3. Proven scripting and automation skills, preferably in PowerShell, Bash, or Python.
  4. AWS certification.
  5. Ability to work independently and as part of a collaborative team, effectively communicating technical concepts to both technical and non-technical stakeholders.

About S&P Global Market Intelligence: At S&P Global Market Intelligence, a division of S&P Global, we understand the importance of accurate, deep and insightful information. Our team of experts delivers unrivaled insights and leading data and technology solutions, partnering with customers to expand their perspective, operate with confidence, and make decisions with conviction.

What's In It For You?

Our Purpose: Progress is not a self-starter. It requires a catalyst to be set in motion. Information, imagination, people, technology - the right combination can unlock possibility and change the world. Our world is in transition and getting more complex by the day. We push past expected observations and seek out new levels of understanding so that we can help companies, governments and individuals make an impact on tomorrow.

Our People: We're more than 35,000 strong worldwide - so we're able to understand nuances while having a broad perspective. Our team is driven by curiosity and a shared belief that Essential Intelligence can help build a more prosperous future for us all.

Our Values: Integrity, Discovery, Partnership.

Benefits: We take care of you, so you can take care of business. We care about our people. That's why we provide everything you - and your career - need to thrive at S&P Global. Our benefits include:

  1. Health & Wellness: Health care coverage designed for the mind and body.
  2. Flexible Downtime: Generous time off helps keep you energized for your time on.
  3. Continuous Learning: Access a wealth of resources to grow your career and learn valuable new skills.
  4. Invest in Your Future: Secure your financial future through competitive pay, retirement planning, a continuing education program with a company-matched student loan contribution, and financial wellness programs.
  5. Family Friendly Perks: It's not just about you. S&P Global has perks for your partners and little ones, too, with some best-in-class benefits for families.
  6. Beyond the Basics: From retail discounts to referral incentive awards - small perks can make a big difference.

Inclusive Hiring and Opportunity at S&P Global: At S&P Global, we are committed to fostering an inclusive workplace where all individuals have access to opportunities based on their skills, experience, and contributions. Our hiring practices emphasize fairness, transparency, and equal opportunity, ensuring that we attract and retain top talent.

Equal Opportunity Employer: S&P Global is an equal opportunity employer and all qualified candidates will receive consideration for employment without regard to marital status, military veteran status, unemployment status, or any other status protected by law.

US Candidates Only: The EEO is the Law Poster describes discrimination protections under federal law. Pay Transparency Nondiscrimination Provision.

Job ID: 310750
Posted On: 2025-02-25
Location: Manchester, Manchester, United Kingdom
