Site Reliability Engineer

Experis - ManpowerGroup

England

Remote

GBP 100,000 - 125,000

11 days ago

Job summary

An innovative firm is seeking a Site Reliability Engineer to enhance the reliability and scalability of client platforms. In this dynamic role, you will lead initiatives focused on observability, implementing key metrics to ensure high performance and reliability. You will work with cutting-edge tools and technologies, architecting resilient cloud infrastructures and collaborating with cross-functional teams to drive continuous improvement. This position offers the opportunity to contribute to business development and personal growth through various initiatives, making it an exciting opportunity for those passionate about technology and client engagement.

Qualifications

  • Experience in implementing SRE principles and observability.
  • Strong understanding of SLIs, SLOs, and error budgets.

Responsibilities

  • Define SLIs and SLOs to maintain system performance.
  • Collaborate with teams to implement automation strategies.

Skills

SRE principles

Observability

Service Level Indicators (SLIs)

Service Level Objectives (SLOs)

Kubernetes

Cloud infrastructure

Continuous delivery pipelines

Collaboration skills

Client interaction

Education

Experience in SRE

Security Check (SC) Clearance

Tools

Dynatrace

Prometheus

OpenTelemetry

Job description

Job Title: Site Reliability Engineer - Digital Factory
Location: 100% Remote
Duration: 3 Months
Rate: £500 per day - Umbrella Only
Clearance: Active SC or Eligibility for SC


Job Description:
As a Site Reliability Engineer (SRE), you will play a key role in ensuring the reliability, scalability, and efficiency of our clients' platforms. Your focus will include building strong observability practices, aligning with the SRE mindset & principles, and driving continuous improvement.


This will involve:

  1. Defining and implementing Service Level Indicators (SLIs) and Service Level Objectives (SLOs) to measure and maintain system and application performance, ensuring services meet agreed reliability targets.
  2. Instrumenting applications to collect key metrics, logs, and traces that enable proactive monitoring and troubleshooting.
  3. Creating dashboards and configuring alerts to provide real-time visibility into system health, enabling teams to quickly detect and resolve issues.
  4. Assessing and enhancing Kubernetes capabilities, improving DevOps efficiency through innovation, agility and cost optimisation.
  5. Taking a holistic approach to modernising the developer experience, focusing on organisational culture, DevOps practices, processes, automation and tooling.
  6. Architecting scalable and resilient cloud infrastructure to ensure the seamless deployment and optimisation of containerised applications.
  7. Collaborating with cross-functional teams to implement automation strategies that reduce operational complexity and drive continuous improvement.
  8. Providing out-of-hours or on-call support, depending on client requirements.

Key expectations from this role include:
As a Consultant: Lead site reliability engineering initiatives with a strong emphasis on observability, ensuring high performance and reliability of applications & infrastructure. Provide strategic insights to shape the overall SRE strategy while collaborating on the design and implementation of scalable and reliable solutions. Establish effective monitoring, alerting and incident response strategies to maintain system availability. Promote continuous improvement by working with team members to deliver observability best practices and SRE methodologies.


As part of your role you will also have the opportunity to contribute to the business and to your own personal growth through activities in the following categories:

  1. Business Development - Leading/contributing to proposals, RFPs, bids, proposition development, client pitch contribution, client hosting at events.
  2. Internal contribution - Campaign development, internal think-tanks, whitepapers, practice development (operations, recruitment, team events & activities), offering development.
  3. Learning & development - Training to support your career development and the skills demand within the company, certifications etc.

Your Profile:
We are looking for someone with experience in implementing SRE principles, with a focus on observability and optimising applications & cloud environments. You will be comfortable working in a dynamic, technology-driven environment, while bringing proven expertise in the following areas:

  1. Strong understanding of the SRE mindset and principles, including the creation and management of Service Level Indicators (SLIs), Service Level Objectives (SLOs) and error budgets to ensure reliability and performance.
  2. Experience in implementing observability, instrumenting applications to provide insights into system performance. Hands-on experience with tools such as Dynatrace, Prometheus and OpenTelemetry for monitoring, tracing, and real-time alerting is highly sought after.
  3. An understanding of microservices and container orchestration with the ability to optimise containerised applications for reliability and scalability.
  4. Experience enabling continuous delivery pipelines, with a focus on ensuring system reliability, quality, and performance through automated deployment, scaling, and observability tools.
  5. Understanding of build and deployment pipelines, and experience in collaborating with developers to improve observability and monitoring practices.
  6. Strong collaboration skills with the ability to work effectively both independently and as part of a team.
  7. Comfort interacting and engaging with clients, although a consulting background is not a prerequisite.
  8. Enthusiasm for working with a wide range of technology stacks and cloud providers across the varied clients and industries we support.
  9. You must have SC (Security Check) Clearance, or be eligible and willing to gain this level of clearance.
  10. You must be able to work out of hours or on call if the role you are assigned to requires it.