Site Reliability Engineer Senior Lead

TN United Kingdom

Slough

On-site

GBP 125,000 - 150,000

30+ days ago

Job summary

An established industry player is seeking a Senior Lead in Site Reliability Engineering to oversee the reliability and performance of critical systems. This pivotal role involves strategizing automation, enhancing observability, and transforming IT operations to align with business objectives. With a focus on building highly available systems and implementing best practices, you will lead initiatives that drive efficiency and reliability across the organization. Join a purpose-driven company where you can make a significant impact while working with a diverse team of over 140,000 talented associates. If you are passionate about technology and leadership, this opportunity is perfect for you.

Benefits

Best-in-class learning and development support

Access to Mars University

Industry competitive salary

Company bonus

Qualifications

  • 7+ years of IT experience and 3+ years in a leadership, SRE, DevOps, or systems engineering role.
  • Deep understanding of SRE principles and cloud technologies.

Responsibilities

  • Design and maintain scalable systems while ensuring reliability.
  • Automate tasks and implement Infrastructure as Code (IaC).
  • Collaborate with teams to enhance system performance and reliability.

Skills

Site Reliability Engineering (SRE)

DevOps best practices

Analytical skills

Interpersonal skills

Organizational skills

Problem management

Communication skills

Education

Bachelor’s degree in Information Technology

Bachelor’s degree in Computer Science

Bachelor’s degree in Business Management

Tools

Terraform

Ansible

CI/CD pipelines

Monitoring and observability tools

Cloud platforms

Job description

Site Reliability Engineer Senior Lead, Slough

Client: Mars

Location: Slough, United Kingdom

Job Category: -

EU work permit required: Yes

Job Reference: 202520eca4d8

Posted: 13.02.2025

Expiry Date: 30.03.2025

Job Description:

The Site Reliability Engineering (SRE) Senior Lead is a pivotal leader within our organization, responsible for ensuring the reliability, performance, and scalability of our critical systems. This role is instrumental in strategizing and overseeing reliability with an end-to-end service delivery perspective, aligning technical infrastructure with business objectives to meet evolving customer needs. As an influential figure in our company, the SRE Senior Lead will spearhead initiatives to automate infrastructure, enhance system observability, and drive the transformation of our IT operations.

What are we looking for?
  1. Bachelor’s degree in Information Technology, Computer Science, Business Management, or a related field.
  2. 7+ years of experience in IT departments or a relevant field.
  3. 3+ years in a leadership, SRE, DevOps, or systems engineering role.
  4. A seasoned professional with a deep understanding of Site Reliability Engineering (SRE) principles, DevOps best practices, and cutting-edge technologies.
  5. Strong analytical, interpersonal, and organizational skills with a proven track record in issue and problem management in a multicultural and global environment.
  6. Proficiency with cloud platforms and experience in configuration management, scripting, and monitoring and observability tools.
  7. Understanding of business processes, change management, and ITSM processes, including service level management and reporting.
  8. Excellent communication skills and the ability to work collaboratively with cross-functional teams.
What will be your key responsibilities?
  1. Ensure that the deployed technology stack is supported according to business requirements, focusing on the infrastructure technology stack and the IT Operations support model.
  2. Design, implement, and maintain highly available and scalable systems.
  3. Monitor system performance, reliability, and security using advanced monitoring and logging tools.
  4. Proactively identify and resolve issues that could impact service availability.
  5. Conduct assessments to ensure systems comply with market standards and best practices.
  6. Develop and maintain automated CI/CD pipelines to streamline deployments.
  7. Implement Infrastructure as Code (IaC) using tools like Terraform, Ansible, or others.
  8. Automate repetitive tasks to increase system efficiency and reliability.
  9. Collaborate with software development teams to ensure new features are built with reliability in mind.
  10. Advocate for best practices in software engineering, deployment, and operations and foster a culture of collaboration and continuous improvement across teams.
  11. Conduct capacity planning to anticipate future growth and scaling needs.
  12. Implement strategies to efficiently scale systems based on demand.
What can you expect from Mars?
  1. Work with over 140,000 diverse and talented Associates, all guided by the Five Principles.
  2. Join a purpose-driven company, where we’re striving to build the world we want tomorrow, today.
  3. Best-in-class learning and development support from day one, including access to our in-house Mars University.
  4. An industry competitive salary and benefits package, including company bonus.

Mars is an equal opportunity employer and all qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, disability status, protected veteran status, or any other characteristic protected by law. If you need assistance or an accommodation during the application process because of a disability, it is available upon request. The company is pleased to provide such assistance, and no applicant will be penalized as a result of such a request.
