Site Reliability Engineer, Compute

Tbwa Chiat/Day Inc

United Kingdom

Remote

GBP 100,000 - 125,000

30+ days ago

Job summary

Join a forward-thinking company as a Site Reliability Engineer, where you will play a pivotal role in enhancing our Compute infrastructure. You will design for reliability and performance while managing risks in a dynamic environment. This role involves engaging with development teams, driving continuous improvement through incident management, and implementing automated systems for software delivery. As part of an inclusive community, you will contribute to shaping the future of our infrastructure and ensure our products are built for scale and reliability. If you are passionate about problem-solving and eager to make an impact, this opportunity is perfect for you.

Qualifications

  • 3+ years of experience in an SRE role or 5+ years in a related role.
  • Practical experience with SRE teams and system design.

Responsibilities

  • Engage in end-to-end design, development, and deployment of software.
  • Drive improvements to reliability, performance, and efficiency.

Skills

Site Reliability Engineering (SRE)

Problem-solving

Risk management

Distributed system design

Accountability

Tools

Terraform

Golang

Containers

Virtual Machines

Linux

Job description

Remote - United Kingdom, Germany, Netherlands

About Vercel:

Vercel’s Frontend Cloud provides the developer experience and infrastructure to build, scale, and secure a faster, more personalized web. Customers like Under Armour, eBay, The Washington Post, Johnson & Johnson, and Zapier use Vercel to build dynamic user experiences on the web.

At Vercel, our mission is to enable the world to ship the best products, and that goes hand in hand with creating an environment where you can do the best work of your life.

About the Role:

We are looking for experienced SREs to help grow our small team into a global one that can provide expert engagement across our core serving systems. As an early member of the SRE team, you will report directly to the Director of Managed Infrastructure and play a foundational role in expanding our SRE practice and integrating reliability principles more deeply into Vercel’s engineering process as the company grows.

Within the team, your focus will be on enhancing our Compute infrastructure in close partnership with our EU-based developer team. You will design for reliability and performance while managing risk as we introduce major innovations to our compute stack.

What You Will Do:
  • Ensure that our products are built for reliability and scale by engaging in the end-to-end design, development, and deployment of new software.
  • Drive continuous risk mitigation and reduction through direct involvement in incident management, blameless postmortems, and follow-ups.
  • Drive measurable improvements to the reliability, performance, and efficiency of our production systems through instrumentation, analysis, and implementation of engineering improvements.
  • Devise repeatable, low-toil operational practices through the development of automated systems for software delivery, system failover, and capacity management.

About You:
  • At least 3 years of experience in an SRE role, or at least 5 years of experience in an adjacent role (e.g., platform engineering), operating in a scaled environment.
  • Firm grasp of the SRE philosophy and mindset, with practical experience working on or directly with SRE teams that have proactively engaged in system design and improvement.
  • Strong sense of accountability and commitment to problem-solving, backed by a curiosity to dig deep and identify root causes.
  • Willingness to proactively engage with development teams to influence the course of software design and operational practices.
  • Capability to manage risk, make decisions, and exhibit sound judgment.
  • Demonstrated ability to plan and deliver long-term projects.
  • Experience with distributed system design.
  • Experience with Containers, Virtual Machines, and Linux.
  • Bonus: Experience working with Terraform and/or Golang.

Vercel is committed to fostering and empowering an inclusive community within our organization. We do not discriminate on the basis of race, religion, color, gender expression or identity, sexual orientation, national origin, citizenship, age, marital status, veteran status, disability status, or any other characteristic protected by law. Vercel encourages everyone to apply for our available positions, even if they don't necessarily check every box on the job description.
