Evals Platform Engineer

COL Limited
London
GBP 40,000 - 80,000
Job description

Applications deadline: The final date for submissions is 25 April 2025. However, we review applications on a rolling basis and encourage early submissions.

ABOUT APOLLO RESEARCH

The capabilities of current AI systems are evolving at a rapid pace. While these advancements offer tremendous opportunities, they also present significant risks, such as the potential for deliberate misuse or the deployment of sophisticated yet misaligned models. At Apollo Research, our primary concern lies with deceptive alignment, a phenomenon where a model appears to be aligned but is, in fact, misaligned and capable of evading human oversight.

Our approach focuses on behavioral model evaluations, which we then use to audit real-world models. We also combine black-box approaches with applied interpretability. In our evaluations, we focus on LM agents, i.e. LLMs with agentic scaffolding similar to AIDE or SWE-agent. We also study model organisms in controlled environments (see our security policies), e.g. to better understand capabilities related to scheming.

At Apollo, we aim for a culture that emphasizes truth-seeking, being goal-oriented, giving and receiving constructive feedback, and being friendly and helpful. If you’re interested in more details about what it’s like working at Apollo, you can find more information here.

ABOUT THE ROLE

We are looking for a Platform Engineer to build, scale, and maintain the infrastructure that supports our frontier AI evaluation research, with a strong emphasis on security. As the infrastructure specialist on a small team, you'll have broad decision-making authority on our infrastructure stack. You’ll design and provision the infrastructure that our researchers depend on daily and also contribute to the software platform we build on top of this infrastructure.

You will work as part of a small cross-functional team, collaborating with software engineers and research scientists to ensure our infrastructure is scalable, secure, and efficient.

Responsibilities

  1. Design, implement, scale, and maintain infrastructure for running frontier LLM evals
  2. Work closely with software engineers and researchers to understand and address infrastructure needs
  3. Choose and integrate appropriate technologies for our infrastructure stack
  4. Administer and secure internal AWS accounts
  5. Enforce security best practices
  6. Manage IAM permissions and access control
  7. Manage CI/CD pipelines
  8. Design and build data storage systems for evaluation results
  9. Help set up and manage organisation-wide security processes
  10. Contribute to development of internal software tools that leverage our infrastructure

Required skills

  1. Experience leading infrastructure projects from start to finish
  2. Experience implementing security best practices for cloud and containerized environments
  3. Solid knowledge of AWS, including IAM and EKS
  4. Strong hands-on experience with Kubernetes
  5. Experience with Infrastructure as Code tools
  6. Strong software engineering skills, preferably in Python
  7. Ability to work well with researchers and understand their technical needs

Strong candidates may have some of the following

  1. Experience with Cilium, gVisor, or Karpenter
  2. Experience working with LLM evaluations
  3. Experience building and managing data storage systems
  4. Experience setting up and maintaining data collection, monitoring, and alerting systems
  5. Exposure to startup environments or early-stage engineering teams
  6. Track record of building and scaling infrastructure from scratch with fast turnaround
  7. Cybersecurity experience

LOGISTICS

  • Start Date: Target of 2-3 months after the first interview.
  • Time Allocation: Full-time.
  • Location: The office is in London, and the building is shared with the London Initiative for Safe AI (LISA) offices. This is an in-person role. In rare situations, we may consider partially remote arrangements on a case-by-case basis.
  • Work Visas: We can sponsor UK visas.

BENEFITS

  • Salary: a competitive UK-based salary.
  • Flexible work hours and schedule.
  • Unlimited vacation.
  • Unlimited sick leave.
  • Lunch, dinner, and snacks are provided for all employees on workdays.
  • Paid work trips, including staff retreats, business trips, and relevant conferences.
  • A yearly $1,000 (USD) professional development budget.

Equality Statement: Apollo Research is an Equal Opportunity Employer. We value diversity and are committed to providing equal opportunities to all, regardless of age, disability, gender reassignment, marriage and civil partnership, pregnancy and maternity, race, religion or belief, sex, or sexual orientation.

How to apply: Please complete the application form with your CV. A cover letter is optional. Please also feel free to share links to relevant work samples.

About the interview process: Our multi-stage process includes a screening interview, a take-home test (approx. 2 hours), 3 technical interviews, and a final interview with Marius (CEO). The technical interviews will be closely related to tasks the candidate would do on the job.
