Research Engineer - Post-Training

AI Security Institute

London

On-site

GBP 65,000 - 145,000 (total package)

Full time

Job summary

An innovative organization focused on AI safety is seeking Research Engineers to enhance model performance in various risk domains. Join a dedicated team that combines the agility of a tech start-up with government expertise to tackle urgent AI risks. You will engage in research and engineering tasks, developing methodologies and tools for LLM agents and fine-tuning pipelines. This role offers a unique opportunity to work alongside world-class researchers and contribute to impactful projects in AI safety. If you are passionate about advancing AI technology while ensuring its safety and ethical use, this position is perfect for you.

Benefits

Pension Options
Technical Allowance
Mentorship from Experts
Flexible Working Environment

Qualifications

  • Experience in empirical machine learning research, especially on LLMs.
  • Strong software engineering background with machine learning knowledge.

Responsibilities

  • Improve model performance using cutting-edge machine learning techniques.
  • Design new techniques for scaling inference-time computation.

Skills

Machine Learning Research
Python
Model Evaluations
AI Safety Research
Interpersonal Skills

Education

PhD in a Technical Field

Tools

LLM Tools
Fine-tuning Pipelines

Job description

The AI Security Institute is the world's largest government team dedicated to understanding AI capabilities and risks.

Our mission is to equip governments with an empirical understanding of the safety of advanced AI systems. We conduct research to understand the capabilities and impacts of advanced AI, and we develop and test risk mitigations. We focus on risks with security implications, including the potential for AI to assist in the development of chemical and biological weapons, its use in carrying out cyber-attacks and enabling crimes such as fraud, and the possibility of loss of control.

The risks from AI are not sci-fi; they are urgent. By combining the agility of a tech start-up with the expertise and mission-driven focus of government, we're building a unique and innovative organisation to prevent AI's harms from impeding its potential.

About the Team

The Post-Training Team is dedicated to optimising AI systems to achieve state-of-the-art performance across the various risk domains that AISI focuses on. This is accomplished through a combination of scaffolding, prompting, and supervised and RL fine-tuning of the AI models to which AISI has access.

One of the main focuses of our evaluation teams is estimating how new models might affect the capabilities of AI systems in specific domains. To improve confidence in our assessments, we invest significant effort in enhancing model performance in the domains of interest.

For many of our evaluations, this means taking a model we have been given access to and embedding it in a wider AI system. For example, in our cybersecurity evaluations we provide models with access to tools for interacting with the underlying operating system, and we call the model repeatedly to act in such environments. In evaluations that do not require agentic capabilities, we may use elicitation techniques such as fine-tuning and prompt engineering to ensure we assess the model at its full capacity, so that our assessment does not miss capabilities that might be present in the model.
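
For a concrete picture of what such embedding looks like, here is a minimal, illustrative agent loop in Python. It is not our actual evaluation harness: the names (call_model, run_shell, Step) and the RUN:/DONE: protocol are hypothetical stand-ins, and call_model is stubbed so the sketch runs end to end without an LLM API.

import subprocess
from dataclasses import dataclass

@dataclass
class Step:
    role: str      # "user", "assistant", or "tool"
    content: str

def call_model(history: list[Step]) -> str:
    # Placeholder for a real LLM API call. A real harness would send the
    # conversation history and parse the model's chosen action; here we
    # return a fixed final answer so the sketch is runnable offline.
    return "DONE: task complete"

def run_shell(command: str) -> str:
    # The "tool": execute a shell command and return its output.
    result = subprocess.run(command, shell=True, capture_output=True,
                            text=True, timeout=60)
    return result.stdout + result.stderr

def run_agent(task: str, max_turns: int = 20) -> str:
    # Call the model repeatedly, executing any commands it requests,
    # until it declares the task done or the turn budget is exhausted.
    history = [Step("user", task)]
    for _ in range(max_turns):
        action = call_model(history)
        history.append(Step("assistant", action))
        if action.startswith("DONE:"):
            return action.removeprefix("DONE:").strip()
        if action.startswith("RUN:"):
            output = run_shell(action.removeprefix("RUN:").strip())
            history.append(Step("tool", output))
    return "max turns exceeded"

print(run_agent("List the files in the current directory."))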

About the Role

As a member of this team, you will use cutting-edge machine learning techniques to improve model performance in our domains of interest. The work is split across two sub-teams: Agents and Fine-tuning. Our Agents sub-team focuses on developing the LLM tools and scaffolding needed to create highly capable LLM-based agents, while our Fine-tuning sub-team builds out fine-tuning pipelines to improve models in our domains of interest.
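
As a rough illustration of the fine-tuning side, the sketch below shows the core of a supervised fine-tuning loop in plain PyTorch. It is a toy, not our pipeline: a real pipeline would load a pretrained LLM and curated domain data, whereas here a tiny randomly initialised model and random token sequences stand in so the example runs anywhere.

import torch
import torch.nn as nn

VOCAB, D_MODEL, SEQ_LEN = 1000, 64, 32

class TinyLM(nn.Module):
    # A minimal causal language model standing in for a pretrained LLM.
    def __init__(self):
        super().__init__()
        self.emb = nn.Embedding(VOCAB, D_MODEL)
        self.block = nn.TransformerEncoderLayer(D_MODEL, nhead=4,
                                                batch_first=True)
        self.head = nn.Linear(D_MODEL, VOCAB)

    def forward(self, tokens):
        # Causal mask so each position only attends to earlier tokens.
        mask = nn.Transformer.generate_square_subsequent_mask(tokens.size(1))
        return self.head(self.block(self.emb(tokens), src_mask=mask))

model = TinyLM()
opt = torch.optim.AdamW(model.parameters(), lr=3e-4)

for step in range(10):
    batch = torch.randint(0, VOCAB, (8, SEQ_LEN))  # stand-in "domain data"
    logits = model(batch[:, :-1])                  # predict each next token
    loss = nn.functional.cross_entropy(
        logits.reshape(-1, VOCAB), batch[:, 1:].reshape(-1))
    opt.zero_grad()
    loss.backward()
    opt.step()
    print(f"step {step}: loss {loss.item():.3f}")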

The Post-Training team is seeking strong Research Engineers. The team's priorities include both research-oriented tasks, such as designing new techniques for scaling inference-time computation or developing methodologies for in-depth analysis of agent behaviour, and engineering-oriented tasks, such as implementing new tools for our LLM agents or creating pipelines to support fine-tuning of large open-source models. We recognise that some technical staff may prefer to span or alternate between engineering and research responsibilities, and this versatility is something we actively look for in our hires.
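
One of the simplest forms of scaling inference-time computation, mentioned above, is best-of-n sampling: draw several candidate outputs and keep the one a scoring function ranks highest. The sketch below uses hypothetical sample_model and score stand-ins (a real system would call an LLM and a verifier or reward model); it illustrates the general pattern, not a specific AISI technique.

import random

def sample_model(prompt: str, temperature: float = 0.8) -> str:
    # Placeholder for a temperature-sampled LLM call.
    return f"candidate answer {random.randint(0, 999)}"

def score(prompt: str, candidate: str) -> float:
    # Placeholder verifier or reward model; higher is better.
    return random.random()

def best_of_n(prompt: str, n: int = 16) -> str:
    # Spend n model calls at inference time and keep the best-scoring one.
    candidates = [sample_model(prompt) for _ in range(n)]
    return max(candidates, key=lambda c: score(prompt, c))

print(best_of_n("Write a proof that the sum of two even numbers is even."))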

You'll receive mentorship and coaching from your manager and the technical leads on your team, and you'll regularly interact with world-class researchers and other exceptional staff, including alumni from Anthropic, DeepMind, and OpenAI.

In addition to junior roles, we offer Senior, Staff, and Principal Research Engineer positions for candidates with the requisite seniority and experience.

Person Specification

You may be a good fit if you have some of the following skills, experience and attitudes:

  1. Experience conducting empirical machine learning research (e.g. PhD in a technical field and/or papers at top ML conferences), particularly on LLMs.
  2. Experience with machine learning engineering, or extensive experience as a software engineer with a strong demonstration of relevant skills/knowledge in machine learning.
  3. An ability to work autonomously and with high agency, thriving in a constantly changing environment and a steadily growing team while figuring out the best and most efficient ways to solve a particular problem.

Particularly strong candidates also have the following experience:

  1. Building LLM agents in industry or open-source collectives, particularly in areas adjacent to the main interests of one of our workstreams, e.g. in-IDE coding assistants or research assistants (for our Agents sub-team).
  2. Leading research on improving and measuring the capabilities of LLM agents (for our Agents sub-team).
  3. Building pipelines for fine-tuning (or pretraining) LLMs; fine-tuning with RL techniques is particularly relevant (for our Fine-tuning sub-team).
  4. Fine-tuning or pretraining LLMs in a research context, particularly to achieve increased performance in specific domains (for our Fine-tuning sub-team).

Salary & Benefits

We are hiring individuals at all levels of seniority and experience within this research unit, and this advert allows you to apply for any of the roles within this range. Your dedicated talent partner will work with you as you move through our assessment process to explain our internal benchmarking process. The full range of salaries is available below; salaries comprise a base salary and a technical allowance, plus additional benefits as detailed on this page.

  1. Level 3 - Total package £65,000 - £75,000, inclusive of a base salary of £35,720 plus a technical talent allowance of between £29,280 and £39,280
  2. Level 4 - Total package £85,000 - £95,000, inclusive of a base salary of £42,495 plus a technical talent allowance of between £42,505 and £52,505
  3. Level 5 - Total package £105,000 - £115,000, inclusive of a base salary of £55,805 plus a technical talent allowance of between £49,195 and £59,195
  4. Level 6 - Total package £125,000 - £135,000, inclusive of a base salary of £68,770 plus a technical talent allowance of between £56,230 and £66,230
  5. Level 7 - Total package £145,000, inclusive of a base salary of £68,770 plus a technical talent allowance of £76,230

There is a range of pension options available, details of which can be found on the Civil Service website.

This role sits outside of the DDaT pay framework, given that its scope requires in-depth technical expertise in frontier AI safety, robustness, and advanced AI architectures.

Selection Process

In accordance with the Civil Service Commission rules, the following list contains all selection criteria for the interview process.

Required Experience

We select based on skills and experience in the following areas:

  1. Research Engineering
  2. Writing code efficiently
  3. Python
  4. Model evaluations knowledge
  5. AI safety research knowledge
  6. Verbal communication
  7. Teamwork
  8. Interpersonal skills
  9. Learning through coaching

Desired Experience

We may additionally factor in experience with any of the areas in which our work-streams specialise:

  1. Cyber security
  2. Chemistry or Biology
  3. Safeguards
  4. Safety Cases
  5. Societal Impacts