Company: Anthropic

Apply for the Research Engineer, RSP Evaluations (Autonomy)

Location: London

Job Description:

We are looking for Research Engineers to build “gold standard” evaluations for catastrophic risks, in order to understand what AI Safety Level (ASL) to assign to models. Research leads on this team collaborate with engineers in one of our focus areas: CBRN, Cyber, Autonomy (this list may expand over time). These evaluations will have major implications for the way we train, deploy, and secure our models, as detailed in our Responsible Scaling Policy (RSP).

The policy defines a series of capability thresholds, called AI Safety Levels (ASLs), that represent increasing risks. Crossing an ASL threshold would trigger a commitment to more stringent safety, security, and operational measures, intended to handle the increased level of risk.

Please note: We are currently only hiring for the Autonomous Replication and Adaptation (Autonomy) threats workstream. We will also be prioritizing candidates who can start ASAP and can be based in either our San Francisco or London office.

Responsibilities

Design and run the evaluations needed to measure dangerous capabilities in models, and determine when we cross an ASL threshold.
Lead projects with world‑class experts in fields such as biosecurity, autonomous replication, cybersecurity, and national security, and experiment with new evals to measure how risky AI systems are.
Inform decisions at the highest levels of the company.

Qualifications

ML‑focused background with engineering and research skills (e.g. experience in Python).
Experience managing research programs comprising dozens of technical and non‑technical experts.
Ability to find solutions to ambiguously scoped problems.
Ability to design and run experiments and iterate quickly to solve machine‑learning problems.
Ability to thrive in a collaborative environment (pair programming is preferred).
Experience training, working with, and prompting large language models.

Sample Projects

Autonomous Replication and Adaptation (ARA) risks – build infrastructure and tooling for testing these capabilities, iterating with external ARA experts to scope possible tasks; build custom “testing environments” and new infrastructure.
(Not currently hiring for) CBRN risks – work with external experts in biosecurity to design clear and repeatable CBRN evaluations, using post‑training infrastructure to prepare new generations of models for routine evaluations.
(Not currently hiring for) Cyber risks – co‑design a set of clear and repeatable cyber evaluations with external cyber experts; build custom environments or extensions to existing tooling, or locate specialized datasets.

Logistics

Location‑based hybrid policy: We expect all staff to be in one of our offices at least 25% of the time.

Visa Sponsorship

We sponsor visas for eligible candidates, and will make every effort to help you relocate to the United States, retaining an immigration lawyer to assist throughout the process.

Compensation and Benefits

Annual Salary: £260,000–£420,000 GBP

We offer a competitive compensation package that includes salary, equity, and benefits that collectively meet or exceed market rates.

Benefits

US Benefits

Optional equity donation matching.
Comprehensive health, dental, and vision insurance for you and all your dependents.
401(k) plan with 4% matching.
22 weeks of paid parental leave.
Unlimited PTO.
Stipends for education, home office improvements, commuting, and wellness.
Fertility benefits via Carrot.
Daily lunches and snacks in our office.
Relocation support for those moving to the Bay Area.

UK Benefits

Optional equity donation matching.
Private health, dental, and vision insurance for you and all your dependents.
Pension contribution matching 4% of your salary.
21 weeks of paid parental leave.
Unlimited PTO.
Health cash plan.
Life insurance and income protection.
Daily lunches and snacks in our office.

Posted: June 18th, 2026