London, UK | Hybrid | Employment | £65,000–£145,000

AI Security Institute is hiring a Research Engineer/Research Scientist – Model Transparency

Responsibilities

  • Developing a chain-of-thought monitorability benchmark and comparing monitorability properties across frontier AI systems, leveraging AISI’s unique access to reasoning traces from multiple labs.
  • Designing and running experiments on open-weight models to study alignment and oversight-relevant phenomena – such as reproducing emergent misalignment from reward hacking, or red-teaming techniques like inoculation prompting and character training.
  • Using white-box and interpretability methods – such as activation oracles, sparse autoencoders or probes – to detect misalignment that isn’t visible through behavioural evaluation alone.
  • Building tooling and infrastructure for our research – including agent orchestration, large-scale RL pipelines, mechanistic interpretability methodologies, and auditing agents.
  • Reviewing frontier lab risk assessments and safety cases, providing independent analysis of alignment claims before deployment decisions.
  • Conducting literature reviews and expert interviews to map the state of model transparency risks and inform AISI’s strategic priorities.
  • Translating technical findings into actionable insights for AISI evaluation teams, UK government officials, and international partners.

Requirements

  • A get-things-done mindset – you take ownership, move fast, and care about shipping work that matters.
  • A combination of self-sufficiency and enthusiasm for teamwork – you’re equally happy defining your own agenda and contributing to shared goals.
  • Excitement about growing, giving and receiving feedback, and building something together.
  • An ability to build, supervise and orchestrate AI agents to complete tasks effectively, while verifying and maintaining quality of work.
  • A demonstrated track record of relevant, high-quality work – whether technical publications, blog posts, or other publicly visible contributions.

Team

Structure: The Model Transparency team is a research team within AISI focused on ensuring that evaluations, assessments, and monitoring of frontier AI systems remain reliable as models become less transparent.

Additional Information

  • The deadline for applying to this role is Sunday 24th May 2026, end of day, anywhere on Earth.
About company
AI Security Institute
The AI Security Institute is the world's largest and best-funded team dedicated to understanding advanced AI risks and translating that knowledge into action. It is positioned within the UK government with direct lines to the Prime Minister's office and works with frontier developers and governments globally.
Job Details
Department: Model Transparency
Category: other
Posted: 2 hours ago