MATS Applications Open (Due Aug 29)

Aug 19

Written By Neel Nanda

TLDR

I am looking for people who want to be supervised by me to write a mech interp paper. Apply here now! Due Aug 29
Application task: Spend ~12 hours (max 20) working on a mechanistic interpretability research problem of your choice, and send me a write-up + executive summary of what you learned. (See advice, details, past examples and recommended problems in the doc)
The top ~32 candidates will do a 5 week paid online exploration phase (Sept 29 - Oct 31) ending in a 2 week research sprint in pairs.
- Expect unstructured, self-driven learning
The ~8 exploration phase candidates with the best sprint projects advance to the research phase, a 12 week paid and in-person program (Jan 5 - March 27)
- I have 1.5 hr/week check-ins with each pair, supervising them as they write a paper.
- The typical scholar publishes at least one co-first author paper at a top ML venue.
All backgrounds & experience levels welcome - I want to work with the most promising people, not just those with the best credentials!
- Past scholars include professors, undergrads with no mech interp experience, startup founders, and researchers with several great mech interp papers already

The key details and FAQ are copied below for convenience; the rest is in the doc

Key Details

Application task: Spend ~12 hours (max 20) trying to make research progress on a mechanistic interpretability problem of your choice
- Submit via this form, due Fri Aug 29th 11:59pm PT
- Please submit a write-up and executive summary showing me what progress you made and what you learned about the problem.
  - I value communication skill, so don’t rush the write-up! The time limit allows up to two additional hours for the executive summary.
  - See examples of successful past write-ups here
- See advice on approaching the application, how to use LLMs for research, recommended resources, and how I evaluate applications
  - You can take as much time as you want beforehand for general learning.
- My research interests have changed a fair bit from some of my prior work. I detail these here, and provide a long list of problems I’m currently excited about here.
- I’m open to submissions of existing mech interp work, but hold these to a higher standard (more info)
- If you’ve applied before, see here for a summary of changes
Key dates:
- Applications due Aug 29
- Decisions released Sept 16
- Exploration phase Sept 29 - Oct 31 (5 week online program for top ~32 candidates)
- Research phase decisions Nov 6
- Research phase Jan 5 - March 27 (12 week in-person program for top ~8 candidates)
All experience levels welcome: I want to work with the most promising people, not those who look best on paper.
- In MATS 8.0, 5 of my 8 scholars had minimal prior mech interp experience, but have been doing fantastically - by halfway through the program, some of them had:
  - Helped understand emergent misalignment (i.e. why training a model to write buggy code turns it into a Nazi) and been interviewed about it by MIT Tech Review
  - Explored new paradigms for interpreting reasoning models
- At the other extreme, I’ve had scholars who already had multiple great mech interp papers, like Arthur Conmy & Josh Engels, who say I still added a fair amount of value

FAQ

Why might you want to apply?

My core goal is to teach you how to do great mechanistic interpretability research.
I run the Google DeepMind mechanistic interpretability team and I have a lot of experience supervising research. In the past 3 years, I have mentored 50 junior researchers and supervised 30+ MATS papers and 15 top conference papers.
The program often helps scholars get into mech interp careers
- Seven now do interpretability research at frontier AGI labs, including Arthur Conmy, who works for me leading the GDM Applied Interpretability Team.
- Two alumni lead research teams at the UK government's AI Security Institute
Past scholars also do excellent research in the program itself, even those totally new to mech interp! Some highlights:
- Showing open source LLMs can be cheaply jailbroken with linear algebra, by ablating the refusal direction
  - This inspired projects at multiple frontier labs, including a Meta paper on fixing it.
- An ICLR oral using sparse autoencoders to interpret hallucinations, and showing models can “recognise” entities they know facts about.
- Using interpretability to shape how models generalize without changing any data, preventing emergent misalignment
- The first paper on transcoders, nine months before Anthropic's well-known papers on transcoders.
- Work exploring fundamental issues in sparse autoencoders and follow-up work that (mostly) fixed them.

Why is this application so much effort?

I care a lot about being meritocratic. This approach lets me find the best applicants, not just those who look good on paper. I do my best to assess your potential, not just what you’ve already done (though it’s still super noisy!)
I've also tried to design this application process so that spending time on it is useful whatever the outcome - I don’t want to waste 12+ hours of your time!
I think it's a pretty realistic simulation of doing research, especially if you haven’t done interpretability research before. Candidates often learn a lot, and are surprised by how much they can get done.
- I've sometimes heard from unsuccessful applicants that they enjoyed the application so much it convinced them to pursue a research career!
- If you’re not sure if you’re interested in doing mech interp or not, I’d encourage you to try applying! I think you'll learn a lot from the application about whether it's a good fit.

What am I looking for in an application?

My ideal application is one that teaches me something new.
- This looks like identifying an interpretability hypothesis, gathering evidence for and against it, and writing up the evidence and analysis clearly.
I value clear writing, good taste (i.e. choosing interesting problems and making good decisions), technical skill, truth-seeking, skepticism and pragmatism
See a much more detailed explanation in this tab, along with past examples

What happens in the program?

The top ~32 candidates will do a 5 week online exploration phase Sept 29 - Oct 31
- The final two weeks (full time) are spent doing a research sprint in pairs. Admission to the research phase is largely based on sprint performance.
- The first three weeks (part time) are the preparation phase. This means preparing for the sprint: self-driven skilling up, doing several-day mini research projects with other scholars, going to talks/sessions, reading papers, etc. How you spend your time is up to you
- More info here
The top ~8 candidates from the exploration phase will do a 12 week in-person research phase in Berkeley Jan 5 - March 27
- Scholars work in pairs to write a mech interp paper, with a 1.5 hr/week check-in from me and some Slack support
- All recent scholars have published this as a co-first author paper at a top ML venue (NeurIPS/ICLR/ICML) - see lists of past work below
Research phase participants often do an optional 3-12 month extension, to finish their paper and sometimes publish a second.
All phases include a paid stipend. Housing support is provided in the research phase
See more info at matsprogram.org

What happens if I don’t get through to the research phase?

While unfortunately most exploration phase candidates don’t make it to the research phase, I’ve designed the exploration phase to be a valuable experience in its own right, and to teach useful research skills.
- The median participant rates it as 1.5x-2.5x as valuable as the counterfactual use of their time.
In MATS 8.0:
- 5 exploration-phase-only scholars found other MATS 8.0 mentors as a result of participating
- I helped 8-10 exploration-phase-only scholars write papers based on their sprint projects
Candidates are welcome to try again in the next cohort

Why shouldn’t I apply?

Obviously, the application takes a while! If it doesn’t sound fun, you probably shouldn’t do it.
The exploration phase of the program is fairly competitive, which some people find very stressful
- Generally, participants seem to be nice and cooperative, especially since you want to form teams, but the awareness of your chances can be very stressful for some
Most exploration phase events happen between 5pm-8pm UK time, which works badly for people in Asian time zones. But the events are not necessary for a valuable exploration phase!
The exploration phase is very self-driven and unstructured - I provide good opportunities, resources, advice, etc and you all have each other as collaborators, but ultimately there’s one of me and 30+ of you. You get out what you put in and need to decide how to spend your time. This works great for some, poorly for others
If you have a full-time job/are otherwise very busy, you may find it difficult to make time for the exploration phase.

How should I choose a problem?

I'm open to any application that shows strong research skill, but will be more excited about those matching my research interests
My research interests have changed a fair bit from some of my past work - more details below, but in brief I’m now fairly pessimistic about ambitious interpretability (i.e. complete reverse-engineering), and I’m excited about model biology (studying qualitative high-level properties of models) and applied interpretability (rigorously doing useful things with interp). I’m still interested in basic science, but have a higher bar.
- Applications that surprise me with something new and cool are fantastic!
I’m more agnostic about the best techniques. Things like sparse autoencoders are a useful tool, but it’s easy to waste effort on them when a simpler method is sufficient or better - start by doing the obvious thing!
I provide a long list of suggested problems here

Can I use LLMs?

Yes. In fact, I strongly recommend it! LLMs are a crucial research tool nowadays, and are especially useful for those getting into a new field.
- More advice on using LLMs well below
You're welcome to use them for coding, writing, etc, whatever you want - I want to gauge how well you’ll do as a researcher, which includes whatever tools you’d actually use.
- It is your responsibility to ensure your code and writing are high quality. Well-written write-ups are welcome. Docs that read like LLM slop will be rejected.
I recommend using Cursor for coding (replacing e.g. VS Code) and using Gemini 2.5 Pro for browser-based tasks
I've compiled a folder of useful text files for mech interp research, containing a bunch of relevant docs & source code of key libraries, tutorials from ARENA and key libraries, key papers and my relevant blog posts.
- By default, just put this 600k token file, which contains the most important documents, in Gemini’s context window.

How does a research supervisor add value?

My model is that research requires a mix of skills. The day-to-day coding and execution is crucial. But there's also a set of harder-to-learn conceptual skills, collectively called research taste. These skills take a long time to gain because they have poor feedback loops, but they take very little time to use.
My main role is to lend you my research taste and bootstrap your own. This looks like helping with:
- High-level Strategy: Choosing a good problem, knowing when to pivot away from a dead end, or prioritizing which of several promising directions to pursue.
- Experimental Design: Designing a clean experiment to conclusively test a hypothesis, thinking of alternative explanations for your results, or knowing when evidence is strong enough.
- Navigating the Field: I can also give pointers to relevant papers or techniques you might be missing, helping you avoid reinventing the wheel.
Finally, some people find it very helpful to have a de-facto light-touch manager who provides validation, accountability, and clarity.
Past scholars have given me the feedback that I’m good at red-teaming, generating ideas, and being motivating and invested in their projects, but that I expect people to be able to work independently and can be fairly blunt with feedback.

See a bunch more info and guidance in the other tabs of the doc
