Zuckerman postdoctoral fellow working with Yonatan Belinkov and David Bau on interpretability and robustness in language models. Incoming Assistant Professor at Boston University Computer Science (Fall 2025).
Email: λ@northeastern.edu, where λ=aa.mueller
I am interested in evaluating and improving the robustness of NLP systems. My work spans causal and mechanistic interpretability methods; evaluations of language models inspired by linguistic principles and findings in cognitive science; and building more sample-efficient language models.
I completed my Ph.D. in Computer Science at the Center for Language and Speech Processing at Johns Hopkins University under the supervision of Tal Linzen and Mark Dredze. My dissertation analyzed the behaviors and mechanisms underlying emergent syntactic abilities in neural language models. My Ph.D. studies were supported by a National Science Foundation Graduate Research Fellowship.
I completed my B.S. in Computer Science and B.S. in Linguistics at the University of Kentucky, where I was a Gaines Fellow and Patterson Scholar. My thesis, which focused on neural machine translation for low-resource French dialects, was advised by Ramakanth Kavuluru and Mark Richard Lauersdorf.
Three papers to appear at ICLR and three at NAACL!
I will be attending and speaking at the Bellairs Workshop on Causality
Invited talk at Tel Aviv University
Invited talk at the Technion
New preprint! Counterfactuals are everywhere in mech interp, but they have key issues that will bias our results if we're not careful.
New preprint! NNsight and NDIF are tools for democratizing access to and control over the internals of large foundation models.
New preprint on the benefits of human-scale language modeling
Invited talks at Saarland University and EPFL
Invited talk at Maastricht University
Presented a paper at NAACL
Invited talk at UCSB
New preprint! We propose sparse feature circuits to discover and edit mechanisms of LM behavior.
Invited talk at Nokia Bell Labs
Invited talks at Brown University and University of Pittsburgh
Our paper on function vectors was accepted to ICLR
The Findings of the BabyLM Challenge are out!
The Inverse Scaling Prize was featured in TMLR
New preprint: in-context learning yields different behaviors on ID vs. OOD examples
Our paper received an outstanding paper award
Four papers at ACL. See you in Toronto!
The BabyLM Challenge was featured in the New York Times