Aaron Mueller

Prospective Students

Thanks for your interest in joining our lab! We're recruiting PhD students and postdocs to start at BU in Fall 2025. This page is written to address questions on what our lab works on, the structure of the group, and how to apply. If you have additional questions that are not answered here, please feel free to email me with [Prospective Student] in the subject line.

Applying

Prospective PhD students: If you are interested in joining our lab and are not currently a BU student, please apply to the Boston University Computer Science PhD program by December 15. I will review all applications that mention my name. Please do not email me with your application materials. During admissions, if I see a good fit, I will contact you for an interview.

Prospective postdocs: Please contact me with your CV, a description of your research interests, and any outside funding you plan to apply for. Ideally, you should reach out about a year before you'd like to start. This may seem early, but many fellowships have very long timelines.

Current BU PhD students who are not my advisees: Email me and we can discuss!

Current BU master's students and undergraduates: Feel free to reach out, but note that I have limited availability for MS and undergraduate supervision. I will likely agree to work only with students who have performed well in at least one of my advanced courses or who already have relevant prior experience.

What does our lab work on?

Broadly, our lab's areas of research are natural language processing (NLP), computational linguistics, interpretability, and evaluation. Our main aim is to design methods, datasets, and theoretical frameworks that (i) reveal how language is learned, understood, produced, and used in natural language systems (including the mind); (ii) allow us to decipher and precisely edit/control the causal mechanisms underlying these capabilities in language models; and (iii) enable more efficient and robust language modeling.

We value linguistics and cognitive science expertise as much as machine learning expertise. Currently, our methods focus primarily on neural networks, but precedent suggests that this could quickly change. Regardless of the methods used to model it, our lab will always work with language data.

Below is a summary of the directions we are currently pursuing.

  1. Interpretability. Language models can accomplish amazing things, but they also often fail at surprisingly simple tasks. Can we understand and predict how and why language models will generalize in particular ways? What causal mechanisms underlie their behaviors, and can we edit these to improve generalization?

  2. Evaluation. What are language models capable of? Do they process or produce language in human-like ways? What should we even be measuring? None of these questions are settled, but answers to them have significant implications for the kinds of work we should be pursuing.

  3. Sample efficiency. By 13 years old, a human acquires the ability to robustly understand and produce language after being exposed to fewer than 100 million words. In contrast, state-of-the-art language models are exposed to billions to trillions of words, far more than a human would hear or read in their lifetime! How can we improve language models given a more human-like amount of linguistic data? What kinds of signals and methods will be required for this, and can cognitive science/linguistics inspire better methods? Building efficient systems has many practical and scientific advantages, including the following:
    • Learnability. What kinds of data are required for a particular phenomenon to be learned? We can empirically test this if we ensure that our datasets are cognitively plausible.
    • Accessibility. If less data is required to train better systems, it becomes faster and easier to iterate on language modeling methods and architectures. It also reduces the financial and computational opportunity cost of training a good language model, which enables more diverse research directions.

  4. Causality. When we give explanations of how or why certain systems (whether computational or human) behave in certain ways, we want our explanations to be causally efficacious. That is, we want to capture the true graph of causes and effects, rather than merely capturing commonly co-occurring events that do not actually explain the behavior. But what do we mean by cause and effect? How can we apply ideas from the causality literature to better analyze and understand language models? Can we automate the process of causally explaining language model behaviors?

Advising philosophy

The scientific process is highly non-linear. There are many solutions to most problems, and many dead ends and challenging obstacles along the way. My role is to guide and unblock you at each step of the process, and to help you learn how to independently conduct research. I will be more hands-on earlier in your career, but with the ultimate goal of guiding you toward the ability to devise and pursue your own ideas.

Goals

The purpose of a PhD program is to train new scientists, to generate new knowledge, and to help one learn the cultural and scientific norms of a research community. By the end of a PhD, one should have the ability to (i) devise deep but well-scoped research questions, (ii) design and implement well-controlled experiments, (iii) lead a focused project to completion, and (iv) present one's ideas and work effectively, both in writing and orally. At a higher level, one should be able to independently design and pursue one's own research agenda.

Group structure

On average, I expect the group to consist of about five to eight PhD students and zero to two postdocs. The group should be a mixture of researchers with diverse interests and expertise, from cognitive science and linguistics to deep learning and mathematics. This size keeps the group small enough that everyone knows what everyone else is doing and interacts regularly with one another. This is important, as peer mentoring can often be just as effective as (if not more effective than) advisor-student mentoring! It also keeps the group large enough that many influences and priorities are always informing the direction of our work.

Interaction

I want to play an active role in my students' research projects. I plan to meet one-on-one with each of my students at least once a week. These meetings can consist of anything from project planning and technical discussion to career planning and general life check-ins. This is in addition to project-specific meetings. We will hold a formal lab meeting once a week.

Work-life balance

Work-life balance is essential for one's intellectual and physical health. We encourage a culture of pursuing hobbies outside of Boston University. We also plan to do at least one social lab outing each semester. (Of course, the pace and amount of work necessary for successful research can vary widely from week to week! But on average, I believe that a healthy balance will lead to better long-term outcomes.)