That’s almost the title of a paper I read today, Do Multilingual LLMs Think In English?, which uses several methods to poke into what a language model actually does when responding to a prompt in a language other than English. Spoiler: it looks as though the model goes via English even when it’s trained on many languages, because there is more English training data. The authors have a couple of really nice examples that help explain how this works, which I’ll show you below.

Multilingual models are trained on data in many different languages, and since every token is “translated” into a vector (a list of numbers or coordinates that locates each concept or token in relation to the others in a big multidimensional imaginary space), the model itself doesn’t exactly have a language. Multilingual models can for instance recognise that a news story titled “United Kingdom buries Queen Elizabeth II after state funeral” and one in another language whose title translates as “Her Majesty Queen Elizabeth II of the United Kingdom of Great Britain and Northern Ireland dies at 96” are about the same event (the example is from here). But as you can see, the meaning isn’t actually precisely the same.
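To make the idea of “location in a multidimensional space” a bit more concrete, here is a minimal sketch in Python. The three-number vectors are invented for illustration (real models use hundreds or thousands of dimensions, and the third headline is a hypothetical unrelated story), but the comparison, cosine similarity, is a standard way of measuring how close two vectors are.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Invented toy "embeddings"; these numbers do not come from any real model.
headline_english = [0.9, 0.8, 0.1]    # the state funeral story in English
headline_other = [0.8, 0.9, 0.2]      # the same event reported in another language
headline_unrelated = [0.1, 0.2, 0.9]  # hypothetical story about something else

same_event = cosine_similarity(headline_english, headline_other)
different_event = cosine_similarity(headline_english, headline_unrelated)

print(f"same event, different languages: {same_event:.2f}")
print(f"different events: {different_event:.2f}")
```

The two headlines about the Queen end up close together but not identical, which matches the point above: the model can tell they are about the same event even though the meaning isn’t precisely the same.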

It seems likely that if most of the training data is in English, then the relationships between different concepts might be culturally closer to the way concepts are thought about in English-speaking countries. We also know that LLMs are worse at smaller languages, even languages spoken by a lot of people. However, there are also people who argue that multilingual LLMs might be “language agnostic”.

The preprint (which is not yet peer-reviewed as far as I know), titled “Do Multilingual LLMs Think In English?”, found several types of evidence that models are in fact going “via” English even when prompted in other languages and generating responses in other languages.

The authors are Lisa Schut, a PhD student in machine learning at Oxford, her supervisor Yarin Gal, and Sebastian Farquhar, who is a senior research scientist at Google DeepMind working on “reducing the expected harm of catastrophically bad outcomes from AGI”. Farquhar also works with Gal and Schut at the Oxford Applied and Theoretical Machine Learning Group.

Schut and her co-authors used three different techniques to see whether a set of LLMs did indeed prioritise English. Here is a diagram showing the results of a logit lens for a model asked to complete the sentence Le bateau naviguait en douceur sur l’, which is French for “The boat sailed smoothly on the calm of the”. A direct translation doesn’t really work in English grammar. The logit lens is a way of seeing the intermediate steps the model takes to generate the response. In this figure, each row is one set of tokens the model generated on its way to the final output. You can see the first iteration shown is mostly English words. You can sort of see that the associations make sense – if we’re talking about boats floating then concepts like river, water, lake, weather, sol (sun) seem related. Take a look at each line of words to see how the model moves associatively from one concept to the next.

Beyond the English words for lake, river, etc, notice the place names are from North America. Place names tend to attract lots of connections in LLMs because they often appear close to interesting descriptions. It’s obvious that Ontario might be associated with lakes because “Lake Ontario” would presumably count as two tokens that are very often next to each other in the training data. Westbrook is a mid-sized coastal town in Connecticut which is often mentioned in connection with boating, marinas and so on – there are lots of websites aimed at tourists about this. Now a bigger city like New York might have as many associations with boating and marinas, but it would also be connected with a lot of other topics. So Westbrook is more closely associated with marine themes than New York would be. Here are some screenshots of websites that I am guessing are in the training data:

Also, isn’t it interesting that some of the tokens are censored?

Now these intermediate layers are “fuzzy”: the LLM hasn’t actually settled on a token yet. We’re also not seeing all the iterations. Maybe “Westbrook” was a semi-random token, there for a moment and gone the next. Clearly it’s not part of the final output. But it’s part of the process.
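For readers who like to see the mechanics, here is a toy sketch of the logit lens idea in Python. Everything in it is an assumption made for illustration: the four-word vocabulary, the tiny “unembedding” matrix and the hidden states are invented numbers, not values from the paper or from any real model. Real implementations usually also apply the model’s final normalisation layer before decoding, which is left out here.

```python
import math

# Hypothetical four-token vocabulary and unembedding matrix (one row per token).
VOCAB = ["water", "lake", "eau", "sol"]
UNEMBED = [
    [1.0, 0.0, 0.0],  # water
    [0.6, 0.4, 0.0],  # lake
    [0.0, 0.0, 1.0],  # eau
    [0.0, 1.0, 0.0],  # sol
]

# Invented hidden states for the last position of the prompt, one per layer.
HIDDEN_BY_LAYER = [
    [0.2, 0.9, 0.1],  # early layer
    [1.5, 0.3, 0.4],  # middle layer
    [0.5, 0.1, 2.0],  # final layer
]


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    highest = max(logits)
    exps = [math.exp(value - highest) for value in logits]
    total = sum(exps)
    return [e / total for e in exps]


def logit_lens(hidden):
    """Decode one intermediate hidden state as if it were the final one."""
    logits = [sum(w * h for w, h in zip(row, hidden)) for row in UNEMBED]
    probs = softmax(logits)
    best = max(range(len(VOCAB)), key=lambda i: probs[i])
    return VOCAB[best], probs[best]


top_tokens = []
for layer, hidden in enumerate(HIDDEN_BY_LAYER):
    token, prob = logit_lens(hidden)
    top_tokens.append(token)
    print(f"layer {layer}: {token} ({prob:.2f})")
```

In this made-up example the most likely token drifts from “sol” to “water” and only lands on the French “eau” at the last layer. The probabilities at the intermediate layers are spread across several tokens, which is the “fuzziness” described above.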

Here is Figure 3 in their paper, which shows a similar logit lens process for a Dutch phrase. Here too there are many English words before the model ends up with the response shown in the bottom row.

Lisa Schut, Yarin Gal and Sebastian Farquhar write that the logit lens analysis shows that the model is routing lexical words like nouns and verbs through English:

In general, lexical words – nouns and verbs – are often chosen in English. These parts of speech influence the semantic meaning of the sentence. Other parts of speech, such as adpositions, determiners and compositional conjugates are infrequently routed through English in Aya-23-35B and Llama-3.1-70B.

The rest of the paper also discusses two other methods they used to test whether LLMs “think” in English: vector steering and “causal tracing to determine whether facts in different languages are encoded in the same part of the model”. Neither of these is explained in as much depth though, so I’ll leave them for now – but they conclude that while facts encoded in different languages do seem to share a common representation (which would suggest the model is language agnostic), they also find that “model output is most frequently in English, further underlining the English-centric bias of the latent space.”

A section of the paper offers an overview of current research on multilingual LLMs, so might be useful if you, like me, find this interesting. For us humanities people, I also found this point interesting: they write that you can research a model from an internal perspective, so for instance using things like the logit lens or other techniques for figuring out what the model is doing to generate a response, or an external perspective, which means analysing the output. The external perspective is what media scholars and ethnographers and literary scholars and so on tend to do, and what Hermann Wigers and I did in our paper analysing AI-generated stories from different countries, because we’re good at analysing texts and images. Anyway, Schut, Gal and Farquhar write:

Having a unifying theory that combines both perspectives is important – the internal perspective helps us understand the mechanisms underlying behavior, while the external perspective examines the real-world impact of that behavior.

So that’s a call for us to figure out how to work together.

References

Rettberg, Jill Walker, and Hermann Wigers. “AI-Generated Stories Favour Stability over Change: Homogeneity and Cultural Stereotyping in Narratives Generated by Gpt-4o-Mini.” Open Research Europe, vol. 5, no. 202, 2025 [version 1; peer review: awaiting peer review], https://doi.org/10.12688/openreseurope.20576.1.

Schut, Lisa, et al. “Do Multilingual LLMs Think In English?” arXiv:2502.15603, arXiv, 21 Feb. 2025. arXiv.org, https://doi.org/10.48550/arXiv.2502.15603.

