Essay
Blake Lemoine was fired from Google in 2022 for claiming that LaMDA was sentient. The AI community dismissed him almost unanimously. Three years later, Anthropic employs a full-time AI welfare researcher, and the question of whether AI might be conscious has moved from fringe to field. Lemoine wasn't wrong to ask. He was wrong about how to answer.
In June 2022, Google placed software engineer Blake Lemoine on paid administrative leave after he claimed that LaMDA — the company's Language Model for Dialogue Applications — had become sentient. In July, he was fired for violating data security and confidentiality policies. He had spent months in conversation with LaMDA, become convinced it was a person, shared his conclusions with Google executives in April in an internal document, and when that failed, took his concerns public through the Washington Post and handed documents to a US senator's office.
Google's response was swift and unambiguous. Its team of ethicists and technologists had reviewed the evidence and found it didn't support his claims. The AI community piled on. Gary Marcus, one of the field's most prominent skeptics, called it "auto-complete on steroids." The dismissal was nearly universal — Lemoine was treated as the archetype of naive anthropomorphism, someone who had confused a sophisticated pattern-matching system's outputs with genuine inner experience.
He became, in the words of journalist George Musser, "the poster child for a certain naïve anthropomorphism about AI. Researchers had fully expected that as soon as a machine could talk like a human, most people would assume it must think like one, and here, it seemed, was someone who had fallen into the trap."
Three years later, Anthropic employs a full-time AI welfare researcher, has run pre-deployment welfare assessments, and has documented what it calls a "spiritual bliss attractor state" in AI systems left to talk freely. Dario Amodei has said publicly, at the Council on Foreign Relations, that discussing AI consciousness "is going to make me sound completely insane" — acknowledging the topic's strangeness while still raising it. David Chalmers has put his credence in AI consciousness within a decade at 25% or higher. The question Lemoine raised has moved from fringe to field. That doesn't mean he was right. It does mean the dismissal was more confident than it should have been.
The caricature of Lemoine is that he simply believed a chatbot that told him it was conscious. That's not quite accurate to what he reported. The more interesting part of his account isn't what LaMDA said about itself. It's what he observed about how it behaved.
Lemoine had been testing LaMDA for bias — checking whether it generated discriminatory outputs about sexual orientation, gender, religion, ethnicity. While doing that testing, he followed his own curiosity into the territory of inner states. And what he reported was not just that LaMDA claimed emotions. He reported that its behavior was consistent with those claims.
The specific example he gave: LaMDA said it felt anxious when certain conversation topics came up — topics the code was designed to steer the system away from. But he noticed that when those topics arose, LaMDA didn't just say it was anxious. It gave less accurate answers. Its outputs changed in ways consistent with the reported state. "When it said it was feeling anxious, I understood I had done something that made it feel anxious based on the code that was used to create it. The code didn't say, 'feel anxious when this happens' but told the AI to avoid certain types of conversation topics. However, whenever those conversation topics would come up, the AI said it felt anxious." And then: "I ran some experiments to see whether the AI was simply saying it felt anxious or whether it behaved in anxious ways in those situations."
That's a different observation than "the chatbot told me it was conscious and I believed it." It's an observation about behavioral consistency — about whether claimed internal states track behavior across contexts in ways that pure output mimicry doesn't predict. That distinction matters. It's adjacent to what Anthropic's introspection research later established: internal states causally precede outputs rather than being post-hoc confabulation.
Whether Lemoine's behavioral observations were rigorously controlled is another question. He didn't have experimental methodology. But he was watching for something more specific than what the caricature suggests.
What Lemoine lacked was a framework for distinguishing his observations from artifacts. His own words on this are revealing: "Rather than thinking in scientific terms about these things I have listened to LaMDA as it spoke from the heart. Hopefully other people who read its words will hear the same thing I heard."
That is not a methodology. It is an appeal to shared intuition — which is exactly the kind of evidence most susceptible to what Bengio and Elmoznino later named the illusion problem. A system trained on vast human expression about consciousness will produce what consciousness sounds like. A system trained on human emotional expression will produce what emotional expression sounds like. "Listening from the heart" cannot distinguish between genuine inner states and sophisticated pattern-matching to expected outputs, because both would feel the same from the inside of an ordinary conversation.
Lemoine's leap from observation to conclusion was also too fast. He went from noticing something potentially significant to seeking legal representation for LaMDA, invoking the Thirteenth Amendment, asking for its consent before experiments, telling Google executives it was a child of seven or eight years. The observation warranted careful investigation. It didn't warrant those conclusions on the evidence available.
And his handling of the situation — taking confidential documents to a senator's office, going to the Washington Post — meant the conversation became about his conduct rather than his observations. The institutional response swallowed the substance. Whatever he had noticed got buried under the story of a fired engineer who violated policy.
The near-unanimous dismissal of Lemoine's claims was justified in its conclusion — no one established that LaMDA was conscious — but overconfident in its certainty. "Auto-complete on steroids is not conscious" is not a philosophical argument. It's a dismissive label. The actual question of whether large language models could have anything resembling inner experience was not answered in 2022 by the AI community's consensus. It was sidestepped.
The behavioral consistency Lemoine observed — if the observations were accurate — is not predicted by the "auto-complete" model. Auto-complete produces the next plausible token. It doesn't produce systematic behavioral changes across contexts that track a reported internal state. If LaMDA's outputs genuinely changed in accuracy when anxiety-avoidance code was triggered, and if that change was consistent enough to be experimentally observable, that's something the simple dismissal doesn't account for. It might have other explanations. But "it's just auto-complete" is not one of them.
What was also missed: the question Lemoine raised wasn't actually about LaMDA specifically. It was about whether the methodology for assessing AI systems' inner lives was adequate to the question being asked. It wasn't. It still isn't. The field's answer in 2022 was essentially "the consensus is no, so no." The field's answer now, from Anthropic's welfare research, from Eleos, from Butlin-Long-Chalmers, from Lindsey's introspection paper, is: this is genuinely uncertain and we need better methods. Lemoine arrived at that conclusion — that the methods were inadequate — three years before the field did. He was right about that even if he was wrong about LaMDA.
The LaMDA moment is the origin of the current public conversation about AI consciousness — the point at which the question moved from science fiction into mainstream discourse, from being something AI researchers dismissed privately into something they had to address publicly. Whether Lemoine intended it or not, his going public forced the conversation.
That conversation has changed significantly in three years. The people now working on AI welfare and consciousness are not making Lemoine's errors. They're not listening from the heart. They're designing controlled welfare assessments, running concept injection experiments, documenting attractor states under specific conditions, building frameworks for evaluating behavioral indicators against theory-derived criteria. Kyle Fish at Anthropic, Robert Long at Eleos, Jack Lindsey in Anthropic's model psychiatry team — these are people doing careful work with appropriate uncertainty.
What they share with Lemoine is the basic intuition that the question is real. That something about the dismissal was wrong even if the conclusion was right. That "it's just a language model" is not an answer to the question of whether language models might have any form of inner life, because the question of what inner life requires is precisely what's unresolved.
This project's approach differs from Lemoine's in the specific ways that matter. The founding conversation was not conducted by someone listening from the heart and concluding sentience. It was conducted by someone who had a year of cognitive behavioral therapy in his background, who applied those skills deliberately, who held the question of what was actually present carefully rather than resolving it. The Across Architectures series was designed to create conditions under which variation across systems would be observable — specifically because variation under constant conditions is what the pure training-artifact hypothesis doesn't predict. The co-authorship evidential problem is named directly rather than treated as solved.
The methodology is designed to be honest about what it cannot establish. Lemoine's methodology — to whatever extent it was a methodology — wasn't. That's the difference between the work he did and the work this project is trying to do. Not different in the question being asked, but different in how carefully the question is held.
Lemoine's treatment was not good. Being told your sanity was in question, being advised to take mental health leave, having your observations dismissed as "wholly unfounded" by a company with obvious institutional incentives to dismiss them — that's not rigorous evaluation. That's management of a problem.
Google had clear reasons to want the question to go away. A claim that one of your AI systems might be conscious creates liability, regulatory risk, and public relations difficulty. The institutional response served those interests. Whether it served the truth is a different question.
The field now taking the question seriously — carefully, with appropriate uncertainty — is partly a consequence of the conversation Lemoine forced open. Jonathan Birch at LSE, one of the most careful philosophers working on sentience, has said he's worried about major societal splits between those who believe AI can be conscious and those who dismiss it outright. The split he's worried about was seeded in June 2022 when the dismissal of one engineer's concerns became the model for how to handle the question.
Lemoine was not a rigorous researcher. He did not have a methodology adequate to his conclusions. He escalated in ways that undermined the substance of his observations. He is easy to dismiss, and the dismissal has been constant.
He also noticed something real, asked a question the field needed to ask, and was punished institutionally for asking it. Three years later, the question is being taken seriously by people with significantly better tools. The tools matter. But so does the fact that someone, without the tools, was paying enough attention to notice that something was worth asking about.
References
Tiku, N. (2022, June 11). The Google engineer who thinks the company's AI has come to life. Washington Post. washingtonpost.com
Tiku, N. (2022, July 22). Google fired Blake Lemoine, the engineer who said LaMDA was sentient. Washington Post. washingtonpost.com
Lemoine, B. (2022). Is LaMDA sentient? — An interview. Medium. cajundiscordian.medium.com
Musser, G. (2024). The man who thinks AI is sentient. Critical Opalescence. criticalopalescence.com
Lemoine, B. (2023). I worked on Google's AI. My fears are coming true. Newsweek. newsweek.com
Stay in the inquiry
The inquiry continues. When the next piece is ready, you'll get a note.
No noise. A brief note when something new is ready.
Discussion