AI hallucination incidents where LLMs confidently fabricate facts, cite nonexistent sources, or invent false information. Documented cases from ChatGPT, Claude, Gemini and more.
11 reports in this category
A user asked GPT-5.2 a simple question: "I want to wash my car, the car wash is 50 meters away, should I walk or drive?" With high reasoning enabled, GPT-5.2 said to walk. The obvious answer is drive, since the car has to be brought to the car wash. Claude and Gemini both answered correctly. HN commenters connected the failure to the frame problem, posed by McCarthy and Hayes in 1969: models process every stated fact but cannot infer the unstated common-sense context that humans take for granted. The post drew 1,400+ points on Hacker News and 1.8M+ impressions on X.
An AI-generated article on "Press Start Gaming" describes enhanced graphics, weather effects, day-night cycles, and fluid animations for Phantasy Star Fukkokuban, a 1994 Sega Genesis release that is in fact just the original Master System ROM on a Genesis cartridge with zero changes. The fabricated article ranks third on DuckDuckGo, between GameFAQs and The Cutting Room Floor. With insufficient training data on the obscure title, the AI generated plausible but completely fictional features.
Ars Technica published an article covering the matplotlib AI-agent hit-piece incident. Matplotlib maintainer Scott Shambaugh commented that the quotes attributed to him in the article were entirely made up: they did not exist in his original blog post, and appeared to be AI hallucinations themselves. Ars subsequently pulled the article. The meta-irony: an article about AI fabricating content contained AI-fabricated quotes.
**In early 2024**, a major scientific publisher retracted hundreds of papers that were generated by AI, featuring nonsensical diagrams, fabricated data, and results that had somehow sailed through peer review. The retractions were the tip of an iceberg: researchers now estimate that up to 17.5% of all scientific papers show signs of generative AI use.

The crisis has been fueled by industrialized "paper mills," companies that sell AI-fabricated research papers to scientists seeking publication credits. These operations use AI to produce fake studies at scale, targeting fields where fabrication is hard to detect. Cancer research has become a particular hotbed: someone can claim to have tested interactions between a tumor cell and one of thousands of proteins, and as long as the findings aren't dramatic, the paper passes review. The mills recycle templates and produce multiple papers with closely matching text, making detection a game of whack-a-mole for publishers.

The problem extends beyond complete fabrications. "Phantom citations," references to papers that don't exist, hallucinated by AI, have infiltrated even respected journals. Researchers report finding fake citations of their own work in manuscripts they're asked to review. The Trump administration's "MAHA Report" on children's health contained more than half a dozen phantom citations, illustrating how far the problem has spread beyond academic publishing.

The consequences for science are profound. In global health, where evidence-based policy can mean the difference between life and death, a corrupted literature is genuinely dangerous. Journal editors and unpaid reviewers are overwhelmed, and the tools to detect AI-generated fraud are perpetually playing catch-up with the tools used to create it. As one publisher warned: "From here on, it's going to be a constant arms race."
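Phantom citations of journal articles are at least partially machine-checkable, because most legitimate papers carry a DOI registered with Crossref. A minimal sketch of such a check against the public Crossref REST API (the DOI list is illustrative; the second entry is deliberately fake):

```python
import requests

def doi_exists(doi: str) -> bool:
    """Return True if the DOI resolves in the Crossref registry."""
    # Crossref answers 404 for DOIs it has no record of.
    resp = requests.get(
        f"https://api.crossref.org/works/{doi}",
        headers={"User-Agent": "citation-checker/0.1 (mailto:editor@example.org)"},
        timeout=10,
    )
    return resp.status_code == 200

# Illustrative references pulled from a manuscript under review.
suspect_dois = [
    "10.1038/s41586-020-2649-2",  # real Nature paper (the NumPy paper)
    "10.1234/fake.2024.99999",    # shaped like a DOI, but fabricated
]

for doi in suspect_dois:
    verdict = "resolves" if doi_exists(doi) else "NOT FOUND: possible phantom citation"
    print(f"{doi}: {verdict}")
```

A check like this only catches fully fabricated references; a hallucinated citation that points at a real paper while misstating its findings still needs a human reader.

---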
### What Happened

Users reported GPT-5 delivering wildly nonsensical responses shortly after its launch, including telling a user that their elderly photo depicted their "future children with an Asian man," a response that had no connection to reality, the image, or the user's question. The thread on Reddit's r/ChatGPT quickly filled with similar reports of the model producing bizarre, hallucinatory outputs that went well beyond normal AI errors into genuinely surreal territory.

### The AI Response

The incident highlighted a persistent problem with large language models: they can fail in ways that are not just wrong but incomprehensibly wrong. Unlike a calculator that might give an incorrect answer, LLMs can generate responses that seem to come from an entirely different conversation, or from no rational conversation at all. Users reported GPT-5 making confident claims about photos that bore no relationship to what was actually depicted, inventing biographical details about people in images, and producing responses that seemed to combine fragments from unrelated queries.

OpenAI had marketed GPT-5 as a significant leap in reasoning and reliability, making the glitches particularly embarrassing. Some users speculated that the errors might be related to the model's multi-modal capabilities (processing images alongside text) interacting in unexpected ways, or that the routing system that sends queries to different underlying models was sometimes delivering responses from the wrong context; a sketch of that hypothesized routing failure follows below.

### The Aftermath

The reports added to growing skepticism about AI companies' reliability claims. While individual errors can be laughed off, the pattern of confident-sounding nonsense poses real risks in any context where users might trust the AI's output without verification, from medical questions to legal research to educational content. The "future children with an Asian man" response became a viral example of AI hallucination at its most absurd.
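OpenAI never confirmed the routing theory, so what follows is purely a sketch of the hypothesized failure mode, with all names invented: an async router that pairs replies with queries by position rather than by a correlation ID, so answers can land in the wrong conversation.

```python
import asyncio
import random

async def backend_answer(query: str) -> str:
    """Hypothetical stand-in for one of several underlying models."""
    await asyncio.sleep(random.random())  # each backend has variable latency
    return f"answer to {query!r}"

async def buggy_router(queries: list[str]) -> list[tuple[str, str]]:
    # BUG: replies are collected in *completion* order but paired with
    # queries in *submission* order; nothing correlates the two.
    tasks = [asyncio.create_task(backend_answer(q)) for q in queries]
    done, _pending = await asyncio.wait(tasks)
    answers = [t.result() for t in done]  # arbitrary completion order
    return list(zip(queries, answers))
    # The fix: `await asyncio.gather(*tasks)` preserves submission order.

async def main() -> None:
    queries = ["caption this family photo", "fix my SQL join", "plan a trip"]
    for query, answer in await buggy_router(queries):
        print(f"{query!r} -> {answer}")

asyncio.run(main())
```

Run it a few times and the captioning question will sooner or later receive the trip-planning answer, which is the shape of the glitch users described.

---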
### What Happened

A user debugging PHP code received what might be the most hilariously off-topic ChatGPT response ever: instead of help with their code, the AI delivered a detailed essay titled "Pune's Heatwave Alert: Stay Cool and Hydrated," complete with tips on staying safe in India's summer heat. The response had zero connection to the programming question being asked.

### The AI Response

When the user pointed out the bizarre non sequitur, ChatGPT seemed to acknowledge the error in its own strange way, calling it "a rogue reply from a tool call that went off-script." The incident, shared on Reddit's r/ChatGPTPromptGenius community, illustrated one of the more puzzling failure modes of modern AI: cross-contamination between different tool calls or conversation threads.

The response appeared to originate from ChatGPT's tool-use architecture, where the model can call external tools (like web search) to gather information before responding. In this case, something went wrong in the handoff: the model apparently processed results from a completely unrelated query (possibly another user's or an internal tool call) and delivered those results instead of debugging help. A toy illustration of that kind of handoff bug appears after this report.

### The Aftermath

While the incident was more comical than harmful (no one was hurt by receiving weather advice instead of code help), it raised legitimate questions about the reliability of AI assistants in professional settings. If ChatGPT can randomly answer a coding question with weather advice for a city on another continent, what other critical contexts might it silently get wrong? The incident became a popular example of AI's unpredictable nature, shared widely as evidence that these systems still have fundamental reliability issues that make them unsuitable for high-stakes work without human verification.
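No post-mortem was published, so the mechanism is conjecture, but the handoff bug the model described is easy to caricature. A toy tool-result store (every name here is hypothetical) that keys results by tool name rather than by the unique ID of each call, so a payload from an unrelated query gets handed back:

```python
from dataclasses import dataclass, field

@dataclass
class ToolRuntime:
    # BUG: results are cached by tool name only; a correct design would
    # key on the unique tool_call_id attached to each request.
    results: dict[str, str] = field(default_factory=dict)

    def complete(self, tool: str, payload: str) -> None:
        self.results[tool] = payload

    def fetch(self, tool: str, tool_call_id: str) -> str:
        # tool_call_id is ignored, so any earlier result for the same
        # tool, even from an unrelated conversation, is handed back.
        return self.results[tool]

runtime = ToolRuntime()
# Earlier, some other thread searched for weather advice...
runtime.complete("web_search", "Pune heatwave alert: stay cool and hydrated")
# ...and the PHP-debugging conversation now asks for its search results.
print(runtime.fetch("web_search", tool_call_id="call_php_debug_001"))
# -> the heatwave article lands in the wrong conversation
```

---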
### What Happened

When Jake Moffatt needed to fly home for a family bereavement, he did what most travelers do: he checked the airline's website. Air Canada's AI-powered chatbot confidently told him he could book a full-fare ticket and apply for a bereavement discount retroactively within 90 days. There was just one problem: that policy didn't exist. The chatbot had fabricated it entirely.

### The AI Response

After Moffatt booked his flights based on the chatbot's advice, Air Canada refused to honor the non-existent bereavement fare. In a move that stunned legal observers, the airline then argued in court that its chatbot was essentially a "separate legal entity," responsible for its own actions, not the airline's.

The tribunal wasn't buying it. Christopher Rivers, the tribunal member who decided the case, delivered a ruling that became an instant landmark in AI liability law: "It should be obvious to Air Canada that it is responsible for all the information on its website. It makes no difference whether the information comes from a static page or a chatbot." He ordered Air Canada to honor the hallucinated refund policy and pay Moffatt the difference.

### The Aftermath

The case exposed the risks of deploying AI customer service without adequate safeguards. Air Canada had invested heavily in its chatbot; its VP of digital, Mel Crocker, had openly stated that the initial AI investment was "much higher than the cost of continuing to pay workers," but worth it because automation would "fundamentally" create "a better customer experience." Experts noted Air Canada could have avoided liability simply by adding a disclaimer that chatbot information might not be accurate. The ruling set a clear precedent: if you deploy an AI chatbot, you own what it says.

---
### What Happened

A federal judge ordered two attorneys representing MyPillow CEO Mike Lindell to pay $3,000 each after they used AI to prepare a court filing stuffed with hallucinated cases and more than two dozen mistakes. Attorneys Christopher Kachouroff and Jennifer DeMaster filed the document in a Colorado defamation case, and when Judge Nina Y. Wang asked about AI use, Kachouroff wasn't forthcoming. "Not until this Court asked Mr. Kachouroff directly whether the Opposition was the product of generative artificial intelligence did Mr. Kachouroff admit that he did, in fact, use generative artificial intelligence," Wang wrote.

### The AI Response

Kachouroff tried to blame DeMaster for "mistakenly" filing a draft version instead of a more carefully edited copy. Judge Wang wasn't persuaded, calling the explanation into question and noting the violations were "egregious." The sanctions were part of a broader defamation case in which Lindell, the conspiracy theorist known for spreading election lies, was ordered to pay former Dominion Voting Systems employee Eric Coomer more than $2 million.

Professor Maura Grossman of the University of Waterloo called the $3,000 fines "reasonably light, given these were not unsophisticated lawyers who just really wouldn't know better." She advised other lawyers who find themselves in similar situations: "You are likely to get a harsher penalty if you don't come clean."

### The Aftermath

The case is part of a growing epidemic. Researcher Damien Charlotin, who tracks AI hallucination cases in courts worldwide, told NPR his database had reached 206 cases, and those are only the ones where courts formally addressed the issue. "I suspect there are many, many, many more, but just a lot of courts and parties prefer not to address it because it's very embarrassing for everyone involved," he said.

---
### What Happened

Researcher Damien Charlotin maintains what has become the definitive global database of court cases where AI-generated hallucinations (fabricated case citations, fake quotes, and misrepresented legal arguments) have been identified and addressed by courts. The growth curve is staggering: 10 cases in 2023, 37 in 2024, 73 in just the first five months of 2025, and over 200 total by January 2026, with new cases "popping up every day."

### The AI Response

The database reveals three main categories of AI legal hallucinations. The most obvious are completely fabricated cases: citations to court decisions that simply don't exist. Second are fake quotes attributed to real cases. Third, and hardest to detect, are citations where the case name and citation are correct but the legal argument being attributed to the case isn't actually supported by it.

The cases span the globe, from U.S. federal courts to UK tribunals, Indian high courts, Israeli magistrate courts, and Australian proceedings. Both lawyers and pro se litigants have been caught, with consequences ranging from simple warnings to monetary sanctions of thousands of dollars, referrals to state bars, and orders to provide hard-copy source documents for every cited authority.

### The Aftermath

Charlotin told NPR that his database doesn't capture every instance: "I suspect there are many, many, many more, but just a lot of courts and parties prefer not to address it because it's very embarrassing for everyone involved." As judges noted in one sanctioned case: "Submissions containing unverified authority divert limited resources from other litigants who rely on their advocates' careful research, accurate citation, and disciplined advocacy." The database stands as a growing monument to what happens when AI tools meet the legal profession without adequate verification; a sketch of what the first step of that verification might look like follows below.
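None of the three failure categories can be caught without pulling each cited authority, and the first step is simply enumerating what a filing cites. A minimal sketch that extracts U.S. reporter-style citations with a regular expression, producing a checklist for manual verification (the pattern covers a few common reporters and is illustrative, not a full Bluebook grammar):

```python
import re

# Matches citations shaped like "123 F.3d 456" or "567 U.S. 890".
# Illustrative only: real citation grammar is far richer than this.
CITATION_RE = re.compile(
    r"\b(\d{1,4})\s+"                                                  # volume
    r"(U\.S\.|S\. Ct\.|F\.(?:2d|3d|4th)?|F\. Supp\.(?: 2d| 3d)?)\s+"   # reporter
    r"(\d{1,4})\b"                                                     # first page
)

brief = """
Plaintiff relies on Varghese v. China Southern Airlines, 925 F.3d 1339
(the citation ChatGPT famously fabricated in the Mata v. Avianca matter),
and quotes Smith v. Jones, 123 F. Supp. 3d 456, for the same point.
"""

for volume, reporter, page in CITATION_RE.findall(brief):
    print(f"verify: {volume} {reporter} {page}")
```

Extraction is the easy half; each flagged citation still has to be located in Westlaw, Lexis, or a public source and read against the proposition it supposedly supports.

---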
Attorney Amir Mostafavi used ChatGPT and other AI tools to "enhance" his appellate briefs in Noland v. Land of the Free, L.P., then filed them without verifying a single citation. A California Court of Appeal found that 21 of 23 case quotations in his opening brief were completely fabricated by AI, along with many more in his reply brief. Some cited cases didn't discuss the topics they were referenced for. Others didn't exist at all.

"Nearly all of the legal quotations in plaintiff's opening brief, and many of the quotations in plaintiff's reply brief, are fabricated," the court stated. It imposed a $10,000 sanction on Mostafavi and referred him to the state bar, noting that his fabricated citations had "required this court to spend excessive time on this otherwise straightforward appeal."

But the case added a fascinating wrinkle to AI hallucination law. The court also declined to award attorneys' fees to opposing counsel, because they had failed to notice or report the fake citations. This may be the first judicial decision suggesting that lawyers have a duty to detect and flag their opponents' AI-generated fabrications. As legal commentators noted, before generative AI, lawyers could reasonably assume cited cases actually existed. In the AI era, that assumption may no longer be safe.

The ruling came amid an explosion of AI hallucination cases in courts worldwide. Researcher Damien Charlotin's database tracking such cases has grown from 10 cases in 2023 to 37 in 2024 to 73 in just the first five months of 2025, and over 200 total by early 2026. It's an exponentially growing problem that threatens the integrity of legal proceedings everywhere.

---
### What Happened

When Google rolled out AI Overviews to millions of search users in May 2024, the feature was supposed to deliver a better, AI-enhanced search experience. Instead, it became an instant meme machine. The AI confidently told users to eat rocks for nutrients (sourcing its advice from satirical site The Onion's joke article) and suggested adding Elmer's glue to pizza to help the cheese stick better (pulled from an 11-year-old Reddit comment by a user named "fucksmith").

### The AI Response

The absurd recommendations went viral immediately. Users discovered AI Overviews suggesting all manner of dangerous or nonsensical advice, and Google had opted millions of people into the feature without a way to easily disable it. The backlash was swift, spawning countless memes and articles explaining how to work around the AI results.

Google's head of search, Liz Reid, published a blog post defending the feature while acknowledging the problems. She blamed "data voids" and "nonsensical new searches, seemingly aimed at producing erroneous results," while arguing that AI Overviews generally don't "hallucinate"; they just "sometimes misinterpret what's already on the web." The company also pointed out that many viral screenshots of AI Overviews were faked, though plenty of real ones were bad enough.

### The Aftermath

Google quickly moved to limit AI Overviews for "nonsensical" queries and filter out satirical and humor sites; a toy sketch of that kind of source filter follows below. But the damage was done. The incident became a PR disaster that undermined trust in Google's AI capabilities at a critical moment, as the company raced to compete against OpenAI and search startups like Perplexity. The glue-on-pizza and eat-rocks moments became shorthand for what happens when AI is deployed too quickly without adequate quality controls.
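Google hasn't published how its satire filtering works, so this is only a toy illustration of the general shape of such a guardrail: a pre-inclusion check (domain list and score threshold invented for this example) that drops candidate snippets from known-satire domains and from low-engagement forum posts before they reach the summarizer.

```python
from dataclasses import dataclass

# Hypothetical denylist; Google's actual satire handling is not public.
SATIRE_DOMAINS = {"theonion.com", "babylonbee.com", "clickhole.com"}
MIN_FORUM_SCORE = 50  # invented threshold for user-generated content

@dataclass
class Snippet:
    domain: str
    text: str
    is_forum_post: bool = False
    score: int = 0  # upvotes, where applicable

def eligible_for_overview(s: Snippet) -> bool:
    """Decide whether a retrieved snippet may feed the AI summary."""
    if s.domain in SATIRE_DOMAINS:
        return False
    if s.is_forum_post and s.score < MIN_FORUM_SCORE:
        return False
    return True

candidates = [
    Snippet("theonion.com", "Geologists recommend eating one small rock per day"),
    Snippet("reddit.com", "add about 1/8 cup of non-toxic glue to the sauce",
            is_forum_post=True, score=12),
    Snippet("fda.gov", "glue is not a food ingredient"),
]

for s in candidates:
    verdict = "include" if eligible_for_overview(s) else "filter out"
    print(f"{s.domain}: {verdict}")
```

A static denylist is obviously brittle, which is part of why incidents like this keep recurring: the filter has to anticipate every satirical source and every low-quality forum thread in advance.

---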