
AI Just Broke Wikipedia and Nobody Noticed for Weeks

An AI translation project silently added hallucinated sources to Wikipedia articles. The real failure wasn't the AI. It was the missing oversight.

ai · hallucinations · quality-control · automation · lessons

The Open Knowledge Association has been using AI to translate Wikipedia articles at scale. Good intentions. Bad execution. The AI hallucinated sources. Made up citations. Replaced real references with fabricated ones. Published them to Wikipedia where millions of people trust the information.

Wikipedia editors eventually caught it. Restrictions are being placed on OKA translators. Some are getting blocked outright.

The scary part isn't that the AI hallucinated. We know models do that. The scary part is how long it went undetected.

The speed trap

When AI makes you 10x faster, the temptation is to produce 10x more output. Translate 10x more articles. Send 10x more emails. Generate 10x more reports. Ship 10x more code.

But your review capacity didn't scale 10x. Your quality checks didn't scale 10x. Your ability to catch errors didn't scale 10x.

So you have a 10x throughput increase with a 1x quality check. The math guarantees that errors slip through. Not occasionally. Regularly.

OKA translated thousands of articles. How many human reviewers checked those translations? Based on the results, not enough. Probably not even close.

Confidence without competence

The specific failure mode here is nasty. The AI didn't flag uncertain translations. It didn't say "I'm not sure about this source, please verify." It just invented a citation and presented it as fact. Same formatting. Same tone. Indistinguishable from the real ones.

This is the fundamental challenge with using language models for anything that requires factual accuracy. The model doesn't know what it doesn't know. It generates plausible output. Sometimes plausible and correct are the same thing. Sometimes they're not. And you can't tell the difference by looking at the output alone.

The only reliable way to catch these failures is structural. Build verification into the pipeline. Don't rely on the output looking correct.

What good AI oversight looks like

I've been thinking about this a lot in the context of deploying AI agents for businesses. Here's what I've landed on:

Separate generation from validation. The AI that produces the output should not be the only thing checking the output. Use a different model, a different prompt, or a human to verify. Same-model self-checks catch some errors but miss the systemic ones.
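As a minimal sketch of that separation (the function names and the citation lookup are hypothetical stand-ins, not a real translation API): the generator produces a draft, and a structurally separate step checks its citations against an independent source of truth before anything ships.

```python
# Sketch of a generate-then-validate pipeline. generate_translation and
# validate_citations are hypothetical stand-ins; the point is the shape:
# the validator is a separate step with its own inputs, not the generator
# grading its own work.

def generate_translation(article: str) -> dict:
    # Hypothetical generator: returns translated text plus the citations
    # it claims to have carried over from the source article.
    return {"text": f"[translated] {article}", "citations": ["doi:10.1000/x"]}

def validate_citations(citations: list[str], known_sources: set[str]) -> list[str]:
    # Independent check: flag any citation that cannot be resolved against
    # a source of truth (here a lookup set; in practice, a resolver service).
    return [c for c in citations if c not in known_sources]

def translate_with_oversight(article: str, known_sources: set[str]) -> dict:
    draft = generate_translation(article)
    unverified = validate_citations(draft["citations"], known_sources)
    draft["unverified_citations"] = unverified
    draft["needs_human_review"] = bool(unverified)
    return draft
```

A fabricated citation sails through the generator but fails the lookup, so it lands in a human review queue instead of in the published article.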

Build confidence scores into the workflow. Not every output needs the same level of scrutiny. Train your system to flag low-confidence outputs for human review. Let the high-confidence stuff flow through automatically. Focus human attention where it matters most.
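The routing itself can be trivial once a confidence score exists. A sketch, assuming each output carries a score in [0, 1] (the threshold here is illustrative, not a recommendation):

```python
# Confidence-based routing: high-confidence outputs flow through
# automatically, low-confidence ones queue for a human.

REVIEW_THRESHOLD = 0.85  # illustrative cutoff; tune against your error data

def route(outputs: list[dict]) -> tuple[list[dict], list[dict]]:
    auto, review = [], []
    for item in outputs:
        if item["confidence"] >= REVIEW_THRESHOLD:
            auto.append(item)       # publish without human review
        else:
            review.append(item)     # hold for human attention
    return auto, review
```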

Log everything. If you can't audit what your AI produced last Tuesday, you can't catch problems that take time to surface. Wikipedia's editors could only find the hallucinated sources because the edit history was preserved. Your AI workflow needs the same kind of trail.
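An append-only log doesn't need to be elaborate. A sketch, assuming a JSON-lines file (the record fields are one reasonable choice, not a standard): capture enough per output that a problem surfacing weeks later can be traced back to exactly what the model produced and when.

```python
# Append-only audit log for AI output, one JSON record per line.
# The content hash makes later tampering or truncation detectable.

import datetime
import hashlib
import json

def log_output(path: str, task: str, model: str, output: str) -> None:
    record = {
        "ts": datetime.datetime.now(datetime.timezone.utc).isoformat(),
        "task": task,
        "model": model,
        "output_sha256": hashlib.sha256(output.encode()).hexdigest(),
        "output": output,
    }
    with open(path, "a") as f:  # append-only: never rewrite history
        f.write(json.dumps(record) + "\n")
```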

Set error budgets. Decide in advance how many errors per thousand outputs are acceptable. Measure against that budget. When you exceed it, slow down and figure out why before scaling back up.
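The budget check can be a one-liner gating the pipeline. A sketch, assuming errors are counted from a sampled human review (the budget value is illustrative):

```python
# Error-budget gate: sampled human reviews feed an observed error rate,
# and scaling halts when that rate exceeds the agreed budget.

ERROR_BUDGET = 2 / 1000  # illustrative: at most 2 errors per 1,000 outputs

def within_budget(errors_found: int, outputs_sampled: int) -> bool:
    if outputs_sampled == 0:
        return True  # no review data yet; nothing to measure against
    return errors_found / outputs_sampled <= ERROR_BUDGET
```

When `within_budget` returns False, the right move per the rule above is to pause and diagnose, not to keep shipping while you investigate.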

The uncomfortable truth

AI makes smart people feel comfortable removing guardrails. "It's so good now, we don't need to check everything." That's the road to the Wikipedia situation.

The better the AI gets, the more subtle the errors become. GPT-3 hallucinations were obvious. GPT-5 hallucinations look like real citations with plausible URLs and author names.

Don't confuse "fewer obvious errors" with "fewer errors." Increase your scrutiny as your models improve, not the opposite. The errors that survive better models are the ones that are hardest to catch.