Trust / Jun 22, 2026 / 6 min

Chatbots Are Training the Chatbots

Whistleblowers tell New Scientist that gig workers paid to produce high-quality human training data are secretly using ChatGPT to do the job — threatening the data pipeline frontier labs are betting their IPOs on.

Thesis: Frontier AI's quality story rests on human feedback data, but the gig economy built to supply it incentivizes bots training bots, and nobody is auditing the rot before the listings price in perfection.

Workers paid to supply the high-quality human conversations that train frontier AI models are secretly using ChatGPT to do the job — and multiple whistleblowers told New Scientist the cheating is widespread across the industry.

This is not a scandal about lazy freelancers. It is a structural threat to the RLHF pipeline that OpenAI, Anthropic, and Google claim makes their next models smarter.

Why this matters now:

Labs scraped the open internet for their first training runs. Now they need curated human judgment — conversations, tests, preference rankings — to push models past the data wall.
That work is outsourced to third-party platforms like Outlier, owned by Scale AI, which claims clients including Meta and Cisco on its website.
Workers are often gig contractors on low pay, short contracts, and tools like Hubstaff that screenshot desktops at random intervals.
The economics incentivize cheating: finish faster, get paid, move to the next project.

What the whistleblowers said:

A worker called Alice* says using chatbots is "very widespread." Every company she worked for had explicit anti-cheating rules, but she says she doesn't "think they can stop it."
Alice told ChatGPT to avoid telltale AI hallmarks like em-dashes: "It's only the sloppiest of users that get caught."
Her verdict on the labs: "If these companies want quality data, then they should offer quality contracts."
Bob*, promoted to leadership at Outlier after illicitly using AI himself, said he caught workers with ChatGPT open in other tabs, or with folders on their desktops literally named after AI tools.
Carol*, who now uses one LLM to draft scenarios and another to build files, said: "I do worry that I'm actually making it worse."
None of the companies named (Outlier, Scale AI, Meta, Cisco, and Google) responded to New Scientist.

The science:

A 2024 Nature paper described "model collapse": when models train on recursively generated AI content, rare facts disappear first and outputs eventually converge toward nonsense.
Mark Lee at the University of Birmingham told New Scientist that recursive AI-on-AI training can collapse model abilities dramatically — researchers sometimes call it "AI cannibalism" or "AI inbreeding."
Lee's critical nuance: catastrophe is unlikely today because some genuine human data still enters the pipeline. "If you have like 10 per cent human data, it mitigates it."
But even partial contamination degrades performance on human-like tasks: "The AI isn't as good at doing human-like tasks. It's an issue, because I think the models aren't as good as they could be."
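The mechanism Lee describes can be illustrated with a toy simulation. The sketch below is not an LLM experiment: it assumes a one-dimensional Gaussian as a stand-in for "human" data, refits a model to its own samples each generation, and compares a run with no human data against one with the 10 per cent share Lee mentions. The function name, sample size, and generation count are illustrative assumptions, not figures from the New Scientist report or the Nature paper.

```python
import random
import statistics


def simulate(generations, sample_size, human_fraction, seed=0):
    """Repeatedly refit a Gaussian to data drawn mostly from its own output.

    Returns the fitted standard deviation after each generation.
    """
    rng = random.Random(seed)
    true_mu, true_sigma = 0.0, 1.0   # assumed stand-in for the human data distribution
    mu, sigma = true_mu, true_sigma  # the first model matches the human data exactly
    n_human = round(sample_size * human_fraction)
    history = []
    for _ in range(generations):
        # Synthetic samples come from the previous generation's model.
        synthetic = [rng.gauss(mu, sigma) for _ in range(sample_size - n_human)]
        # Fresh human samples always come from the true distribution.
        human = [rng.gauss(true_mu, true_sigma) for _ in range(n_human)]
        data = synthetic + human
        # Refit the model to the mixed data (maximum-likelihood estimates).
        mu = statistics.fmean(data)
        sigma = statistics.pstdev(data)
        history.append(sigma)
    return history


collapsed = simulate(generations=500, sample_size=20, human_fraction=0.0)
mixed = simulate(generations=500, sample_size=20, human_fraction=0.1)
print(f"no human data:  final spread = {collapsed[-1]:.2e}")
print(f"10% human data: final spread = {mixed[-1]:.2f}")
```

In this toy setup the spread of the purely recursive model shrinks toward zero, the statistical analogue of rare facts disappearing, while the mixed run keeps a spread well above zero. It says nothing about how large the effect is in production models.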

Who's exposed:

OpenAI and Anthropic are racing toward IPOs that will price "continuous improvement" as a given.
GPT-5.6 and the next Claude generation depend on human-feedback data that labs cannot fully audit.
The industry spent $43.3 million on congressional races this cycle, but none of that buys provenance for the training data inside the models.

Why labs can't catch it:

Screenshot monitoring catches the sloppy. It does not catch workers who instruct chatbots to mimic human style.
Third-party outsourcing means labs inherit a supply chain they do not control — the same opacity that made npm a geopolitical attack surface last week.
Without cryptographic provenance or randomized live verification, "human quality" is an honor system priced like a commodity.
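As a rough illustration of what those two controls could look like together, the sketch below signs each submission record with an HMAC and then draws a random subset for live re-verification. Everything here is hypothetical: the record fields, the function names, the audit rate, and the idea that a platform-held key signs records inside a monitored session are assumptions, not a description of how Outlier, Scale AI, or any lab works. Note the limit: a valid tag only shows that a record was issued by the key holder and not altered afterwards. It cannot show that a human wrote the text, which is why the random live check is still needed.

```python
import hashlib
import hmac
import json
import random


def sign_record(key: bytes, worker_id: str, session_id: str, text: str) -> dict:
    """Build a submission record and attach an HMAC-SHA256 tag (hypothetical schema)."""
    record = {
        "worker_id": worker_id,
        "session_id": session_id,
        "sha256": hashlib.sha256(text.encode("utf-8")).hexdigest(),
    }
    payload = json.dumps(record, sort_keys=True).encode("utf-8")
    record["tag"] = hmac.new(key, payload, hashlib.sha256).hexdigest()
    return record


def verify_record(key: bytes, record: dict, text: str) -> bool:
    """Check that the text matches the record and the record matches its tag."""
    claimed = {k: v for k, v in record.items() if k != "tag"}
    if claimed.get("sha256") != hashlib.sha256(text.encode("utf-8")).hexdigest():
        return False
    payload = json.dumps(claimed, sort_keys=True).encode("utf-8")
    expected = hmac.new(key, payload, hashlib.sha256).hexdigest()
    # Constant-time comparison avoids leaking how much of the tag matched.
    return hmac.compare_digest(expected, record.get("tag", ""))


def pick_for_live_check(records, rate, rng):
    """Select each record independently with probability `rate` for a live re-check."""
    return [r for r in records if rng.random() < rate]


key = b"demo-key-not-for-production"  # assumption: held by the platform, never by workers
text = "Example conversation written during a monitored session."
record = sign_record(key, "worker-001", "session-042", text)
print(verify_record(key, record, text))             # untouched submission
print(verify_record(key, record, text + " edit"))   # text changed after signing
audit = pick_for_live_check([record] * 100, rate=0.05, rng=random.Random())
print(f"{len(audit)} of 100 records flagged for live verification")
```

The random draw matters as much as the signature: if workers cannot predict which tasks will be re-checked live, the cost of cheating applies to every task rather than only to the sloppy ones.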

Convina's view: The frontier labs sell a progress story built on human judgment — then outsource that judgment to underpaid contractors racing the clock. The result is not model collapse tomorrow; it is a slow poisoning of the one input labs cannot synthesize. IPO decks will boast of RLHF scale. Nobody will show you the audit trail. Until provenance becomes a procurement requirement, "human in the loop" is marketing — not a control.

Research Signals

https://www.newscientist.com/article/2531050-people-training-new-ai-models-admit-they-just-get-chatbots-to-do-it/
https://www.nature.com/articles/s41586-024-07566-y
https://www.ibm.com/think/topics/model-collapse