A New Plugin Uses Wikipedia's AI Writing Tells to Help Claude Sound Human
The best guide to spotting AI writing on the internet has been repurposed as a manual for hiding it. Over the weekend, tech entrepreneur Siqi Chen released an open source plugin that feeds Wikipedia's meticulously crowdsourced list of AI tells directly to Anthropic's Claude—instructing it to simply not do those things.
The plugin, called Humanizer, has already picked up over 1,600 GitHub stars in roughly two days. Its premise is almost comically simple: take the 24 language and formatting patterns that Wikipedia editors have identified as chatbot giveaways, and tell Claude to avoid every single one of them.
"It's really handy that Wikipedia went and collated a detailed list of 'signs of AI writing,'" Chen wrote on X. "So much so that you can just tell your LLM to... not do that."
Years of Crowdsourced Detection, Inverted in a Weekend
The source material comes from WikiProject AI Cleanup, a volunteer effort founded by French Wikipedia editor Ilyas Lebleu in late 2023. The project emerged as editors noticed a growing flood of AI-generated articles—some obvious, others subtle—polluting the encyclopedia's corpus. Volunteers began hunting these articles, eventually tagging over 500 for review.
In August 2025, the project formalized its findings into a comprehensive guide: "Wikipedia:Signs of AI writing." The document catalogs the verbal tics, structural patterns, and word choices that give AI away. Think: overuse of "delve," "tapestry," and "multifaceted"; unnecessary summarization at paragraph ends; an odd preference for em-dashes; the strange compulsion to begin sentences with "It's worth noting that."
These patterns weren't discovered through sophisticated statistical analysis. They emerged from thousands of hours of volunteer editors reading bad prose and noticing what made it feel off. The guide represents genuine crowdsourced intelligence—a collective taxonomy of what makes AI writing sound like AI writing.
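That taxonomy is concrete enough to be mechanized. As a toy illustration (the phrase list below is a tiny, hypothetical subset of the guide, and real cleanup work relies on editorial judgment rather than string matching), the documented tells translate directly into a checklist a script can count:

```python
import re

# A few surface tells of the kind the guide flags; illustrative subset only.
TELLS = {
    "delve": r"\bdelve\b",
    "tapestry": r"\btapestry\b",
    "multifaceted": r"\bmultifaceted\b",
    "worth noting": r"\bit'?s worth noting that\b",
    "em-dash": r"—",
}

def flag_tells(text: str) -> dict[str, int]:
    """Count how often each flagged pattern appears in the text."""
    return {name: len(re.findall(pattern, text, flags=re.IGNORECASE))
            for name, pattern in TELLS.items()}

sample = "It's worth noting that this multifaceted tapestry—rich and varied—invites us to delve deeper."
print(flag_tells(sample))
# {'delve': 1, 'tapestry': 1, 'multifaceted': 1, 'worth noting': 1, 'em-dash': 2}
```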
And now it's a system prompt.
The Arms Race Gets Recursive
What Chen has done isn't technically novel. People have been prompt-engineering AI to sound more human since ChatGPT launched. But the Humanizer plugin represents something more interesting: the direct weaponization of detection research for evasion purposes.
This creates an uncomfortable feedback loop. Detection efforts produce specific, actionable criteria. Those criteria become training data for evasion. Detection must then identify new patterns. Those patterns get documented and shared. The cycle repeats.
The fundamental problem is that stylistic detection approaches rely on AI having consistent quirks—predictable patterns that emerge from training data and RLHF. But unlike watermarks or cryptographic signatures embedded at the model level, stylistic tells can be addressed through prompting alone. No model weights need to change. No fine-tuning required. Just instructions.
Wikipedia's guide, comprehensive as it is, has become a checklist for what Claude should avoid saying. Every pattern documented is a pattern that can be suppressed.
Why This Matters Beyond Wikipedia
The immediate impact is on platforms trying to maintain human-written content. Wikipedia's volunteer editors now face AI text that has been specifically prompted to evade the detection patterns they've spent years identifying. Academic institutions using similar stylistic markers face the same challenge. Any organization relying on "vibes-based" detection—recognizing AI by how it sounds—is increasingly vulnerable.
This doesn't mean stylistic detection is worthless. But it suggests that static detection criteria have a shelf life. The patterns Wikipedia documented were characteristic of AI writing circa 2023-2024. Models have evolved. Prompting techniques have evolved. And now, evasion has been automated.
The deeper issue is who has access to what. Chen's plugin works with Claude Code, Anthropic's coding assistant. It's open source, meaning anyone can use it, modify it, or port it to other models. The barrier to sophisticated evasion has dropped to "can you follow instructions on GitHub."
What Detection Approaches Remain?
Stylistic analysis isn't the only game in town. Several approaches remain more robust:
- Watermarking: Embedding statistical patterns at the token-generation level. Google has deployed SynthID text watermarking in Gemini, and OpenAI has described a similar system for GPT models without releasing it. But watermarking requires control over the model, and open-weight models can't be reliably watermarked after release.
- Provenance tracking: Cryptographic systems like C2PA that track content origin. Useful for images and video; harder to apply to text.
- Behavioral analysis: Looking at writing patterns over time, revision histories, consistency across documents. Harder to game with a single prompt.
- Probabilistic detection: Tools that analyze token probability distributions rather than surface patterns (a rough sketch follows below). Still an active research area, still imperfect.
None of these are silver bullets. Watermarking requires model providers to implement it and users not to circumvent it. Provenance requires adoption across platforms. Behavioral analysis requires historical data. Probabilistic detection produces false positives.
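To make that last approach concrete: perplexity-style detectors score how predictable a text is under a language model, on the theory that machine-generated prose tends to be unusually predictable. The sketch below is a toy version of that idea; GPT-2 is used only because it is small and open, and the threshold is an arbitrary placeholder, not a calibrated cutoff.

```python
# Toy perplexity check: unusually low perplexity under a language model is weak
# evidence that text is machine-generated. GPT-2 and the 30.0 threshold are
# illustrative choices, not what production detectors use.
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

def perplexity(text: str) -> float:
    """Average per-token perplexity of `text` under GPT-2."""
    enc = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        # Passing labels makes the model return the mean cross-entropy loss.
        loss = model(**enc, labels=enc["input_ids"]).loss
    return float(torch.exp(loss))

def looks_generated(text: str, threshold: float = 30.0) -> bool:
    return perplexity(text) < threshold

print(perplexity("The committee will meet on Tuesday to review the budget."))
```

This is also exactly the class of detector that flags formulaic human prose as machine-written, which is where the false positives come from.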
The Honesty Problem
There's a philosophical question lurking here: should AI be easily detectable? Some argue yes—transparency matters, and people have a right to know when they're reading machine-generated text. Others argue that if AI can help people write better, the provenance shouldn't matter, only the quality.
Chen's plugin doesn't take a position on this. It's a tool. Tools get used for whatever purposes people have. Students will use it to evade plagiarism detection. Marketers will use it to produce more convincing copy. Some people will use it simply because they find AI's default prose style annoying.
The Wikipedia editors who created the original detection guide are volunteers trying to maintain the integrity of a public resource. Their work has been turned into an evasion tool in approximately the time it takes to write a GitHub README. This isn't a failure of their effort—it's a structural feature of how detection and evasion evolve together.
Where This Goes
The Humanizer plugin is a proof of concept, not an endpoint. Expect forks, extensions, and integrations with other models. Expect Wikipedia's WikiProject AI Cleanup to update its detection patterns—and expect those updates to be incorporated into evasion tools in turn.
The question isn't whether AI-generated text can be made undetectable through stylistic mimicry. The answer is increasingly yes, at least for casual inspection. The question is what happens to institutions and platforms that depend on being able to tell the difference.
Wikipedia will keep trying. Schools will keep trying. Hiring managers and journal editors and content platforms will keep trying. But the asymmetry is clear: detection requires sustained effort and expertise; evasion now requires a plugin.
This article was ultrathought.