AI Research

ETH Zurich Researchers Show LLMs Can Unmask Anonymous Online Users at Scale

A new paper demonstrates LLMs identifying 67% of pseudonymous Hacker News users with 90% precision.

Liza Chan, AI & Emerging Tech Correspondent
February 26, 2026

Researchers from ETH Zurich and Anthropic have published a paper showing that large language models can identify anonymous internet users from their posts alone, correctly matching two-thirds of pseudonymous Hacker News accounts to real-world identities in one test. The research paper, posted to arXiv on February 18, describes an automated pipeline that extracts identity signals from unstructured text, searches for candidate matches, and reasons its way to a conclusion.

The cost for the entire Hacker News experiment, which targeted 338 users? Around $2,000. Roughly six dollars a person.

How it works (and why old methods failed)

The pipeline has three core stages. First, an LLM summarizes a user's post history into structured identity features: where they seem to live, what they do for work, niche interests, conferences mentioned. Then semantic embeddings narrow the field to roughly 100 candidate matches. Finally, a reasoning step evaluates those candidates and either picks one or abstains.
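The paper's code is unreleased, so the control flow above can only be sketched with stand-ins. In the toy version below, the LLM summarization, semantic embeddings, and reasoning step are all replaced with trivial keyword heuristics; every function name and profile field is hypothetical, and only the three-stage shape (extract, shortlist, decide-or-abstain) mirrors the paper's description.

```python
# Hypothetical sketch of the three-stage pipeline. Nothing here comes from
# the paper's actual (unreleased) code; the heuristics are illustrative only.

def extract_features(posts):
    # Stage 1 (stand-in for LLM summarization): pull structured identity
    # signals -- location, occupation, niche interests -- out of raw text.
    text = " ".join(posts).lower()
    features = {}
    if "zurich" in text:
        features["location"] = "zurich"
    if "compiler" in text:
        features["occupation"] = "compiler engineer"
    return features

def overlap(features, candidate):
    # Count how many extracted signals match a candidate profile.
    return sum(1 for k, v in features.items() if candidate.get(k) == v)

def shortlist(features, candidates, k=100):
    # Stage 2 (stand-in for embedding search): narrow to the top-k candidates.
    return sorted(candidates, key=lambda c: overlap(features, c), reverse=True)[:k]

def identify(posts, candidates, min_signals=2):
    # Stage 3 (stand-in for the reasoning step): pick the best candidate,
    # or abstain (return None) when the evidence is too thin.
    features = extract_features(posts)
    ranked = shortlist(features, candidates)
    if ranked and overlap(features, ranked[0]) >= min_signals:
        return ranked[0]["name"]
    return None
```

The abstention branch in `identify` is the part the paper leans on: returning `None` when too few signals line up is what buys the high precision on the guesses the system does make.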

That last part matters. The system's confidence calibration is what makes the numbers so stark. Of the 338 Hacker News users tested, the agent correctly identified 226, a 67% recall rate. When it chose to make a guess rather than abstain, it was right 90% of the time: only 25 identifications were wrong, and the remaining 87 times the system said nothing, which is arguably the smart move.
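The two headline figures follow directly from those counts: recall divides the hits by every targeted user, while precision divides them only by the cases where the agent actually committed to a guess. A quick check:

```python
total = 338   # Hacker News users targeted
hits = 226    # correct identifications
misses = 25   # wrong identifications

recall = hits / total               # 226 / 338 ≈ 0.67
precision = hits / (hits + misses)  # 226 / 251 ≈ 0.90

print(f"recall={recall:.0%} precision={precision:.0%}")
# prints: recall=67% precision=90%
```

The gap between the two numbers is entirely the abstentions: every time the system declines to answer, recall takes the hit and precision is spared.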

Classical deanonymization techniques, like those from the famous Netflix Prize attack, scored near zero recall across every test the researchers ran. The comparison is almost unfair. Those older methods relied on structured data and manual feature engineering. LLMs just read what you wrote and figure it out.

Reddit and redacted transcripts

The Hacker News results are the headline number, but the Reddit experiments might be more unsettling for the average user. The researchers split individual Reddit accounts into temporal halves (posts before and after a cutoff date) and tried to link them back together. Even with a full year of changed interests and different conversations separating the two halves, the system achieved up to 68% recall at 90% precision, according to the paper's abstract. Netflix-style baselines? Near zero again.
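The account-linking task can be pictured with a toy version: embed each half of each account's history and match halves by similarity. The sketch below substitutes bag-of-words vectors for the paper's semantic embeddings (an assumption; the authors' actual method is not public), but the matching logic is the same idea.

```python
from collections import Counter
import math

def embed(posts):
    # Stand-in for a semantic embedding: bag-of-words term frequencies.
    return Counter(w for p in posts for w in p.lower().split())

def cosine(a, b):
    # Cosine similarity between two sparse word-count vectors.
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def link_halves(first_halves, second_halves):
    # Match each account's earlier posts to the most similar later posts.
    vecs2 = {name: embed(posts) for name, posts in second_halves.items()}
    return {
        name: max(vecs2, key=lambda n: cosine(embed(posts), vecs2[n]))
        for name, posts in first_halves.items()
    }
```

Even this crude version links accounts whose vocabulary persists across the cutoff; the paper's result is that LLM-derived features survive a full year of drift in interests, where word-level matching would start to fail.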

Then there's the real-world test. Anthropic's Interviewer dataset contains anonymized conversations with 125 scientists about their use of AI. The transcripts were redacted specifically to protect identity. The agent still managed to re-identify 9 of them, as lead author Simon Lermen detailed in a blog post. That's a lower success rate, sure, but these were transcripts someone actively tried to anonymize. And the researchers caveat that the number relies on manual verification, since no ground truth exists for that dataset.

Why guardrails won't help much

"We demonstrate that large language models fundamentally change this calculus, enabling fully automated deanonymization attacks that operate on unstructured text at scale," the researchers write. That phrasing is dry, but the implication is not. Lermen, an AI engineer at MATS Research, was more direct in his post: the pipeline's subtasks all look innocent in isolation. Summarize a profile. Compute embeddings. Rank candidates. No single API call screams surveillance.

The researchers themselves sound pessimistic about defenses. Refusal guardrails can be bypassed through task decomposition, and no model-side mitigation applies to open-source models at all. Platform-level rate limiting and scraping detection can raise costs, but the underlying capability exists and gets cheaper with every model generation.

The paper notes that the attack scales gracefully as candidate pools grow into the tens of thousands; The Register reported on the research the same day it was posted. The team's scaling analysis suggests these methods could extend to entire platforms given sufficient compute.

So what now

Lermen's advice to pseudonymous users is blunt: each piece of specific information you share, your city, your job, a conference you attended, a niche hobby, narrows down who you could be. The combination is often a unique fingerprint. If a team of dedicated investigators could figure out your identity from your posts, an LLM agent probably can too. And the cost keeps dropping.
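The fingerprint point is just multiplication. With made-up but plausible selectivities (round numbers assumed for illustration, and treating the attributes as independent, which real attributes rarely are), a few casual disclosures shrink a platform-sized pool to almost nothing:

```python
# Back-of-the-envelope: how independent attribute disclosures multiply.
# All figures below are assumptions for illustration, not from the paper.
pool = 5_000_000  # hypothetical platform user base
signals = {
    "lives in a particular mid-size city": 0.01,   # 1 in 100
    "works on compilers": 0.001,                   # 1 in 1,000
    "attended one niche conference": 0.02,         # 1 in 50
}
for label, fraction in signals.items():
    pool *= fraction
    print(f"after '{label}': ~{pool:,.0f} candidates")
```

Three disclosures take five million users down to roughly one. Real attributes correlate (compiler engineers cluster in certain cities), so the arithmetic is only suggestive, but the direction is what matters: each detail divides, and the divisions compound.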

The paper was approved by ETH Zurich's Ethics Review Board. The researchers chose not to release code or data, citing the risk of enabling exactly the kind of attacks they describe. That restraint is worth noting, though it won't stop anyone determined to replicate the approach from doing so with publicly available models.

The six co-authors, spanning MATS Research, ETH Zurich, and Anthropic, plan to continue studying defenses. For now, the practical takeaway is uncomfortable: persistent pseudonyms on platforms with public post histories are weaker protection than most people assume.

