Researchers at UNSW Sydney and the Australian National University gave 125 people a simple task: look at a face and say whether it is real or generated by AI. The results, published this month in the British Journal of Psychology, are bleak for anyone who still thinks they can eyeball the difference. Participants with average face-recognition ability performed only slightly better than chance. Even so-called super-recognizers, the roughly 2% of the population with exceptional face-processing skills, managed only a slim margin better.
"The faces created by the most advanced face-generation systems aren't so easily detectable anymore," says Dr James Dunn, a psychologist at UNSW who led the study. That's a polite way of putting it. The more direct reading: we've lost the arms race, and most of us don't realize it yet.
The confidence gap
What struck the researchers wasn't just the low accuracy. It was how confident people remained anyway. Participants consistently overestimated their own ability to spot fakes, clinging to detection strategies that worked against older, cruder generators. The old tells (mangled ears, teeth that dissolved into gums, backgrounds that bled weirdly into hair) are largely gone from the best current systems. But people still look for them.
"A lot of people think they can still tell the difference because they've played with popular AI tools like ChatGPT or DALL-E," says Dr Amy Dawel, a psychologist at ANU and co-author on the paper. Those consumer-facing tools, she points out, don't reflect what the most advanced face-generation pipelines can produce. And that mismatch creates a dangerous overconfidence.
The study recruited 36 super-recognizers alongside 89 controls. Super-recognizers outperformed the control group by about 15%, and by 7% compared to a subset of higher-performing, motivated controls (Cohen's d = 0.55). Those numbers sound decent until you consider what super-recognizers normally do: in tests involving real human faces, they consistently blow everyone else out of the water. Here, the advantage was modest. And there was enough overlap between groups that some ordinary participants actually outscored super-recognizers.
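For readers unused to effect sizes, the reported Cohen's d = 0.55 is just the gap between the two groups' mean scores divided by their pooled standard deviation. A minimal sketch of that calculation, using made-up accuracy numbers for illustration (the paper's actual group means and SDs are not given here):

```python
import math

def cohens_d(mean1, sd1, n1, mean2, sd2, n2):
    """Cohen's d using the pooled standard deviation of two groups."""
    pooled_var = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2)
    return (mean1 - mean2) / math.sqrt(pooled_var)

# Hypothetical numbers only -- NOT taken from the study.
d = cohens_d(mean1=62.0, sd1=12.0, n1=36,   # e.g. super-recognizer accuracy (%)
             mean2=55.0, sd2=13.0, n2=89)   # e.g. control accuracy (%)
print(round(d, 2))  # -> 0.55
```

The point of the sketch: a roughly 7-point accuracy gap with spreads in the low teens lands around d = 0.55, which statisticians usually call a medium effect, real, but with heavy overlap between groups, exactly the overlap that let some ordinary participants outscore super-recognizers.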
Perfection as the tell
Here's the paradox the paper surfaces. The reason AI faces fool us isn't because they replicate human imperfection convincingly. It's because they skip imperfection entirely. Synthetic faces tend to be unusually symmetrical, well-proportioned, and statistically average in their features. Our brains read those qualities as attractiveness and trustworthiness, not as red flags.
"It's almost as if they're too good to be true as faces," Dawel says, which is a nice line and also the paper's title.
This finding echoes work from 2022 by Nightingale and Farid, whose PNAS paper showed that StyleGAN2-generated faces were rated more trustworthy than photographs of real people. The new UNSW study extends that finding with a different generator (current-generation models, not StyleGAN2) and by adding super-recognizers to the participant pool, which hadn't been done before for this kind of test.
But I find myself wondering about the ecological validity. The study screened out faces with obvious visual flaws before showing them to participants. In the wild, plenty of AI-generated faces still arrive with artifacts. The question isn't just "can the best outputs fool people" but "what fraction of outputs are that good, and is it increasing?" The paper doesn't address that, and it would change the practical threat assessment.
What super-recognizers actually notice
The individual differences data is the most interesting part of the paper, and the part most coverage has skipped. Super-recognizers who did well weren't using the old artifact-hunting strategies. They showed greater sensitivity to the hyper-average quality of AI faces. They could detect, at some level, that a face was too typical, too balanced. Dunn speculates there may be a distinct ability here, separate from traditional face recognition, and the team wants to investigate whether "super-AI-face-detectors" exist as their own category.
A separate study published in November 2025 in Royal Society Open Science by Gray et al. found that five minutes of targeted training improved detection accuracy in super-recognizers specifically, by teaching them to spot rendering artifacts. But that training relied on GAN-specific flaws. As generators improve, those flaws disappear, and the training becomes obsolete. It is a treadmill.
So now what?
The practical upshot is uncomfortable. Visual judgment alone can no longer distinguish real from synthetic faces in controlled conditions, and the gap will only widen. This matters for hiring (fake LinkedIn profiles), romance scams, identity verification, and any context where seeing a face photograph is supposed to mean something.
"There needs to be a healthy level of scepticism," Dunn says. "For a long time, we've been able to look at a photograph and assume we're seeing a real person. That assumption is now being challenged."
If you want to test your own ability, UNSW has put an AI-face demo online. The average score is around 11 out of 20 correct identifications. Don't expect to do much better. And if you do, Dunn's team would like to hear from you.
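How much better than chance is 11 out of 20? With twenty two-option trials, pure guessing averages 10 correct, and a quick binomial calculation (a back-of-the-envelope sketch, not the study's analysis) shows that a single score of 11 is statistically indistinguishable from coin-flipping:

```python
from math import comb

def p_at_least(k, n=20, p=0.5):
    """P(X >= k) for a binomial(n, p) guesser: chance of scoring k or better."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

# Probability a pure guesser scores 11/20 or better on the demo.
print(round(p_at_least(11), 3))  # -> 0.412
```

In other words, a blind guesser beats the demo's average score about 41% of the time. The small real signal in human judgments only emerges when you aggregate across many participants, which is precisely why individual confidence in spotting fakes is so misplaced.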