Meta's FAIR team released TRIBE v2 today, a major upgrade to its brain-encoding model that now predicts neural activity across roughly 70,000 individual brain voxels. That's a 70x jump from the original TRIBE, which operated on just 1,000 coarse brain parcels. The model draws on over 500 hours of fMRI recordings from more than 700 people, a massive expansion from the handful of subjects in the original paper.
Meta bills TRIBE v2 as a foundation model for brain encoding. It handles text, audio, and video inputs through a three-stage pipeline (encoding, integration, brain mapping) and supports zero-shot predictions for new subjects, new languages, and new tasks without requiring additional fMRI scans. Meta claims the model's synthetic brain signals are sometimes cleaner than real fMRI data, though that's a company-reported finding without independent validation yet.
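Meta has not published v2 architecture details, but the original TRIBE paper describes the same encode-integrate-map structure. A minimal PyTorch sketch of that pattern might look like the following; the module names, dimensions, and additive fusion are illustrative assumptions, not Meta's actual implementation.

```python
import torch.nn as nn

# Hypothetical sketch of a TRIBE-style three-stage pipeline:
#   1) encode each modality with features from a pretrained backbone,
#   2) integrate the modality streams over time,
#   3) map the fused representation to per-voxel fMRI predictions.
# All names and sizes here are illustrative, not Meta's API.

class BrainEncoder(nn.Module):
    def __init__(self, text_dim=2048, audio_dim=1024, video_dim=1280,
                 hidden_dim=1024, n_voxels=70_000):
        super().__init__()
        # Stage 1: project frozen backbone features into a shared space
        self.text_proj = nn.Linear(text_dim, hidden_dim)
        self.audio_proj = nn.Linear(audio_dim, hidden_dim)
        self.video_proj = nn.Linear(video_dim, hidden_dim)
        # Stage 2: integrate modalities across time with a transformer
        layer = nn.TransformerEncoderLayer(d_model=hidden_dim, nhead=8,
                                           batch_first=True)
        self.integrator = nn.TransformerEncoder(layer, num_layers=4)
        # Stage 3: linear readout to ~70k voxel time series
        self.brain_map = nn.Linear(hidden_dim, n_voxels)

    def forward(self, text_feats, audio_feats, video_feats):
        # Each input: (batch, time, feature_dim), already time-aligned
        fused = (self.text_proj(text_feats)
                 + self.audio_proj(audio_feats)
                 + self.video_proj(video_feats))
        fused = self.integrator(fused)
        return self.brain_map(fused)  # (batch, time, n_voxels)
```

The key design point, common to encoding models of this kind, is that the heavy lifting happens in frozen pretrained backbones; only the fusion and readout layers are trained against fMRI data.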
The predecessor won the Algonauts 2025 brain-modeling competition, placing first among 263 teams with a 1B-parameter architecture built on Llama 3.2, Wav2Vec2-BERT, and V-JEPA 2. The original model explained roughly 54% of the explainable variance in brain activity when tested on subjects watching Friends episodes and films such as The Wolf of Wall Street. TRIBE v2 scales that approach to what Meta calls a "digital twin" of neural activity.
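The "54% of explainable variance" figure refers to a noise-ceiling-normalized score: the model's prediction accuracy is divided by the best accuracy any model could reach given fMRI measurement noise. A rough NumPy sketch of that convention appears below, assuming a split-half noise ceiling; TRIBE's exact scoring code may differ.

```python
import numpy as np

# Sketch of noise-ceiling-normalized scoring, a common convention in
# the encoding literature. Assumes repeated presentations of the same
# stimulus are available to estimate the ceiling.

def noise_ceiling(repeats: np.ndarray) -> np.ndarray:
    """Per-voxel ceiling: correlation between the means of two halves
    of the repeats. repeats: (n_reps, time, voxels)."""
    half = repeats.shape[0] // 2
    a = repeats[:half].mean(axis=0)   # (time, voxels)
    b = repeats[half:].mean(axis=0)
    a = (a - a.mean(0)) / (a.std(0) + 1e-8)
    b = (b - b.mean(0)) / (b.std(0) + 1e-8)
    return (a * b).mean(axis=0)       # Pearson r per voxel

def explainable_fraction(pred, target, ceiling):
    """Model-vs-data correlation divided by the noise ceiling, so a
    score of 1.0 means the model is as good as the data allows."""
    p = (pred - pred.mean(0)) / (pred.std(0) + 1e-8)
    t = (target - target.mean(0)) / (target.std(0) + 1e-8)
    r = (p * t).mean(axis=0)
    return np.clip(r / np.clip(ceiling, 1e-6, None), 0, None)
```

On this convention, a score of 0.54 means the model captures roughly half of the signal that survives fMRI measurement noise.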
A live demo is available now. The v1 source code is on GitHub, and Meta says v2 is also open-sourced, though a dedicated v2 paper has not appeared on arXiv yet. Applications range from brain-computer interfaces to in silico neuroscience experiments, but commercial use cases remain speculative.
Bottom Line
TRIBE v2 scales brain prediction from 1,000 parcels to 70,000 voxels using fMRI data from 700+ people, with zero-shot generalization to new subjects.
Quick Facts
- ~70,000 brain voxels predicted (up from 1,000 parcels in v1)
- 500+ hours of fMRI recordings from 700+ people
- Zero-shot prediction for new subjects, languages, and tasks
- TRIBE v1 won Algonauts 2025, first among 263 teams
- TRIBE v1 used Llama 3.2, Wav2Vec2-BERT, and V-JEPA 2 as backbone models