
Tribev2 Brain Encoding
Run Meta TRIBE v2 in your agent workflow to predict fMRI cortical responses from video, audio, or text stimuli for in-silico neuroscience experiments.
Overview
TRIBE v2 Brain Encoding is an agent skill for the Build phase that runs Meta’s TRIBE v2 multimodal model to predict fMRI brain responses from video, audio, and text stimuli.
Install
npx skills add https://github.com/aradotso/trending-skills --skill tribev2-brain-encoding
What is this skill?
- Unifies LLaMA 3.2 (text), V-JEPA2 (video), and Wav2Vec-BERT (audio) into one Transformer brain-encoding stack
- Maps multimodal features to fsaverage5 cortical surface (~20k vertices) for fMRI-style prediction
- Pretrained weights via HuggingFace (`facebook/tribev2`) with optional plotting and full training extras
- Quick path: `TribeModel.from_pretrained`, `get_events_dataframe` from video paths, then predict brain responses
- Install tiers: inference-only, `[plotting]` (PyVista/Nilearn), `[training]` (Lightning, W&B)
- Three install tiers: inference-only, plus the `[plotting]` and `[training]` dependency groups
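The "~20k vertices" figure follows from the fsaverage5 mesh geometry. As a minimal sketch: each hemisphere of an order-n icosahedral FreeSurfer mesh has 10 * 4^n + 2 vertices, so fsaverage5 (order 5) has 10,242 per hemisphere and 20,484 in total. The helper names below are hypothetical, and the left-then-right vertex ordering is an assumption that the skill's documentation does not confirm.

```python
def vertices_per_hemisphere(order: int) -> int:
    """Vertex count of one hemisphere of an order-`order` icosahedral mesh."""
    return 10 * 4 ** order + 2


FSAVERAGE5_ORDER = 5
N_HEMI = vertices_per_hemisphere(FSAVERAGE5_ORDER)  # 10242
N_TOTAL = 2 * N_HEMI                                # 20484, the "~20k" in the text


def split_hemispheres(frame):
    """Split one predicted frame (a 1-D sequence) into (left, right).

    ASSUMPTION: vertices are stored left hemisphere first, then right.
    Check this against the model's own documentation before relying on it.
    """
    if len(frame) != N_TOTAL:
        raise ValueError(f"expected {N_TOTAL} vertices, got {len(frame)}")
    return frame[:N_HEMI], frame[N_HEMI:]
```

Because the split uses plain slicing, it should work equally on a Python list or on a single row of a NumPy array such as `preds[0]`.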
Adoption & trust: 711 installs on skills.sh; 31 GitHub stars; 3/3 security scanners passed (skills.sh audits).
What problem does it solve?
You need cortical activity predictions from naturalistic media but don’t want to wire LLaMA, V-JEPA2, Wav2Vec-BERT, and surface mapping yourself.
Who is it for?
Indie ML or neuroscience builders prototyping encoding models, demoing multimodal→brain maps, or scripting tribev2 inference inside Claude Code or Cursor.
Skip if: Teams with no GPU/Python ML stack, builders shipping a generic SaaS with no neuroimaging or encoding research, or users who only need spreadsheet or compliance audits.
When should I use this skill?
Use it when a request involves any of the following: predicting brain responses to video, fMRI encoding models, TRIBE v2 brain prediction, multimodal brain encoding, in-silico neuroscience, predicting cortical activity from video, or tribev2 inference and training.
What do I get? / Deliverables
After the skill runs, your agent can load pretrained TRIBE v2, build stimulus event frames from files, and produce brain-encoding predictions with documented install and training options.
- Runnable TRIBE v2 inference script or notebook flow
- Event dataframe and predicted cortical response outputs
- Optional brain visualization when plotting extras are installed
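A runnable inference script of the kind listed above might look like the following sketch. The `TribeModel` and `plot_brain_surface` calls mirror the SKILL.md quick start. The command-line flags, the `preds.npy` output name, the `plotting_available` helper, and the conversion of `preds` to a NumPy array are assumptions made for illustration.

```python
"""Sketch of a TRIBE v2 inference script (assumptions flagged inline)."""
import argparse
import importlib.util


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(
        description="Predict cortical responses with TRIBE v2"
    )
    parser.add_argument("video", help="path to an .mp4 stimulus")
    parser.add_argument("--cache", default="./cache",
                        help="local cache folder for HuggingFace weights")
    parser.add_argument("--out", default="preds.npy",
                        help="output file (hypothetical name)")
    parser.add_argument("--plot", action="store_true",
                        help="plot the first timepoint if plotting extras exist")
    return parser


def plotting_available() -> bool:
    # ASSUMPTION: the [plotting] extra can be detected by the presence of nilearn.
    return importlib.util.find_spec("nilearn") is not None


def main(argv=None) -> None:
    args = build_parser().parse_args(argv)

    # Imported lazily so argument parsing works without the heavy dependencies.
    import numpy as np
    from tribev2 import TribeModel

    model = TribeModel.from_pretrained("facebook/tribev2", cache_folder=args.cache)
    events = model.get_events_dataframe(video_path=args.video)
    preds, segments = model.predict(events=events)

    # ASSUMPTION: preds converts cleanly to a NumPy array.
    np.save(args.out, np.asarray(preds))

    if args.plot and plotting_available():
        from tribev2.plotting import plot_brain_surface
        plot_brain_surface(preds[0], backend="nilearn")
```

Calling `main(["clip.mp4", "--plot"])` from an agent session would run the whole flow. The lazy imports keep the parser usable on machines where the model is not installed.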
Journey fit
Neuro encoding pipelines are built and run as specialized ML tooling once you have stimuli and a hypothesis; they are not a journey-wide planning ritual. The agent-tooling category fits because the work consists of loading HuggingFace weights, building event dataframes, and invoking inference or training from a coding agent.
How it compares
Use it as a packaged neuro-ML inference skill, not as a general-purpose web API integration or a financial spreadsheet checker.
Common Questions / FAQ
Who is tribev2-brain-encoding for?
Researchers, ML engineers, and solo builders working on in-silico neuroscience, fMRI encoding, or multimodal stimulus→brain prediction with TRIBE v2.
When should I use tribev2-brain-encoding?
Use it during Build when you predict brain responses to video, train or fine-tune encoding models, or need tribev2 inference from agent-driven Python workflows.
Is tribev2-brain-encoding safe to install?
Review the Security Audits panel on this Prism page before installing; weights pull from HuggingFace and installs may need network, shell, and local cache paths.
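Because installs touch the network and local cache paths, a quick preflight check before the first download can be useful. The sketch below uses only the Python standard library. The `preflight` name and the 20 GB default are placeholders, not requirements stated by the skill. `HF_TOKEN` is the environment variable read by `huggingface_hub`; a token saved through `huggingface-cli login` is not detected here.

```python
import os
import shutil
import tempfile
from pathlib import Path


def preflight(cache_folder: str, min_free_gb: float = 20.0) -> dict:
    """Report whether cache_folder looks usable for a weight download.

    ASSUMPTION: min_free_gb is an illustrative threshold, not a documented one.
    """
    path = Path(cache_folder)
    path.mkdir(parents=True, exist_ok=True)

    # Writable check: create (and automatically delete) a temporary file.
    try:
        with tempfile.NamedTemporaryFile(dir=path):
            writable = True
    except OSError:
        writable = False

    free_gb = shutil.disk_usage(path).free / 1e9
    return {
        "cache_folder": str(path.resolve()),
        "writable": writable,
        "free_gb": round(free_gb, 1),
        "enough_space": free_gb >= min_free_gb,
        # Only detects a token passed through the environment.
        "hf_token_set": bool(os.environ.get("HF_TOKEN")),
    }
```

Running `preflight("./cache")` before `TribeModel.from_pretrained` gives the agent a chance to stop early if the cache folder is read-only or the disk is nearly full.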
SKILL.md
# TRIBE v2 Brain Encoding Model

> Skill by [ara.so](https://ara.so) - Daily 2026 Skills collection

TRIBE v2 is Meta's multimodal foundation model that predicts fMRI brain responses to naturalistic stimuli (video, audio, text). It combines LLaMA 3.2 (text), V-JEPA2 (video), and Wav2Vec-BERT (audio) encoders into a unified Transformer architecture that maps multimodal representations onto the cortical surface (fsaverage5, ~20k vertices).

## Installation

```bash
# Inference only
pip install -e .

# With brain visualization (PyVista & Nilearn)
pip install -e ".[plotting]"

# Full training dependencies (PyTorch Lightning, W&B, etc.)
pip install -e ".[training]"
```

## Quick Start: Inference

### Load pretrained model and predict from video

```python
from tribev2 import TribeModel

# Load from HuggingFace (downloads weights to cache)
model = TribeModel.from_pretrained("facebook/tribev2", cache_folder="./cache")

# Build events dataframe from a video file
df = model.get_events_dataframe(video_path="path/to/video.mp4")

# Predict brain responses
preds, segments = model.predict(events=df)
print(preds.shape)  # (n_timesteps, n_vertices) on fsaverage5
```

### Multimodal input: video + audio + text

```python
from tribev2 import TribeModel

model = TribeModel.from_pretrained("facebook/tribev2", cache_folder="./cache")

# All modalities together (text is auto-converted to speech and transcribed)
df = model.get_events_dataframe(
    video_path="path/to/video.mp4",
    audio_path="path/to/audio.wav",  # optional, overrides video audio
    text_path="path/to/script.txt",  # optional, auto-timed
)

preds, segments = model.predict(events=df)
print(preds.shape)  # (n_timesteps, n_vertices)
```

### Text-only prediction

```python
from tribev2 import TribeModel

model = TribeModel.from_pretrained("facebook/tribev2", cache_folder="./cache")

df = model.get_events_dataframe(text_path="path/to/narration.txt")
preds, segments = model.predict(events=df)
```

## Brain Visualization

```python
from tribev2 import TribeModel
from tribev2.plotting import plot_brain_surface

model = TribeModel.from_pretrained("facebook/tribev2", cache_folder="./cache")
df = model.get_events_dataframe(video_path="path/to/video.mp4")
preds, segments = model.predict(events=df)

# Plot a single timepoint on the cortical surface
plot_brain_surface(preds[0], backend="nilearn")  # or backend="pyvista"
```

## Training a Model from Scratch

### 1. Set environment variables

```bash
export DATAPATH="/path/to/studies"
export SAVEPATH="/path/to/output"
export SLURM_PARTITION="your_slurm_partition"
```

### 2. Authenticate with HuggingFace (required for LLaMA 3.2)

```bash
huggingface-cli login
# Paste a HuggingFace read token when prompted
# Request access at: https://huggingface.co/meta-llama/Llama-3.2-3B
```

### 3. Local test run

```bash
python -m tribev2.grids.test_run
```

### 4. Full grid search on Slurm

```bash
# Cortical surface model
python -m tribev2.grids.run_cortical

# Subcortical regions
python -m tribev2.grids.run_subcortical
```

## Key API: TribeModel

```python
from tribev2 import TribeModel

# Load pretrained weights
model = TribeModel.from_pretrained(
    "facebook/tribev2",
    cache_folder="./cache"  # local cache for HuggingFace weights
)

# Build events dataframe (word-level timings, chunking, etc.)
df = model.get_events_dataframe(
    video_path=None,  # str path to .mp4
    audio_path=None,  # str path to .wav
    text_path=None,   # str path to .txt
)

# Run prediction
preds, segments = model.pr