Conversational agentic AI for pan-cancer immunotherapy target discovery.
Query the ImmunoVerse atlas of 28,446 antigens across 21 tumor types in plain English. Every claim cross-checked against IEDB, ClinicalTrials.gov, and RCSB PDB — with anti-hallucination validation built into every response.
Example query: Top 5 therapeutic targets for neuroblastoma
"Top targets for neuroblastoma," "verify QYNPIRTTF in IEDB," "fetch 8EK5 and analyze binding." No SQL, no JSON, no manual joins.
An LLM orchestrates 55+ tools — atlas filtering, IEDB API, ClinicalTrials.gov, RCSB PDB search/fetch, 3D interaction scoring, TesOrAI cross-ref.
Numbers are pre-computed in Python, not invented by the LLM. A post-response checker cross-references every figure against tool output and regenerates if anything drifts.
No prose hallucinations. Every score, every sample count, every safety flag traces back to deterministic Python.
Top tumor-specific candidates (NBL):
1. QYNPIRTTF (PHOX2B): Score 15.30, TSI 6.09, 4/16 samples
2. SLLQHLIGL (MYCN): Score 11.42, TSI 4.85, 6/16 samples
3. AALLGLLAL (GPR133): Score 9.18, TSI 3.72, 3/16 samples
Move from complex SQL queries and CSV manipulation to conversational data exploration without losing technical rigor.
Numbers are computed deterministically in Python and passed to the LLM as structured facts. Every response is cross-checked before the user sees it.
Bridge identified targets to RCSB structures, 3D binding-interaction scoring, IEDB epitope evidence, and ClinicalTrials.gov — in a single conversation.
From atlas filtering all the way to a validated 3D binding analysis — without leaving the chat.
Composite scoring across tumor specificity (TSI), sample prevalence, MS evidence, surface abundance, normal-tissue safety, essentiality, and clinical-stage bonuses. Every score is interpretable down to its components.
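As a rough illustration of what "interpretable down to its components" can mean, the sketch below computes a weighted sum and returns each component's contribution alongside the total. The weights, the component key names, and the `composite_score` function are all hypothetical, chosen only for this example; the atlas's actual formula is not given here, and this sketch does not reproduce the scores shown elsewhere on this page.

```python
# Hypothetical weights for illustration only; the real weighting is not stated in this document.
WEIGHTS = {
    "tsi": 1.0,
    "sample_prevalence": 4.0,
    "ms_evidence": 0.5,
    "surface_abundance": 1.0,
    "normal_tissue_safety": 1.0,
    "essentiality": 1.0,
    "clinical_stage_bonus": 1.0,
}


def composite_score(components: dict[str, float]) -> tuple[float, dict[str, float]]:
    """Return (total, per-component contributions) for one candidate."""
    unknown = set(components) - set(WEIGHTS)
    if unknown:
        raise ValueError(f"unknown components: {sorted(unknown)}")
    # Components that are not supplied contribute 0.0, so the breakdown is always complete.
    contributions = {
        name: weight * components.get(name, 0.0) for name, weight in WEIGHTS.items()
    }
    return round(sum(contributions.values()), 2), contributions


# Example inputs: TSI and prevalence taken from the PHOX2B row, MS evidence assumed.
total, parts = composite_score(
    {"tsi": 6.09, "sample_prevalence": 4 / 16, "ms_evidence": 1.0}
)
for name, value in parts.items():
    print(f"{name:>22}: {value:.2f}")
print(f"{'total':>22}: {total:.2f}")
```

Returning the breakdown together with the total lets the chat layer quote each component directly instead of recomputing or paraphrasing it.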
Search and fetch crystal structures from RCSB. Score salt bridges, H-bonds, pi-pi, cation-pi, hydrophobic, and disulfide contacts. Render in 3Dmol.js with one-letter labels and distance dashes.
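To make the interaction scoring concrete, here is a minimal sketch of one contact type, salt bridges, detected by a simple distance rule between oppositely charged side-chain atoms. The `Atom` record, the atom sets, and the 4.0 Å cutoff are assumptions for illustration (a commonly used criterion), not the tool's actual parameters; the other contact types would need their own atom selections, cutoffs, and geometry checks.

```python
import math
from typing import NamedTuple


class Atom(NamedTuple):
    chain: str
    resname: str
    resid: int
    name: str
    xyz: tuple[float, float, float]


# Charged side-chain atoms (assumed selection for this sketch).
POSITIVE = {("LYS", "NZ"), ("ARG", "NE"), ("ARG", "NH1"), ("ARG", "NH2")}
NEGATIVE = {("ASP", "OD1"), ("ASP", "OD2"), ("GLU", "OE1"), ("GLU", "OE2")}
SALT_BRIDGE_CUTOFF = 4.0  # angstroms; assumed, not taken from the tool


def find_salt_bridges(peptide, receptor, cutoff=SALT_BRIDGE_CUTOFF):
    """Return (peptide_atom, receptor_atom, distance) for each charged pair within cutoff."""
    bridges = []
    for a in peptide:
        for b in receptor:
            key_a, key_b = (a.resname, a.name), (b.resname, b.name)
            opposite = (key_a in POSITIVE and key_b in NEGATIVE) or (
                key_a in NEGATIVE and key_b in POSITIVE
            )
            if not opposite:
                continue
            distance = math.dist(a.xyz, b.xyz)
            if distance <= cutoff:
                bridges.append((a, b, round(distance, 2)))
    return bridges


# Toy coordinates, not from a real structure.
peptide = [Atom("C", "ARG", 5, "NH1", (0.0, 0.0, 0.0))]
receptor = [
    Atom("A", "ASP", 77, "OD1", (3.0, 0.0, 0.0)),
    Atom("A", "GLU", 63, "OE1", (6.0, 0.0, 0.0)),
]
for a, b, d in find_salt_bridges(peptide, receptor):
    print(f"{a.resname}{a.resid}:{a.name} to {b.resname}{b.resid}:{b.name} at {d} A")
```

The returned distances are the kind of value a viewer could draw as a dashed line between the two atoms.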
An LLM orchestrates 55+ tools across pre-computed atlases and live external APIs. The model never invents numbers — it composes natural prose around facts produced by deterministic Python.
LiteLLM routes through gpt-4o-mini → gpt-4o → gemini-2.0-flash → gemini-2.5-flash → gpt-4-turbo. Dynamic max_tokens (2048/3072/4096) scales with result size.
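One plausible reading of the model list above is an ordered fallback chain: try the first model, and move to the next only if the call fails. The sketch below shows that pattern in plain Python. The `call_model` parameter stands in for the real LiteLLM call, and the row-count thresholds in `pick_max_tokens` are invented for this example; only the model names and the three token budgets come from the text above.

```python
from typing import Callable, Sequence

MODEL_CHAIN = [
    "gpt-4o-mini",
    "gpt-4o",
    "gemini-2.0-flash",
    "gemini-2.5-flash",
    "gpt-4-turbo",
]


def pick_max_tokens(n_result_rows: int) -> int:
    """Scale the token budget with result size (thresholds are hypothetical)."""
    if n_result_rows <= 10:
        return 2048
    if n_result_rows <= 50:
        return 3072
    return 4096


def complete_with_fallback(
    prompt: str,
    call_model: Callable[[str, str, int], str],
    n_result_rows: int,
    models: Sequence[str] = MODEL_CHAIN,
) -> tuple[str, str]:
    """Try each model in order; return (model_name, response) from the first success."""
    max_tokens = pick_max_tokens(n_result_rows)
    errors = []
    for model in models:
        try:
            return model, call_model(model, prompt, max_tokens)
        except Exception as exc:  # any provider error moves on to the next model
            errors.append(f"{model}: {exc}")
    raise RuntimeError("all models failed: " + "; ".join(errors))
```

Passing the model call in as a function keeps the routing logic testable without network access or API keys.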
generate_response() → tool calls → dispatch → results → post-response validator. The validator cross-checks every number in the draft against tool output and regenerates the response if mismatches are found. The trace logger writes JSONL events with code hashing for reproducibility.
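The validate-and-regenerate loop can be sketched as follows. This is a simplified illustration, not the project's implementation: it checks only decimal figures (such as scores), the function names and the `generate` callback are hypothetical, and the "code hash" here is a SHA-256 of the validator's bytecode, standing in for whatever hashing the real trace logger uses.

```python
import hashlib
import io
import json
import re
import time
from typing import Callable

DECIMAL_RE = re.compile(r"\d+\.\d+")


def find_unsupported_numbers(response: str, tool_results: list) -> list[str]:
    """Return decimal figures in the response that do not appear in the tool output."""
    allowed = {float(n) for n in DECIMAL_RE.findall(json.dumps(tool_results))}
    return [n for n in DECIMAL_RE.findall(response) if float(n) not in allowed]


def log_event(trace: io.TextIOBase, event: str, **fields) -> None:
    """Append one JSON object per line, tagged with a hash of the validator code."""
    code_hash = hashlib.sha256(find_unsupported_numbers.__code__.co_code).hexdigest()[:12]
    record = {"ts": time.time(), "event": event, "code_hash": code_hash, **fields}
    trace.write(json.dumps(record) + "\n")


def answer_with_validation(
    generate: Callable[[list, list], str],
    tool_results: list,
    trace: io.TextIOBase,
    max_attempts: int = 3,
) -> str:
    """Regenerate until every decimal figure in the draft matches the tool output."""
    unsupported: list[str] = []
    for attempt in range(1, max_attempts + 1):
        # The previous attempt's mismatches are passed back as feedback.
        response = generate(tool_results, unsupported)
        unsupported = find_unsupported_numbers(response, tool_results)
        log_event(trace, "validated", attempt=attempt, unsupported=unsupported)
        if not unsupported:
            return response
    raise ValueError(f"unsupported numbers after {max_attempts} attempts: {unsupported}")
```

Comparing figures as floats means a draft that says "15.30" still matches a tool value of 15.3, while a drifted "15.80" is flagged and triggers regeneration.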
Peptide atlas, HLA frequencies, interpretation hints, sample coverage.
User uploads, batch merge, cross-ref against NeoVerse + tumor-specific datasets.
RCSB search/fetch, interaction scoring, PyMOL script generation.
OptiType + PDX disambiguation pipelines on HPC clusters.
Live epitope evidence — T-cell assays, MHC alleles, qualitative results, PubMed refs.
Gene-level trial data. Agent explicitly distinguishes gene-level from peptide-level evidence.
Sequence-motif search for short peptides, metadata enrichment, on-demand downloads.
Searching ImmunoVerse atlas… Found 3 shared candidates. Validating against GTEx and HLA Ligand Atlas…
1. PHOX2B (QYNPIRTTF): Score 15.30, TSI 6.09
2. MYCN (SLLQHLIGL): Score 11.42, TSI 4.85
3. GPR133 (AALLGLLAL): Score 9.18, TSI 3.72
PHOX2B is the strongest candidate. Want to check MHC-I binding for HLA-A*24:02 or search RCSB for a crystal structure?
@article{Li2025ImmunoVerse,
title = {ImmunoVerse: A pan-cancer atlas of constitutive and
induced antigen presentation refines immunotherapeutic
target discovery},
author = {Li, G. and Guzm{\'a}n-Bringas, L. and Sharma, A. and others},
journal= {bioRxiv},
year = {2025},
doi = {10.1101/2025.01.22.634237},
url = {https://www.biorxiv.org/content/10.1101/2025.01.22.634237v2.full}
}
Read on bioRxiv
Anti-hallucinating conversational AI for immunotherapy target discovery.
Launch ImmunoVerse-Chat