Basic variant consequence annotation
- VEP - Variant Effect Predictor, release 112 (GENCODE v46 as gene reference database; v19 for GRCh37)
In silico predictions of the effect of coding variants
- dbNSFP - database of non-synonymous functional predictions (v4.8, June 2024)
Variant frequency databases
- gnomAD - germline variant frequencies exome-wide (r2.1, October 2018)
- dbSNP - database of short genetic variants (build 154)
- Cancer Hotspots - a resource for statistically significant mutations in cancer (v2, 2017)
- TCGA - somatic mutations discovered across 33 tumor type cohorts (release 41.0, August 2024)
Protein domains/functional features
- UniProt/SwissProt KnowledgeBase - resource on protein sequence and functional information (2024_04)
- Pfam - database of protein families and domains (v37.0)
Knowledge resources on gene and protein targets
- CancerMine - literature-mined database of tumor suppressor genes/proto-oncogenes (v50, March 2023)
- Open Targets Platform - database on disease-target associations, molecularly targeted drugs and tractability, aggregated from multiple sources (literature, pathways, mutations) (2024.06)
Notes on variant annotation datasets
Data quality
Genomic biomarkers
Genomic biomarkers utilized in PCGR are currently limited to the following:
- In CIViC, only accepted evidence items for specific markers are included (submitted evidence items are not considered or shown)
- Markers reported at the exact variant level (e.g. BRAF p.V600E, MET c.3028+1G>T, g.7:140753336A>T)
- Markers reported at the codon level (e.g. KRAS p.G12)
- Markers reported at the exon level (e.g. KIT exon 11 mutation, EGFR exon 19 deletion)
- Markers reported at the gene level (e.g. BRAF mutation, TP53 loss-of-function mutation, BRCA1 oncogenic mutation)
- Within the Cancer bioMarkers database (CGI), only biomarkers curated from FDA/NCCN guidelines, scientific literature, and clinical trials are included (biomarkers collected from conference abstracts etc. are not included)
- Copy number gains/losses
- RNA fusion and gene expression biomarkers are included in the PCGR reference databundle, but are not currently utilized in the PCGR biomarker matching procedure
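The four resolution levels listed above (exact variant, codon, exon, gene) can be illustrated with a minimal Python sketch. This is not PCGR's actual matching procedure: the `Variant` and `Biomarker` classes, the `codon_of` helper and the `match_biomarkers` function are all hypothetical names introduced for illustration, and the sketch assumes protein changes are given in one-letter HGVSp notation. It also ignores qualifiers such as "loss-of-function" or "oncogenic" on gene-level markers, as well as copy number biomarkers.

```python
import re
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical data model, for illustration only (not PCGR's internal schema).
@dataclass(frozen=True)
class Variant:
    gene: str
    hgvsp: Optional[str] = None  # assumed one-letter notation, e.g. "p.G12D"
    exon: Optional[int] = None   # exon number affected by the variant

@dataclass(frozen=True)
class Biomarker:
    gene: str
    level: str                   # "variant", "codon", "exon" or "gene"
    hgvsp: Optional[str] = None  # set for variant-level markers, e.g. "p.V600E"
    codon: Optional[str] = None  # set for codon-level markers, e.g. "p.G12"
    exon: Optional[int] = None   # set for exon-level markers, e.g. 11

# Most specific level first; used to order the reported matches.
LEVEL_ORDER = ["variant", "codon", "exon", "gene"]

_CODON_RE = re.compile(r"^(p\.[A-Z]\d+)")

def codon_of(hgvsp: Optional[str]) -> Optional[str]:
    """Reduce a protein change such as 'p.G12D' to its codon, 'p.G12'."""
    m = _CODON_RE.match(hgvsp or "")
    return m.group(1) if m else None

def matches(variant: Variant, marker: Biomarker) -> bool:
    """Check whether a variant satisfies a biomarker at the marker's own level."""
    if variant.gene != marker.gene:
        return False
    if marker.level == "variant":
        return variant.hgvsp is not None and variant.hgvsp == marker.hgvsp
    if marker.level == "codon":
        return marker.codon is not None and codon_of(variant.hgvsp) == marker.codon
    if marker.level == "exon":
        return variant.exon is not None and variant.exon == marker.exon
    # Gene-level markers match any variant in the gene (qualifiers ignored here).
    return marker.level == "gene"

def match_biomarkers(variant: Variant, markers: List[Biomarker]) -> List[Biomarker]:
    """Return all matching biomarkers, most specific resolution first."""
    hits = [m for m in markers if matches(variant, m)]
    return sorted(hits, key=lambda m: LEVEL_ORDER.index(m.level))
```

Under these assumptions, a KRAS p.G12D variant would match a "KRAS p.G12" codon-level marker and a "KRAS mutation" gene-level marker, with the codon-level match listed first, while a BRAF p.V600K variant would not match a variant-level "BRAF p.V600E" marker.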