Frequently asked questions regarding PCGR usage and functionality:

1. I do not see any data related to allelic depth/support in my report. I thought that PCGR can grab this information automatically from my VCF?

Answer: VCF variant genotype data (i.e. AD/DP) is something that you as a user need to specify explicitly when running PCGR. In our experience, variant callers do not format these types of data (allelic fraction/depth, tumor/normal) in a uniform way in the VCF, which makes it very challenging for PCGR to grab this information automatically from any VCF. Please take a careful look at the example VCF files (examples folder) that come with PCGR to see how PCGR expects this information to be formatted, and make sure your VCF is formatted accordingly. There is also an in-depth explanation on the matter described here.

2. Is it possible to utilize PCGR for analysis of multiple samples?

Answer: As the name of the tool implies, PCGR was developed for the detailed analysis of individual tumor samples. However, if you take advantage of the different outputs from PCGR, it can also be used for analysis of multiple samples. First, make sure your input files are organized per sample (i.e. one VCF file per sample, one CNA file per sample), so that they can be fed directly to PCGR. Once all samples have been processed with PCGR, note that all the tab-separated output files (i.e. annotated SNVs, gene copy numbers, fusions) contain the sample identifier, which allows them to be aggregated for downstream multi-sample analysis. Also note the multi-sheet Excel workbook, which contains numerous outputs from PCGR and can be processed to aggregate findings across samples.
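The aggregation step described above can be sketched in a few lines of Python. This is a minimal illustration, not part of PCGR: the column name SAMPLE_ID and the assumption that the files are uncompressed TSVs with a header line are ours, so check them against your actual output files.

```python
import csv


def aggregate_tsv(paths, sample_column="SAMPLE_ID"):
    """Concatenate per-sample tab-separated files into one list of row dicts.

    `sample_column` is a hypothetical column name; adjust it to the
    sample identifier column found in your PCGR output.
    """
    rows = []
    for path in paths:
        with open(path, newline="", encoding="utf-8") as handle:
            reader = csv.DictReader(handle, delimiter="\t")
            # Fail early if a file lacks the sample identifier column.
            if sample_column not in (reader.fieldnames or []):
                raise ValueError(f"{path}: missing column {sample_column!r}")
            rows.extend(reader)
    return rows


def count_by_sample(rows, sample_column="SAMPLE_ID"):
    """Count aggregated rows per sample identifier."""
    counts = {}
    for row in rows:
        sample = row[sample_column]
        counts[sample] = counts.get(sample, 0) + 1
    return counts
```

Calling aggregate_tsv with the list of per-sample files of one output type (for example, all annotated SNV files) gives a single table in which the sample identifier column keeps track of each row's origin.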

3. I do not see the expected transcript-specific consequence for a particular variant. In what way is the primary variant consequence established?

Answer: PCGR relies on VEP for consequence prioritization, in which one transcript-specific consequence is chosen as the primary variant consequence. In the PCGR configuration file, you can customise this choice by changing the order of the criteria applied when picking the primary consequence block (parameter vep_pick_order).

4. Is it possible to use RefSeq as the underlying gene transcript model in PCGR?

Answer: PCGR uses GENCODE as the primary gene transcript model, but we provide cross-references to the corresponding RefSeq transcripts when these are available.

5. I have a VCF with structural variants (SVs) detected in my tumor sample, can PCGR process those as well?

Answer: This is currently not supported as input for PCGR, but is something we want to incorporate in the future. We have a skeleton of SV support working for CPSR, focusing on large, multi-exon deletions.

6. Is it possible to see all the individual cancer subtypes that belong to each of the 30 different tumor sites?

Answer: Yes, see an overview of phenotypes associated with primary tumor sites. See also the related GitHub repository phenOncoX.

7. Is it possible for the users to update the data bundle to get the most recent versions of all underlying data sources?

Answer: As of now, the data bundle is updated only with each release of PCGR. The data harmonization pipeline for the knowledge databases in PCGR contains numerous complex procedures, with several cleaning, quality control, and re-formatting steps, and is semi-automated in its present form. The versions of all databases and key software elements are outlined in each PCGR report.

8. When OncoKB is enabled, I sometimes see variants where PCGR’s internal LOSS_OF_FUNCTION flag disagrees with OncoKB’s mutation effect (e.g. OncoKB says “Likely Loss-of-function” but LOSS_OF_FUNCTION is FALSE, or vice versa). Why?

Answer: The two annotations are derived from independent evidence sources and should be interpreted as complementary rather than contradictory. PCGR’s internal LOSS_OF_FUNCTION flag is rule-based: it fires on variant consequence types that are mechanistically expected to disrupt gene function (stop-gained, frameshift, splice site disruption assessed by MaxEntScan, etc.), irrespective of any curated knowledge about the specific variant. OncoKB’s mutation effect annotation, by contrast, is manually curated and may draw on functional assay data, structural evidence, or recurrence in cancer cohorts. It can therefore assign loss-of-function status to variants (e.g. certain missense changes) that PCGR’s consequence-based logic would not flag, and conversely it may lack an entry for a variant that PCGR’s rules would classify as LoF. When OncoKB is active and provides a mutation effect, that annotation is considered in the two-hit candidate detection logic alongside the internal flag: either source can qualify a variant as a LoF hit. For detailed interpretation, cross-check both columns in the SNV/InDel table.
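To make the "either source can qualify" rule concrete, here is a minimal Python sketch of how the two annotations could be combined. It illustrates the logic as described in this answer, not PCGR's actual code: the function name and the set of OncoKB labels treated as loss-of-function are assumptions.

```python
# Assumed OncoKB mutation effect labels that count as loss-of-function;
# verify against the values present in your own output.
OKB_LOF_EFFECTS = {"Loss-of-function", "Likely Loss-of-function"}


def is_lof_hit(internal_lof, oncokb_effect=None):
    """Return True if either evidence source marks the variant as LoF.

    internal_lof  -- value of the rule-based LOSS_OF_FUNCTION flag
    oncokb_effect -- OncoKB mutation effect, or None if OncoKB is
                     disabled or has no entry for the variant
    """
    if internal_lof:
        return True
    return oncokb_effect in OKB_LOF_EFFECTS
```

Under this sketch, a missense variant with LOSS_OF_FUNCTION set to FALSE but an OncoKB effect of "Likely Loss-of-function" still counts as a LoF hit, and so does a frameshift variant that OncoKB has not curated.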

9. When running PCGR with OncoKB enabled but without specifying --oncokb_oncotree_code, I see a variant assigned OncoKB level 3B (e.g. for Bladder Cancer). If I re-run with a more specific OncoTree code (e.g. Urethral Urothelial Carcinoma), the same variant is assigned level 1. Why does a less specific tumor type yield a lower evidence level?

Answer: OncoKB evidence levels are tied to the specific tumor type context in which a biomarker–drug association has been validated. When PCGR derives the OncoTree code from --tumor_site alone (without --oncokb_oncotree_code), it maps to a broad, site-level code (e.g. BLCA for Bladder Cancer). If the highest-confidence evidence in OncoKB is annotated under a more specific subtype (e.g. Urethral Urothelial Carcinoma), the broad code will not match that entry and OncoKB instead returns a lower level reflecting the closest match it can find. Providing a more specific OncoTree code via --oncokb_oncotree_code allows OncoKB to resolve the exact subtype match and return the correct, higher evidence level. For tumor types where subtype granularity matters clinically, we therefore recommend explicitly setting --oncokb_oncotree_code rather than relying on the site-level default. A full list of OncoTree codes is available at oncotree.mskcc.org.

10. Does PCGR support the ESMO Scale for Clinical Actionability of Molecular Targets (ESCAT)?

Answer: ESCAT is not currently implemented in PCGR, primarily because an automated implementation is far from trivial. The ESCAT guidelines have relatively low specificity in certain areas, often leaving room for subjective judgements (e.g. “clinically meaningful improvement of a survival endpoint in prospective, randomised clinical trials”). This is reflected in a relatively low inter-rater institutional agreement of ESCAT-based variant rankings (see Lebedeva et al., Ann Oncol 2024). We continue to monitor developments in this space (e.g. Kordes et al., medRxiv 2026) and hope to offer ESCAT support in forthcoming releases.

11. I notice that PCGR’s internal oncogenicity classification sometimes assigns a variant “Likely Oncogenic” while OncoKB labels the same variant “Oncogenic”. Why does this discrepancy occur, given that PCGR uses OncoKB data as one of its inputs?

Answer: PCGR’s internal oncogenicity classification implements the joint VICC/CGC/ClinGen guidelines, which combine evidence from multiple sources and weigh each source according to a defined rule-based scoring framework. Crucially, when available, OncoKB data feeds into that framework as one of several sources (e.g. hotspot status, functional impact, population frequency), and the resulting score may not reach the threshold required for a definitive “Oncogenic” call even when OncoKB itself has curated the variant as such. OncoKB’s own classifications, on the other hand, are based on manual expert curation that can integrate functional assay results, structural biology evidence, and broader literature context in ways that the rule-based VICC/CGC/ClinGen scheme cannot fully capture. PCGR’s classification thus tends to be more conservative: a variant that a curator has confidently labelled “Oncogenic” in OncoKB may accumulate only enough evidence to reach “Likely Oncogenic” under the VICC/CGC/ClinGen scoring scheme. Both annotations are reported in PCGR (columns ONCOGENICITY for the internal call and ONCOGENICITY_OKB when OncoKB is enabled), and users are encouraged to treat them as complementary.
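The threshold effect can be illustrated with a small Python sketch of a point-based classifier. The cutoffs below follow the point ranges of the published VICC/CGC/ClinGen scheme as we understand them; PCGR's implementation may calibrate them differently, and the per-criterion weights in the example are purely hypothetical, so treat all numbers as assumptions.

```python
def classify_oncogenicity(points):
    """Map a summed evidence score to a five-tier oncogenicity label.

    Cutoffs are assumptions based on the published point-based scheme,
    not values taken from PCGR's source code.
    """
    if points >= 10:
        return "Oncogenic"
    if points >= 6:
        return "Likely Oncogenic"
    if points >= 0:
        return "VUS"
    if points >= -6:
        return "Likely Benign"
    return "Benign"


# Hypothetical evidence items and weights for a single variant.
example_evidence = {
    "curated_as_oncogenic": 4,
    "mutation_hotspot": 4,
}
example_score = sum(example_evidence.values())
example_call = classify_oncogenicity(example_score)
```

In this example the two evidence items add up to 8 points, which falls short of the 10 points needed for "Oncogenic", so the variant is reported as "Likely Oncogenic" even though a curated source supports it.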

12. In the Excel workbook, the biomarker sheets contain a BM_ACTIONABILITY_SUPPORT column with values such as tier-defining or additional. Does this mean that prognostic, diagnostic, and resistance markers also influence variant tiering?

Answer: No. Variant tiering in PCGR is (as of v2.3.0) driven exclusively by treatment sensitivity evidence: the TIER column values T1, T2, T3, and T4 reflect only drug sensitivity biomarkers. The BM_ACTIONABILITY_SUPPORT column (tier-defining / additional) describes the role of a given biomarker within its own evidence category: for example, a prognostic biomarker may be the strongest (tier-defining) or a corroborating (additional) piece of evidence for that category. Prognostic (PP1/PB1), diagnostic (D1/D2), and resistance (R1/R2) markers are listed in the biomarker sheets for completeness and transparency, but they do not contribute to the T1-T4 variant rank. To identify which biomarkers actually drove the tier assignment of a given variant, focus on rows where TIER is T1-T3 and the evidence type is treatment sensitivity.
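The row selection suggested at the end of this answer can be sketched as follows, reading the tier range as T1, T2, or T3. The sketch assumes the biomarker sheet has been loaded as a list of dictionaries (one per row); the column name EVIDENCE_TYPE and the label "treatment sensitivity" are hypothetical placeholders for whatever your workbook uses, while TIER and BM_ACTIONABILITY_SUPPORT are the column names mentioned above.

```python
TIER_DRIVING_TIERS = {"T1", "T2", "T3"}


def tier_driving_rows(rows,
                      evidence_column="EVIDENCE_TYPE",
                      sensitivity_label="treatment sensitivity"):
    """Keep biomarker rows that can have driven the variant tier.

    `evidence_column` and `sensitivity_label` are hypothetical names;
    replace them with the column and value found in your workbook.
    """
    selected = []
    for row in rows:
        in_tier = row.get("TIER") in TIER_DRIVING_TIERS
        is_sensitivity = row.get(evidence_column) == sensitivity_label
        if in_tier and is_sensitivity:
            selected.append(row)
    return selected
```

Prognostic, diagnostic, and resistance rows are dropped by this filter regardless of their BM_ACTIONABILITY_SUPPORT value, which mirrors the point that they are listed for transparency only.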