v1.3.0

Changes

  • pcgr_summarise.py: prioritize protein-coding BIOTYPE csq (pr201)
  • cpsr.py: expose --pcgrr_conda option to flexibly activate pcgrr env by a non-default pcgrr name
  • docs: update input.Rmd, running.Rmd
  • cpsr_validate_input.py: refactor for efficient custom gene egrep
  • code reformat via autopep8 for annoutils.py, pcgr_vcfanno.py
  • GitHub Actions:
    • bump docker actions setup-buildx-action (v1–v2), build-push-action (v2–v4)
    • use miniforge-variant instead of mamba-version: "*"
    • replace ::set-output since deprecated

v1.2.0

Changes

  • Keep only autosomal, X, Y, M/MT chromosomes
  • Import bcftools as dependency
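
The chromosome restriction in v1.2.0 can be illustrated with a minimal Python sketch. This is not PCGR's own implementation, which is not shown in this changelog. The helper names and the optional "chr" prefix handling are assumptions made for illustration.

```python
import re

# Assumption: contig names may or may not carry a "chr" prefix.
# Retained contigs: autosomes 1-22, X, Y, and M/MT.
_RETAINED = re.compile(r"(chr)?([1-9]|1[0-9]|2[0-2]|X|Y|MT?)")


def keep_chromosome(contig: str) -> bool:
    """Return True if the contig is an autosome, X, Y, or M/MT."""
    return _RETAINED.fullmatch(contig) is not None


def filter_vcf_lines(lines):
    """Yield header lines unchanged and records on retained contigs only."""
    for line in lines:
        if line.startswith("#"):
            yield line  # header lines are passed through as-is
        elif keep_chromosome(line.split("\t", 1)[0]):
            yield line
```

Records on unplaced or alternate contigs (for example GL000192.1) are dropped by this sketch, while header lines pass through untouched.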

v1.1.0

Changes

  • Remove Docker command wrappers (note: this does not remove the Docker functionality from PCGR; instead it removes the legacy wrappers that were created in the original PCGR version). This, along with a number of other general changes, is summarised in pr193. Of note:
    • --no_docker and --docker_uid CLI arguments are now obsolete.
    • --version CLI argument added for pcgr/cpsr.py
    • declutter repetitive log messages
    • refactor pcgr/cpsr.py script
  • Update documentation and declutter logging; refactor dict creation (pr192).
  • Minor refactor (pr194):
    • switch to using Python’s native os.remove and os.rename for glob cleanup
    • keep decompressed VCF only if --vcf2maf option is specified. The vcf2maf tool does not support compressed VCFs - see issue235.
  • Fix for CLI argument --cna_overlap_pct (pr196).

v1.0.3

  • Date: 2022-05-24
Fixed
  • Bug in clinical trials sorting, #191

v1.0.2

  • Date: 2022-03-30
Fixed
  • JSON output for CPSR, #44

v1.0.1

  • Date: 2022-03-09
Fixed
  • Writing to JSON crashes when the input VCF is huge (variants in the order of millions). If the raw input set (VCF) contains > 500,000 variants, this set will, prior to reporting, be reduced by
      1. exclusion of intergenic and intronic variants, and
      2. exclusion of upstream_gene/downstream_gene variants (if the variant set is still above 500,000 after step 1)
  • Bug in signature analysis for cases where the input variant set fits to > 18 different aetiologies.
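
The two-step reduction of very large variant sets described in the v1.0.1 fix can be sketched as follows. This is an illustrative Python sketch, not the code used by PCGR. The dictionary layout, the "&"-separated consequence string, and the rule that a variant is dropped only when all of its consequence terms fall in the excluded group are assumptions.

```python
MAX_VARIANTS = 500_000  # threshold quoted in the changelog entry

# Assumption: VEP-style consequence terms, "&"-separated when several apply.
STEP_1_TERMS = {"intergenic_variant", "intron_variant"}
STEP_2_TERMS = {"upstream_gene_variant", "downstream_gene_variant"}


def _drop(variants, excluded_terms):
    """Drop variants whose consequence terms all belong to excluded_terms."""
    return [
        v for v in variants
        if not set(v["consequence"].split("&")) <= excluded_terms
    ]


def reduce_variant_set(variants, limit=MAX_VARIANTS):
    """Apply step 1, then step 2 only if the set is still above the limit."""
    if len(variants) <= limit:
        return variants
    reduced = _drop(variants, STEP_1_TERMS)
    if len(reduced) > limit:
        reduced = _drop(reduced, STEP_2_TERMS)
    return reduced
```

With this rule a variant annotated as splice_region_variant&intron_variant survives step 1, because only part of its annotation is intronic.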

v1.0.0

  • Date: 2022-02-25

  • Data updates: ClinVar, GWAS catalog, GENCODE, CIViC, CancerMine, KEGG, ChEMBL, Open Targets Platform, Disease Ontology, Experimental Factor Ontology

Added
  • Command-line options
    • VEP options
      • --vep_gencode_all - use all GENCODE transcripts during VEP annotation (not only the basic GENCODE set)
      • --prevalence_reference_signatures - set minimum prevalence (percent) for selection of reference signatures included in refitting procedure for a given tumor type
Changed
  • Complete restructure of Python and R components. Installation now relies on two separate conda packages, pcgr (Python component) and pcgrr (R component). Direct Docker support remains, with the Dockerfile simplified to rely exclusively on the installation of the above Conda packages.
Removed
  • VCF validation step. Feedback from users suggested that Ensembl’s vcf-validator was often too stringent so its use has been deprecated. The --no_vcf_validate option remains for backwards compatibility.

v0.9.2

  • Date: 2021-06-30

  • Data updates: ClinVar, GWAS catalog, CIViC, CancerMine, dbNSFP, KEGG, ChEMBL, Disease Ontology/EFO, Open Targets Platform, UniProt KB, GENCODE

  • Software upgrades: R v4.1, Bioconductor v3.13, VEP (104) ++

Changed
  • TOML-based configuration for PCGR is abandoned, all options to PCGR are now configured through command-line parameters
    • NOTE: We recommend turning on --show_noncoding and --vcf2maf (previously turned on by default in TOML). For tumor-only runs, we recommend including --exclude_dbsnp_nonsomatic and --exclude_nonexonic
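
The recommendation in the note above can be captured in a small Python helper that assembles the suggested switches. The function name and the tumor_only parameter are hypothetical; only the four flags come from this changelog. Required arguments such as input files and output directory are deliberately left out.

```python
def recommended_flags(tumor_only: bool) -> list:
    """Return the switches recommended in the v0.9.2 note.

    Hypothetical helper: it only collects flags named in the changelog
    and does not build a complete PCGR command line.
    """
    flags = ["--show_noncoding", "--vcf2maf"]
    if tumor_only:
        # Extra filters recommended for tumor-only runs
        flags += ["--exclude_dbsnp_nonsomatic", "--exclude_nonexonic"]
    return flags
```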
Added
  • Command-line options
    • Previously set in TOML file
      • Allelic support
        • --tumor_dp_tag
        • --tumor_af_tag
        • --control_dp_tag
        • --control_af_tag
        • --call_conf_tag
      • Tumor-only options
        • --maf_onekg_eur
        • --maf_onekg_amr
        • --maf_onekg_afr
        • --maf_onekg_eas
        • --maf_onekg_sas
        • --maf_onekg_global
        • --maf_gnomad_nfe
        • --maf_gnomad_asj
        • --maf_gnomad_fin
        • --maf_gnomad_oth
        • --maf_gnomad_amr
        • --maf_gnomad_afr
        • --maf_gnomad_eas
        • --maf_gnomad_sas
        • --maf_gnomad_global
        • --exclude_pon
        • --exclude_likely_het_germline
        • --exclude_likely_hom_germline
        • --exclude_dbsnp_nonsomatic
        • --exclude_nonexonic
      • --report_theme
      • --preserved_info_tags (previously custom_tags (TOML))
      • --show_noncoding (previously list_noncoding (TOML))
      • --vcfanno_n_proc (previously n_vcfanno_proc (TOML))
      • --vep_n_forks (previously n_vep_forks (TOML))
      • --vep_pick_order
      • --vep_no_intergenic (previously vep_skip_intergenic (TOML))
      • --vcf2maf
    • New options
      • --report_nonfloating_toc (NEW) - add the TOC at the top of the HTML report, not floating at the left of the document
      • --cpsr_report (NEW) - add a dedicated section in PCGR with main germline findings from CPSR analysis - (use the gzipped JSON output from CPSR as input)
      • --vep_regulatory (NEW) - append regulatory annotations to variants (TF binding sites etc.)
      • --include_artefact_signatures (NEW) - include sequencing artefacts in the reference collection of mutational signatures (COSMIC v3.2)
Fixed
  • Bug in writing (large) report contents to JSON (issue #118)
  • Bug (typo) in merge of clinical evidence items from different sources (CIVIC + CGI) (issue #126)
  • Bug in value box for number of (high-confident) kataegis events - rmarkdown (issue #122)
  • Bug in value box for tumor purity/ploidy - rmarkdown (issue #129)
Removed
  • Command-line options
    • --conf - TOML-based configuration file

v0.9.1

  • Date: 2020-11-30

  • Data updates:

    • ClinVar
    • GWAS catalog
    • CIViC
    • CancerMine
    • dbNSFP
    • KEGG
    • ChEMBL/DGIdb
    • Disease Ontology, Experimental Factor Ontology
Added
  • added possibility to configure algorithm for TMB calculation, optional argument tmb_algorithm - all coding variants (all_coding) or non-synonymous variants only (nonsyn)
  • R code subject to static analysis with lintr
  • Improved Conda recipe (i.e. meta.yaml) with version pinning of all package dependencies
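
The two tmb_algorithm settings introduced in v0.9.1 differ only in which variants are counted before dividing by the target size. The Python sketch below illustrates that idea under stated assumptions: the consequence groupings are illustrative and are not PCGR's definitions, and the function name is hypothetical.

```python
# Assumption: illustrative consequence groupings, not PCGR's own lists.
NONSYN = {
    "missense_variant", "stop_gained", "stop_lost", "start_lost",
    "frameshift_variant", "inframe_insertion", "inframe_deletion",
}
ALL_CODING = NONSYN | {"synonymous_variant"}


def estimate_tmb(consequences, target_size_mb, tmb_algorithm):
    """Return mutations per megabase for the chosen counting scheme."""
    if tmb_algorithm not in ("all_coding", "nonsyn"):
        raise ValueError("tmb_algorithm must be 'all_coding' or 'nonsyn'")
    if target_size_mb <= 0:
        raise ValueError("target_size_mb must be positive")
    counted = ALL_CODING if tmb_algorithm == "all_coding" else NONSYN
    n_variants = sum(1 for c in consequences if c in counted)
    return n_variants / target_size_mb
```

For the same call set, nonsyn can never give a higher estimate than all_coding, since it counts a subset of the same variants.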
Changed
  • Removed DisGeNET annotations from output (associations from Open Targets Platform serve same purpose)
  • Version pinning of software dependencies in Dockerfile:
    • All R packages necessary for PCGR are installed using the renv framework, ensuring improved versioning and reproducibility
    • Other tools/utilities and Python libraries that have been version pinned:
      • bedtools, samtools, numpy, cython, scipy, cyvcf2, toml, pandas

v0.9.0rc

  • Date: 2020-09-24

  • Data updates: ClinVar, GWAS catalog, GENCODE, CIViC, CancerMine, UniProt KB, dbNSFP, Pfam, KEGG, Open Targets Platform

  • Software updates: VEP 101

Fixed
  • An extra comma was mistakenly present in the template for tier 2 variants, issue #96
  • Missing protein domain annotations for grch38, issue #116
Changed
  • All arguments to pcgr.py are now non-positional
  • Arguments to pcgr.py are divided into two groups: required and optional
  • Options allelic_support:tumor_dp_min, allelic_support:tumor_af_min, allelic_support:control_dp_min, allelic_support:control_af_max in PCGR configuration file are now optional arguments --tumor_dp_min, --tumor_af_min, --control_dp_min, --control_af_max in cpsr.py
  • Option mutational_burden:mutational_burden in PCGR configuration file is now optional argument --estimate_tmb in pcgr.py
  • Option msi:msi in PCGR configuration file is now optional argument --estimate_msi_status in pcgr.py
  • Option mutational_signatures:mutational_signatures in PCGR configuration file is now optional argument --estimate_signatures in pcgr.py
  • Options mutational_signatures:mutsignatures_signature_limit, mutational_signatures:mutsignatures_normalization, mutational_signatures:mutsignatures_mutation_limit, mutational_signatures:mutsignatures_cutoff are removed (used for deconstructSigs analysis, which is no longer in use)
  • Optional argument --cna_overlap_pct in pcgr.py replaces cna:cna_overlap_pct in PCGR configuration file
  • Optional argument --logr_gain in pcgr.py replaces cna:logr_gain in PCGR configuration file
  • Optional argument --logr_homdel in pcgr.py replaces cna:logr_homdel in PCGR configuration file
  • Removed mutational_burden:tmb_low_limit and mutational_burden:tmb_intermediate_limit - TMB is no longer interpreted in the context of thresholds
  • Classifications of genes as tumor suppressors/oncogenes are now based on a combination of CancerMine citation count and presence in Network of Cancer Genes
  • Settings section of report is now divided into three:
    • Metadata - sample and sequencing assay
    • Report configuration
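
The move from configuration-file options to command-line arguments listed above can be summarised as a lookup table. The Python sketch below is a hypothetical migration helper, not part of PCGR. The section:key names and flags are taken from the entries above. Treating the three analysis toggles as boolean switches, and the remaining options as flag/value pairs, is an assumption.

```python
# "section:key" in the old configuration file -> command-line flag
VALUE_OPTIONS = {
    "allelic_support:tumor_dp_min": "--tumor_dp_min",
    "allelic_support:tumor_af_min": "--tumor_af_min",
    "allelic_support:control_dp_min": "--control_dp_min",
    "allelic_support:control_af_max": "--control_af_max",
    "cna:cna_overlap_pct": "--cna_overlap_pct",
    "cna:logr_gain": "--logr_gain",
    "cna:logr_homdel": "--logr_homdel",
}
# Assumption: these were booleans in the config and are bare switches on the CLI.
SWITCH_OPTIONS = {
    "mutational_burden:mutational_burden": "--estimate_tmb",
    "msi:msi": "--estimate_msi_status",
    "mutational_signatures:mutational_signatures": "--estimate_signatures",
}


def legacy_config_to_args(config):
    """Translate a parsed legacy config (nested dict) into CLI arguments.

    Keys without a listed replacement (e.g. removed options) are skipped.
    """
    args = []
    for section, options in config.items():
        for key, value in options.items():
            name = f"{section}:{key}"
            if name in VALUE_OPTIONS:
                args += [VALUE_OPTIONS[name], str(value)]
            elif name in SWITCH_OPTIONS and value:
                args.append(SWITCH_OPTIONS[name])
    return args
```

Removed options such as mutational_burden:tmb_low_limit have no entry in either table, so the helper drops them silently.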
Added
  • Optional argument --include_trials in pcgr.py - includes a section with annotated clinical trials for the tumor type in question
  • Optional argument --assay in pcgr.py - designates type of sequencing assay
  • Optional argument --cell_line in pcgr.py - designates runs of tumor cell lines (only for display, not used to configure any analysis)
  • Optional argument --min_mutations_signatures in pcgr.py - minimum number of required mutations for mutational signature analysis with MutationalPatterns
  • Optional argument --all_reference_signatures in pcgr.py - considers all reference signatures during fitting of mutational profile to known signatures
  • Optional argument --estimate_signatures now also includes detection of potential kataegis events (WGS/WES assays only), and rainfall plot in the flexdashboard output
  • The user can now distinguish (through color codes) whether a biomarker has been mapped exactly (nucleotide change) or at a regional level (codon/exon)
  • All variant-associated biomarkers (regardless of assignment to TIER 1/2) are now found in a new section (SNVs/InDels)
  • For copy number amplifications, other putative drug targets in cancer are listed in a new section
  • Detailed documentation of report contents is added to the Documentation section
  • References are updated and all provided with DOI

v0.8.4

  • Date: 2019-11-18

  • Data updates: ClinVar, CIViC, CancerMine, UniProt KB

  • Software updates: VEP 98.3

v0.8.3

  • Date: 2019-10-14

  • Data updates: ClinVar, GWAS catalog, GENCODE, CIViC, CancerMine

  • Software updates: VEP 98.2, vcf2tsv

Fixed
  • Further improved mapping between Ensembl transcripts and UniProt accessions (using also RefSeq accessions where available)
Added
  • Possibility to filter evidence items by RATING in interactive data tables
Changed
  • Option target_size_mb in pcgr.py replaces target_size_mb in configuration file, more convenient in terms of configuring runs
  • Option tumor_type in pcgr.py replaces tumor_type in configuration file

v0.8.2

  • Date: 2019-09-29

  • Data updates: ClinVar, GWAS catalog, GENCODE, DiseaseOntology, CIViC, CancerMine, UniProt KB

  • Software updates: VEP 97.3, vcfanno 0.3.2, LOFTEE (VEP plugin) 1.0.3

Fixed
  • Bug in concatenation of clinical evidence items from different sources (CIVIC + CBMDB) (issues #83,#87)
  • Silent variants that coincide with biomarkers reported at codon level are ignored
  • Distinction between clinical evidence items of different origins (somatic + germline)
  • Improved mapping between Ensembl transcripts and UniProt accessions (using also RefSeq accessions where available)
  • Bug in UpSetPlot for cases where filtering produces fewer than two intersecting sets
Added
  • New field ‘mane’ as criteria for pick order in configuration file (VEP)
  • Sample identifier to copy number annotation output (convenient for concatenation of output from multiple samples)
  • Capturing allelic depth (t_depth, t_ref_count etc.) in vcf2maf output (enhancement #52)
  • Option tumor_only in pcgr.py, replaces vcf_tumor_only in configuration file, more convenient in terms of configuration

v0.8.1

  • Date: 2019-05-22
Added
  • Cancer_NOS.toml as configuration file for unspecified tumor types

v0.8.0

  • Date: 2019-05-20
Fixed
  • Bug in value box for Tier 2 variants (new line carriage) Issue #73
Added
  • Upgraded VEP to v96
    • Skipping the --regulatory VEP option to avoid forking issues and to improve speed (See this issue)
    • Added option to configure pick-order for choice of primary transcript in configuration file
  • Pre-made configuration files for each tumor type in conf folder
  • Possibility to append a CNA plot file (.png format) to the section of the report with Somatic CNAs (previous feature request)
  • Added possibility to input estimates of tumor purity and ploidy
    • shown as value boxes in Main results
  • Tumor mutational burden is now compared with the distribution of TMB observed for TCGA’s cohorts (organized by primary site)
    • Default target size is now 34Mb (approx. estimate from exome-wide calculation of protein-coding parts of GENCODE)
  • Added flexibility for variant filtering in tumor-only input callsets
    • Added additional options to exclude likely germline variants (both require the tumor VAF tag to be correctly specified in the input VCF)
      • exclude_likely_hom_germline - removes any variant with an allelic fraction of 1 (100%) - very unlikely somatic event
      • exclude_likely_het_germline - removes any variant with
        • an allelic fraction between 0.4 and 0.6, and
        • presence in dbSNP + gnomAD, and
        • no presence as somatic event in COSMIC/TCGA
    • Added possibility to input PANEL-OF-NORMALS VCF - this is to support the many labs that have sequenced a database/pool of healthy controls. This set of variants is utilized in PCGR to improve the variant filtering when running in tumor-only mode. The PANEL-OF-NORMALS annotation works as follows:
      • all variants in the tumor that coincide with any variant listed in the PANEL-OF-NORMALS VCF are appended with a PANEL_OF_NORMALS flag in the query VCF with tumor variants.
      • If configuration parameter exclude_pon is set to True in tumor_only runs, all variants with a PANEL_OF_NORMALS flag are filtered/excluded
  • For tumor-only runs, added an UpSet plot showing how different filtering sources (gnomAD, 1KG Project, panel-of-normals etc) contribute in the germline filtering procedure
  • Variants in Tier 3 / Tier 4 / Noncoding are now sorted (and color-coded) according to the target (gene) association score to the cancer phenotype, as provided by the OpenTargets Platform
  • Added annotation of TCGA’s ten oncogenic signaling pathways
  • Added EXONIC_STATUS annotation tag (VCF and TSV)
    • exonic denotes all protein-altering AND canonical splice site altering AND synonymous variants, nonexonic denotes the complement
  • Added CODING_STATUS annotation tag (VCF and TSV)
    • coding denotes all protein-altering AND canonical splice site altering, noncoding denotes the complement
  • Added SYMBOL_ENTREZ annotation tag (VCF)
    • Official gene symbol from NCBI Entrez (SYMBOL provided by VEP can sometimes be non-official/alias (i.e. for GENCODE v19/grch37))
  • Added SIMPLEREPEATS_HIT annotation tag (VCF and TSV)
    • Variant overlaps UCSC simpleRepeat sequence repeat track - used for MSI prediction
  • Added WINMASKER_HIT annotation tag (VCF and TSV)
    • Variant overlaps UCSC windowmaskerSdust sequence repeat track - used for MSI prediction
  • Added PUTATIVE_DRIVER_MUTATION annotation tag (VCF and TSV)
    • Putative cancer driver mutation discovered by multiple approaches from 9,423 tumor exomes in TCGA. Format: symbol:hgvsp:ensembl_transcript_id:discovery_approaches
  • Added OPENTARGETS_DISEASE_ASSOCS annotation tag (VCF and TSV)
    • Associations between protein targets and disease based on multiple lines of evidence (mutations,affected pathways,GWAS, literature etc). Format: CUI:EFO_ID:IS_DIRECT:OVERALL_SCORE
  • Added OPENTARGETS_TRACTABILITY_COMPOUND annotation tag (VCF and TSV)
    • Confidence for the existence of a modulator (small molecule) that interacts with the target (protein) to elicit a desired biological effect
  • Added OPENTARGETS_TRACTABILITY_ANTIBODY annotation tag (VCF and TSV)
    • Confidence for the existence of a modulator (antibody) that interacts with the target (protein) to elicit a desired biological effect
  • Added CLINVAR_REVIEW_STATUS_STARS annotation tag
    • Rating of the ClinVar variant (0-4 stars) with respect to level of review
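
The tumor-only germline filters added in v0.8.0 (exclude_likely_hom_germline, exclude_likely_het_germline, exclude_pon) can be expressed as simple predicates. The Python sketch below follows the criteria as worded in the entries above. It is not PCGR's implementation: the Variant record and its field names are hypothetical, and treating the 0.4 to 0.6 interval as inclusive is an assumption.

```python
from dataclasses import dataclass


@dataclass
class Variant:
    """Hypothetical record holding only the fields the filters need."""
    tumor_vaf: float
    in_dbsnp: bool = False
    in_gnomad: bool = False
    somatic_in_cosmic_or_tcga: bool = False
    panel_of_normals: bool = False  # PANEL_OF_NORMALS flag present


def likely_hom_germline(v: Variant) -> bool:
    # Allelic fraction of 1 (100%): very unlikely to be a somatic event
    return v.tumor_vaf == 1.0


def likely_het_germline(v: Variant) -> bool:
    # Assumption: interval bounds are inclusive
    return (
        0.4 <= v.tumor_vaf <= 0.6
        and v.in_dbsnp
        and v.in_gnomad
        and not v.somatic_in_cosmic_or_tcga
    )


def keep_variant(v, exclude_likely_hom_germline=False,
                 exclude_likely_het_germline=False, exclude_pon=False):
    """Return False if any enabled filter flags the variant."""
    if exclude_likely_hom_germline and likely_hom_germline(v):
        return False
    if exclude_likely_het_germline and likely_het_germline(v):
        return False
    if exclude_pon and v.panel_of_normals:
        return False
    return True
```

A variant near 50% allelic fraction that is also recorded as somatic in COSMIC/TCGA is kept, which matches the third criterion of exclude_likely_het_germline.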
Removed
  • Original tier model ‘pcgr’

v0.7.0

  • Date: 2018-11-27
Fixed
  • Bug in assignment of variants to tier1/tier2 Issue #61
  • Missing config option for maf_gnomad_asj in TOML file (also setting operator to <=) Issue #60
  • Bug in new CancerMine oncogene/tumor suppressor annotation Issue #53
  • vcfanno fix for empty Description (upgrade to vcfanno v0.3.1 Issue #49)
  • Bug in message showing too few variants for MSI prediction, Issue #55
  • Bug in appending of custom VCF tags
    • Still unsolved: how to disambiguate identical FORMAT and INFO tags in vcf2tsv
  • Bug in SCNA value box display for multiple copy number hits (Issue #47)
  • Bug in vcf2tsv (handling INFO tags encoded with ‘Type = String’, Issue #39)
  • Bug in search of UniProt functional features (BED feature regions spanning exons are now handled)
  • Stripped off HTML elements (TCGA_FREQUENCY, DBSNP) in TSV output
  • Some effect predictions from dbNSFP were not properly parsed (e.g. multiple prediction entries from multiple transcript isoforms), these should now be retrieved correctly
  • Removed ‘COSM’ prefix in COSMIC mutation links
  • Bug in retrieval of splice site predictions from dbscSNV
Added
  • Possibility to run PCGR in a non-Docker environment (e.g. using the --no-docker option). Thanks to an excellent contribution by Vlad Saveliev, Issue #35
    • Added possibility to add docker user-id
  • Possibility for MAF file output (converted with vcf2maf), must be configured by the user in the TOML file (i.e. vcf2maf = true, Issue #17)
  • Possibility for adding custom VCF INFO tags to PCGR output files (JSON/TSV), must be configured by the user in the TOML file (i.e. custom_tags)
  • Added MUTATION_HOTSPOT_CANCERTYPE in data tables (i.e. listing tumor types in which hotspot mutations have been found)
  • Included the ‘rs’ prefix for dbSNP identifiers (HTML and TSV output)
  • Individual entries/columns for variant effect predictions:
    • Individual algorithms: SIFT_DBNSFP, M_CAP_DBNSFP, MUTPRED_DBNSFP, MUTATIONTASTER_DBNSFP, MUTATIONASSESSOR_DBNSFP, FATHMM_DBNSFP, FATHMM_MKL_DBNSFP, PROVEAN_DBNSFP
    • Ensemble predictions (META_LR_DBNSFP), dbscSNV splice site predictions (SPLICE_SITE_RF_DBNSFP, SPLICE_SITE_ADA_DBNSFP)
  • Upgraded samtools to v1.9 (makes vcf2maf work properly)
  • Added Ensembl gene/transcript id and corresponding RefSeq mRNA id to TSV/JSON
  • Added for future implementation:
    • SeqKat + karyoploteR for exploration of kataegis/hypermutation
    • CELLector - genomics-guided selection of cancer cell lines
  • Upgraded VEP to v94
Changed
  • Changed CANCER_MUTATION_HOTSPOT to MUTATION_HOTSPOT
  • Moved from TSGene 2.0 to CancerMine for annotation of tumor suppressor genes and proto-oncogenes
    • A minimum of n=3 citations were required to include literature-mined tumor suppressor genes and proto-oncogenes from CancerMine

v0.6.2.1

  • Date: 2018-05-14
Fixed
  • Bug in copy number annotation (broad/focal)

v0.6.2

  • Date: 2018-05-09
Fixed
  • Bug in copy number segment display (missing variable initialization, Issue #34)
  • Typo in gnomAD filter statistic (fraction, Issue #31)
  • Bug in mutational signature analysis for grch38 (forgot to pass BSgenome object, Issue #27)
  • Missing proper ASCII-encoding in vcf2tsv conversion, Issue #
  • Removed ‘Noncoding mutations’ section when no input VCF is present
  • Bug in annotation of copy number event type (focal/broad)
  • Bug in copy number annotation (missing protein-coding transcripts)
  • Updated MSI prediction (variable importance, performance measures)
Added
  • Genome assembly is appended to every output file
  • Issue warning for copy number segment that goes beyond chromosomal lengths of specified assembly (segments will be skipped)
  • Added missing subtypes for ‘Skin_Cancer_NOS’ in the cancer phenotype dataset

v0.6.1

  • Date: 2018-05-02
Fixed
  • Bug in tier assignment ‘pcgr_acmg’ (case for no variants in tier1,2,3)
  • Bug in tier assignment ‘pcgr_acmg’ (no tumor type specified, evidence items with weak support detected)
  • Bug: duplicated variants in ‘Tier 3’ resulting from genes encoded with dual roles as tumor suppressor genes/oncogenes
  • Bug: duplicated variants in ‘Tier 1/Noncoding variants’ resulting from rare cases of noncoding variants occurring in Tier 1 (synonymous variants with biomarker role)

v0.6.0

  • Date: 2018-04-25
Added
  • New argument in pcgr.py
    • assembly (grch37/grch38)
  • New option in pcgr.py
    • --basic - run comprehensive VCF annotation only, skip report generation and additional analyses
  • New sections in HTML report
    • Settings and annotation sources - now also listing key PCGR configuration settings
    • Main findings - Six value boxes indicating the main findings of clinical relevance
  • New configuration options
    • tier_model (string) - choice between pcgr_acmg and pcgr
    • mutational_burden - set TMB tertile limits
      • tmb_low_limit (float)
      • tmb_intermediate_limit (float)
    • tumor_type - choose between 34 tumor types/classes:
      • Adrenal_Gland_Cancer_NOS (logical)
      • Ampullary_Carcinoma_NOS (logical)
      • Biliary_Tract_Cancer_NOS (logical)
      • Bladder_Urinary_Tract_Cancer_NOS (logical)
      • Blood_Cancer_NOS (logical)
      • Bone_Cancer_NOS (logical)
      • Breast_Cancer_NOS (logical)
      • CNS_Brain_Cancer_NOS (logical)
      • Colorectal_Cancer_NOS (logical)
      • Cervical_Cancer_NOS (logical)
      • Esophageal_Stomach_Cancer_NOS (logical)
      • Head_And_Neck_Cancer_NOS (logical)
      • Hereditary_Cancer_NOS (logical)
      • Kidney_Cancer_NOS (logical)
      • Leukemia_NOS (logical)
      • Liver_Cancer_NOS (logical)
      • Lung_Cancer_NOS (logical)
      • Lymphoma_Hodgkin_NOS (logical)
      • Lymphoma_Non_Hodgkin_NOS (logical)
      • Ovarian_Fallopian_Tube_Cancer_NOS (logical)
      • Pancreatic_Cancer_NOS (logical)
      • Penile_Cancer_NOS (logical)
      • Peripheral_Nervous_System_Cancer_NOS (logical)
      • Peritoneal_Cancer_NOS (logical)
      • Pleural_Cancer_NOS (logical)
      • Prostate_Cancer_NOS (logical)
      • Skin_Cancer_NOS (logical)
      • Soft_Tissue_Cancer_NOS (logical)
      • Stomach_Cancer_NOS (logical)
      • Testicular_Cancer_NOS (logical)
      • Thymic_Cancer_NOS (logical)
      • Thyroid_Cancer_NOS (logical)
      • Uterine_Cancer_NOS (logical)
      • Vulvar_Vaginal_Cancer_NOS (logical)
    • mutational_signatures
      • mutsignatures_cutoff (float) - discard any signature contributions with a weight less than the cutoff
    • cna
      • transcript_cna_overlap (float) - minimum percent overlap between copy number segment and transcripts (average) for tumor suppressor gene/proto-oncogene to be reported
    • allelic_support
      • If input VCF has correctly formatted depth/allelic fraction as INFO tags, users can add thresholds on depth/support that are applied prior to report generation
        • tumor_dp_min (integer) - minimum sequencing depth for variant in tumor sample
        • tumor_af_min (float) - minimum allelic fraction for variant in tumor sample
        • normal_dp_min (integer) - minimum sequencing depth for variant in normal sample
        • normal_af_max (float) - maximum allelic fraction for variant in normal sample
    • visual
      • report_theme (string) - visual theme of report (Bootstrap)
    • other
      • vcf_validation (logical) - keep/skip VCF validation by vcf-validator
  • New output file - JSON output of HTML report content
  • New INFO tags of PCGR-annotated VCF
    • CANCER_PREDISPOSITION
    • PFAM_DOMAIN
    • TCGA_FREQUENCY
    • TCGA_PANCANCER_COUNT
    • ICGC_PCAWG_OCCURRENCE
    • ICGC_PCAWG_AFFECTED_DONORS
    • CLINVAR_MEDGEN_CUI
  • New column entries in annotated SNV/InDel TSV file:
    • CANCER_PREDISPOSITION
    • ICGC_PCAWG_OCCURRENCE
    • TCGA_FREQUENCY
  • New columns in CNA output
    • TRANSCRIPTS - aberration-overlapping transcripts (Ensembl transcript IDs)
    • MEAN_TRANSCRIPT_CNA_OVERLAP - Mean overlap (%) between gene transcripts and aberration segment
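
The MEAN_TRANSCRIPT_CNA_OVERLAP column and the transcript_cna_overlap threshold introduced in v0.6.0 both rest on a percent-overlap calculation between a copy number segment and gene transcripts. The Python sketch below shows one way to compute it. It is not PCGR's own code: the function names, the closed 1-based coordinates, and the choice to average over all supplied transcripts are assumptions.

```python
def overlap_pct(seg_start, seg_end, tx_start, tx_end):
    """Percent of a transcript covered by a segment.

    Assumption: closed, 1-based coordinates on the same chromosome.
    """
    overlap = min(seg_end, tx_end) - max(seg_start, tx_start) + 1
    tx_length = tx_end - tx_start + 1
    return 100.0 * max(overlap, 0) / tx_length


def mean_transcript_overlap(segment, transcripts):
    """Average the percent overlap over the supplied transcripts of a gene."""
    pcts = [overlap_pct(segment[0], segment[1], s, e) for s, e in transcripts]
    return sum(pcts) / len(pcts) if pcts else 0.0


def passes_overlap_threshold(segment, transcripts, transcript_cna_overlap):
    """True if the mean overlap reaches the configured minimum percent."""
    return mean_transcript_overlap(segment, transcripts) >= transcript_cna_overlap
```

For example, a segment spanning 100-200 fully covers a transcript at 100-199 and covers half of a transcript at 151-250, giving a mean overlap of 75%.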
Removed
  • Elements of databundle (now annotated directly through VEP):
    • dbsnp
    • gnomad/exac
    • 1000G project
  • INFO tags of PCGR-annotated VCF
    • DBSNPBUILDID
    • DBSNP_VALIDATION
    • DBSNP_SUBMISSIONS
    • DBSNP_MAPPINGSTATUS
    • GWAS_CATALOG_PMID
    • GWAS_CATALOG_TRAIT_URI
    • DOCM_DISEASE
  • Output files
    • TSV files with mutational signature results and biomarkers (i.e. sample_id.pcgr.snvs_indels.biomarkers.tsv and sample_id.pcgr.mutational_signatures.tsv)
      • Data can still be retrieved - now from the JSON dump
    • MAF file
      • The previous MAF output was generated in a custom fashion, a more accurate MAF output based on https://github.com/mskcc/vcf2maf will be incorporated in the next release
Changed
  • HTML report sections
    • Tier statistics and Variant statistics are now grouped into the section Tier and variant statistics
    • Tier 5 is now Noncoding mutations (i.e. not considered a tier per se)
    • Sliders for allelic fraction in the Global variant browser are now fixed from 0 to 1 (0.05 intervals)