Skip to contents

Quick Installation

Conda

  • If you know what conda is, you only need to run the following commands in order to install the PCGR software requirements:
PCGR_VERSION="1.4.1"
PCGR_REPO="https://raw.githubusercontent.com/sigven/pcgr/v${PCGR_VERSION}/conda/env/lock/"
PLATFORM="linux" # or "osx"

# mamba is a much faster alternative to conda
conda install mamba -c conda-forge

mamba create --file ${PCGR_REPO}/pcgr-${PLATFORM}-64.lock --prefix ./pcgr
mamba create --file ${PCGR_REPO}/pcgrr-${PLATFORM}-64.lock --prefix ./pcgrr

# you need to specify the directory of the conda env when using --prefix
conda activate ./pcgr

# test that it works
pcgr --version

Data

  • For downloading the data bundle, see STEP 1 further below.

Docker

  • If you know what Docker is, instead of using the above conda method you can jump straight to the PCGR Docker setup section.

Detailed Installation

PCGR requires a data bundle that contains the reference data, sample inputs (e.g. somatic variants in a VCF), and an output directory to output the results to.

Here’s an example scenario that will be used in the following sections:

  • data bundle downloaded in /Users/you/dir1/data;
  • sample inputs at /Users/you/dir2/pcgr_inputs;
  • output goes to /Users/you/dir3/pcgr_outputs (make sure this directory exists);
  • your PCGR codebase is installed in /Users/you/dir4/PCGR;

STEP 1: Download data bundle

Download and unpack the human assembly-specific data bundle:

GENOME="grch38" # or "grch37"
BUNDLE_VERSION="20220203"
BUNDLE="pcgr.databundle.${GENOME}.${BUNDLE_VERSION}.tgz"

wget http://insilico.hpc.uio.no/pcgr/${BUNDLE}
gzip -dc ${BUNDLE} | tar xvf -

STEP 2: Download PCGR GitHub repository

Download and unpack the latest software release from https://github.com/sigven/pcgr/releases.

Alternatively if you have git installed, you can do:

PCGR_VERSION="1.4.1"
OUTPUT_DIRECTORY="PCGR"

git clone \
    -b "v${PCGR_VERSION}" \
    --depth 1 \
    https://github.com/sigven/pcgr.git \
    "${OUTPUT_DIRECTORY}"

Note that the --depth 1 option ensures you only clone the specific version, and not the entire git history of the repo.

STEP 3: Set up Conda or Docker

Step 3 depends on if you want to use Conda or Docker:

Option 1: Conda

a) Miniconda and Mamba

  1. Download and install the Miniconda installer from https://docs.conda.io/en/latest/miniconda.html:
  • Make sure to download the Linux or MacOSX script according to which platform you’re currently on.
  • Run bash miniconda.sh and follow the prompts (it should be okay to accept the defaults, unless you want to choose a different installation location than the default ~/miniconda3).
  • Exit your current terminal session and open a new one. You should now notice something like a (base) string as a prefix in your terminal prompt. This means that you’re in the base conda environment, and you’re ready to start installing the conda environments for PCGR.
  1. Install Mamba in this base environment, which is a very fast conda package installer.
PLATFORM="MacOSX" # or "Linux"
MINICONDA_URL="https://repo.continuum.io/miniconda/Miniconda3-latest-${PLATFORM}-x86_64.sh"
wget ${MINICONDA_URL} -O miniconda.sh && chmod +x miniconda.sh
bash miniconda.sh
# exit terminal and open new one - you should now see:
(base) $
(base) $ conda install -c conda-forge mamba

(base) $ mamba --version
mamba 0.19.1
conda 4.11.0

b) Create PCGR conda environments

The conda/env/lock directory in the PCGR codebase contains two .lock files which can be used to create the required conda environments for the Python component (pcgr) and the R components (pcgrr (and cpsr)). We install the conda dependencies for these two environments in the local conda/env directory in the following example:

cd /Users/you/dir4/PCGR
PLATFORM="osx-64" # or "linux-64"
PCGR_CONDA_ENV_DIR="./conda/env"

mamba create --prefix ${PCGR_CONDA_ENV_DIR}/pcgr --file ${PCGR_CONDA_ENV_DIR}/lock/pcgr-${PLATFORM}.lock
mamba create --prefix ${PCGR_CONDA_ENV_DIR}/pcgrr --file ${PCGR_CONDA_ENV_DIR}/lock/pcgrr-${PLATFORM}.lock

## Alternatively, for installing in your central conda directory, use the following:
# mamba create --name pcgr --file ${PCGR_CONDA_ENV_DIR}/lock/pcgr-${PLATFORM}.lock
# mamba create --name pcgrr --file ${PCGR_CONDA_ENV_DIR}/lock/pcgrr-${PLATFORM}.lock

## For MacOS M1, you need to have 'CONDA_SUBDIR=osx-64' before the mamba command, i.e.:
# CONDA_SUBDIR=osx-64 mamba create --prefix [...] --file [...]
## See https://github.com/conda-forge/miniforge/issues/165#issuecomment-860233092

The above process takes 10-15min when installing from scratch. In the end, you can confirm your conda environments have been installed correctly (notice how the paths are different to the base env installation after using the --prefix option above):

$ (base) conda env list
# conda environments:
#
base    *  /Users/you/miniconda3
pcgr       /Users/you/dir4/PCGR/conda/env/pcgr
pcgrr      /Users/you/dir4/PCGR/conda/env/pcgrr

c) Activate pcgr conda environment

You need to activate the PCGR/conda/env/pcgr conda environment, and test that it works correctly with e.g. pcgr --version:

$ cd /Users/you/dir4/PCGR
(base) $ conda activate ./conda/env/pcgr
# note how the full path to the locally installed conda environment is now displayed

(/Users/you/dir4/PCGR) $ which pcgr
/Users/you/dir4/PCGR/conda/env/pcgr/bin/pcgr

(/Users/you/dir4/PCGR) $ pcgr --version
pcgr X.X.X

(/Users/you/dir4/PCGR) $ which pcgrr.R
/Users/you/dir4/PCGR/conda/env/pcgr/bin/pcgrr.R

You should now be all set up to run PCGR! Continue on to an example run.

Option 2: Docker

a) Install Docker

For installing Docker, follow the instructions at https://docs.docker.com/engine/install/ for your Linux or MacOSX machine. NOTE: We have not been able to perform enough testing on the Windows platform, and we have received feedback that particular versions of Docker/Windows do not work with PCGR (an example being mounting of data volumes).

  • Test that Docker is running, e.g. by typing docker ps or docker images in the terminal window.
  • Adjust the computing resources dedicated to the Docker, i.e.: Memory of minimum 5GB, CPUs minimum 4 (see e.g. how to do that on MacOSX).

b) Download PCGR Docker Image

  • Pull the PCGR Docker image from DockerHub (approx 5.7Gb) with: docker pull sigven/pcgr:vX.X.X

c) Run PCGR Docker Container

If you are familiar with working with Docker volumes (https://docs.docker.com/storage/volumes/) you can run PCGR using Docker instead of conda using the -v <host>:<container> Docker option. You’ll need to map your PCGR inputs to Docker container paths.

For example, say you have the input VCF sampleX.vcf.gz stored in the directory /Users/you/project1. You would need to supply Docker with a --volume (or -v) option mapping the directory of that VCF with a directory inside the Docker container, e.g. /home/input_vcf_dir. That would become: -v /Users/you/project1:/home/input_vcf_dir (note the : separating your directory from the container’s directory).

Then your command would look something like this:

docker container run -it --rm \
    -v /Users/you/dir1/data:/root/pcgr_data \
    -v /Users/you/dir2/pcgr_inputs:/root/pcgr_inputs \
    -v /Users/you/dir3/pcgr_outputs:/root/pcgr_outputs \
    sigven/pcgr:1.4.1 \
    pcgr \
      --input_vcf "/root/pcgr_inputs/tumor_sample.BRCA.vcf.gz" \
      --pcgr_dir "/root/pcgr_data" \
      --output_dir "/root/pcgr_outputs" \
      --genome_assembly "grch38" \
      --sample_id "SampleB" \
      --assay "WGS" \
      --vcf2maf
  • Note the path mappings. You’re using the container paths in the command, not the host (your machine’s) paths.