This pipeline evaluates the imputation performance and accuracy of different genotyping arrays, starting from sequence data.
It masks the non-tag variants for each array, and then imputes them against a reference panel using Minimac.
It is built using Nextflow, a workflow tool to run tasks across multiple compute infrastructures in a very portable manner.
It comes with Singularity containers, making installation trivial and results highly reproducible.
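
The masking-and-evaluation idea described above can be illustrated with a toy sketch. It is not the pipeline's actual implementation, and it rests on several assumptions: variants are held in a plain dict keyed by a hypothetical `chrom:pos` identifier, the helper names `mask_non_tag_variants` and `concordance` are made up for this example, and the imputation step itself (done by Minimac in the pipeline) is replaced by a hand-written dict of imputed genotypes. Genotype concordance is used here only as one simple accuracy measure; the pipeline may report other metrics.

```python
from typing import Dict, Set, Tuple

# Variant ID -> alternate-allele count (0, 1 or 2) for a single sample.
Genotypes = Dict[str, int]


def mask_non_tag_variants(
    truth: Genotypes, array_tags: Set[str]
) -> Tuple[Genotypes, Genotypes]:
    """Split sequence-derived genotypes into array-visible and masked sets.

    `kept` holds the variants the array would genotype directly (its tags);
    `masked` holds everything else, which would have to be imputed.
    """
    kept = {vid: gt for vid, gt in truth.items() if vid in array_tags}
    masked = {vid: gt for vid, gt in truth.items() if vid not in array_tags}
    return kept, masked


def concordance(masked_truth: Genotypes, imputed: Genotypes) -> float:
    """Fraction of masked variants whose imputed genotype matches the truth."""
    if not masked_truth:
        raise ValueError("no masked variants to evaluate")
    matches = sum(
        1 for vid, gt in masked_truth.items() if imputed.get(vid) == gt
    )
    return matches / len(masked_truth)


if __name__ == "__main__":
    # Toy data: four sequenced variants, two of which are tags on the array.
    truth = {"1:100": 0, "1:200": 1, "1:300": 2, "1:400": 1}
    array_tags = {"1:100", "1:300"}

    kept, masked = mask_non_tag_variants(truth, array_tags)

    # Made-up stand-in for imputation output (Minimac in the real pipeline).
    imputed = {"1:200": 1, "1:400": 0}

    print("array-visible:", sorted(kept))
    print("masked:", sorted(masked))
    print("concordance:", concordance(masked, imputed))
```

Because the masked genotypes are known from the sequence data, comparing them with the imputed values gives a direct accuracy estimate for each array.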
The evaluate_chips pipeline comes with documentation, found in the docs/ directory.
Running the pipeline requires:
- Nextflow (can be installed as local user)
  - NXF_HOME needs to be set, and must be in the PATH
  - Note that we've experienced problems running Nextflow when NXF_HOME is on an NFS mount.
  - The Nextflow script also needs to be invoked in a non-NFS folder
- Java 1.8+
- The compute nodes need to have Singularity installed.
- The compute nodes need access to shared storage for input, references, and output.
- If Singularity is unavailable, the following commands need to be available in PATH on the compute nodes.