This pipeline offers an end-to-end workflow for exome analysis using several toolchains:
- trimming with fastp
- duplicate marking using Samtools
- germline SNP/INDEL calling with DeepVariant, Strelka and/or GATK
- variant effect prediction with VEP and/or Haplosaurus
- protein-level effect prediction with Haplosaurus and BCFtools
- germline and somatic SV calling using Manta
- germline and somatic CNV calling using CNVkit
- repeat expansion detection using ExpansionHunter

The pipeline produces a multi-sample VCF file as well as a set of VCF files for each sample.
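
To check which samples ended up in a multi-sample VCF, the sample names can be read from the `#CHROM` header line, where they follow the nine fixed columns defined by the VCF format. The following is a minimal Python sketch, not part of the pipeline itself; the function name `vcf_sample_names` and the sample names `sampleA` and `sampleB` are hypothetical and used only for illustration.

```python
from typing import Iterable, List


def vcf_sample_names(lines: Iterable[str]) -> List[str]:
    """Return the sample names found in the #CHROM header line of a VCF."""
    for line in lines:
        if line.startswith("#CHROM"):
            columns = line.rstrip("\n").split("\t")
            # Columns 1-9 are fixed (CHROM .. FORMAT); sample columns follow.
            return columns[9:]
        if not line.startswith("#"):
            # First data record reached without a #CHROM line: stop looking.
            break
    return []


# Hypothetical two-sample header, kept in memory for illustration.
example_header = [
    "##fileformat=VCFv4.2\n",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tsampleA\tsampleB\n",
]
print(vcf_sample_names(example_header))
```

For a compressed file, the same function can be fed an open handle, for example one returned by `gzip.open(path, "rt")`, since bgzip-compressed files are readable as gzip.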
- What happens in this pipeline?
- Installation and configuration
- Running the pipeline
- Output
- Troubleshooting
- Developer guide
Benchmarking of release 4.3 against Genome in a Bottle (GIAB) NA12878

IDT xGEN, in-house:

| Variant | Recall | Precision |
|---|---|---|
| Indel | 0.92 | 0.98 |
| SNP | 0.994 | 0.999 |

Agilent v7, external:

| Variant | Recall | Precision |
|---|---|---|
| Indel | 0.93 | 0.97 |
| SNP | 0.99 | 0.998 |
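
Recall and precision are often summarised as a single F1 score, the harmonic mean of the two. The sketch below shows how that could be computed from the values in the tables above; the F1 score is not reported by the benchmark itself, and the names `f1_score` and `benchmarks` are illustrative assumptions.

```python
def f1_score(recall: float, precision: float) -> float:
    """Harmonic mean of recall and precision (0.0 if both are zero)."""
    if recall + precision == 0:
        return 0.0
    return 2 * recall * precision / (recall + precision)


# (recall, precision) pairs copied from the benchmark tables above.
benchmarks = {
    ("IDT xGEN, in-house", "Indel"): (0.92, 0.98),
    ("IDT xGEN, in-house", "SNP"): (0.994, 0.999),
    ("Agilent v7, external", "Indel"): (0.93, 0.97),
    ("Agilent v7, external", "SNP"): (0.99, 0.998),
}

for (kit, variant), (recall, precision) in benchmarks.items():
    print(f"{kit:22s} {variant:6s} F1 = {f1_score(recall, precision):.3f}")
```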