This is a Nextflow re-implementation of the original pipeline used by Computational and Systems Biology Group 5 (CSB5) at the Genome Institute of Singapore (GIS).
- Add customized HUMAnN2 to a conda channel
- Add nf-core style documentation
- Output description
- Installation
- Usage
- Reference databases
- The new DSL2 syntax for pipeline modularity and reusability
- Dockerfile for each software tool (all containers are available on Docker Hub)
- Conda recipe for each software/step
- Configuration for local execution (server), GIS HPC (using the SGE scheduler), AWS Batch, and an AWS auto-scaling cluster
- Nextflow
- Java Runtime Environment >= 1.8
- Fastp (>=0.20.0): Adapter trimming, low quality base trimming
- BWA (>=0.7.17): Host DNA removal
- Samtools (>=1.7): Host DNA removal
- Kraken2 (>=2.0.8-beta) + Bracken (>=2.5): Taxonomic profiling
- MetaPhlAn2 (>=2.7.7): Taxonomic profiling
- HUMAnN2 (>=2.8.1): Pathway analysis. The following two files are modified to read a SAM file from standard input (see Running HUMAnN2 with reduced disk storage):
  - humann2.py
  - search/nucleotide.py
- SRST2 (=0.2.0): Resistome profiling
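When running without the conda or Docker profiles, it can help to confirm that the tools listed above are reachable before launching. The following is a minimal sketch, not part of the pipeline. The executable names (for example `metaphlan2.py`, `humann2`, `srst2`) are assumptions based on each tool's usual command name and may differ in your installation. The check only tests for presence on `PATH` and does not verify versions.

```shell
# Print each command name from the arguments that is not found on PATH.
missing_tools() {
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            printf '%s\n' "$tool"
        fi
    done
}

# Assumed executable names for the dependencies listed above.
tools="nextflow java fastp bwa samtools kraken2 bracken metaphlan2.py humann2 srst2"

# Word splitting on $tools is intentional: one argument per tool name.
missing="$(missing_tools $tools)"
if [ -n "$missing" ]; then
    echo "Not found on PATH:" $missing
else
    echo "All listed tools were found on PATH."
fi
```

With the conda or Docker profiles the tools live inside the environment or container, so a missing tool on the host is expected in that case.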
- Pipeline parameters
- Run on GIS HPC cluster (SGE scheduler)
- Run on AWS configured for CSB5
- Run on NSCC
Run with docker
$ shotgunmetagenomics-nf/main.nf -profile docker --read_path PATH_TO_READS
Run on AWS Batch (see the AWS Batch configuration tutorial)
- IAM configuration (set environment variables AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, AWS_DEFAULT_REGION)
- Batch compute environment & job queue
- Customized AMI (AWS ECS optimized linux + awscli installed with miniconda)
$ shotgunmetagenomics-nf/main.nf -profile awsbatch --awsqueue AWSBATCH_QUEUE --awsregion AWS_REGION --bucket-dir S3_BUCKET --outdir S3_BUCKET
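The AWS Batch run depends on the three environment variables named in the IAM step. As a rough sketch, a small pre-flight check like the one below can report which of them are still unset before the pipeline is launched. The function name `missing_aws_vars` is hypothetical and not part of the pipeline.

```shell
# Print the name of every required AWS variable that is unset or empty.
missing_aws_vars() {
    for name in AWS_ACCESS_KEY_ID AWS_SECRET_ACCESS_KEY AWS_DEFAULT_REGION; do
        # Indirect lookup of the variable called $name (POSIX-compatible).
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            printf '%s\n' "$name"
        fi
    done
}

missing="$(missing_aws_vars)"
if [ -n "$missing" ]; then
    echo "Set these variables before launching:" $missing
else
    echo "AWS credentials and region are set; ready to launch with -profile awsbatch."
fi
```

The check prints only variable names, never their values, so it is safe to keep in a job log.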
You can specify multiple profiles separated by commas, e.g. -profile docker,test.
Run multiple profilers
$ shotgunmetagenomics-nf/main.nf -profile gis --profilers kraken2,metaphlan2 --read_path PATH_TO_READS
- Chng et al. Whole metagenome profiling reveals skin microbiome dependent susceptibility to atopic dermatitis flares. Nature Microbiology (2016)
- Nandi et al. Gut microbiome recovery after antibiotic usage is mediated by specific bacterial species. BioRxiv (2018)
- Chng et al. Cartography of opportunistic pathogens and antibiotic resistance genes in a tertiary hospital environment. BioRxiv (2019)
- Write a module and put it into modules/
- Add it to the main script main.nf
- Modify the configuration file conf/base.config to add the required resources (GIS users should also modify conf/gis.config for the specific conda environment)
- Add conda and Docker files for the new module
Chenhao Li: lich@gis.a-star.edu.sg, lichenhao.sg@gmail.com