Issue 162 add assembler qc report #167

Merged
1 change: 1 addition & 0 deletions .gitignore
@@ -10,3 +10,4 @@ testing*
.screenrc
eggnog
kofam/
eukulele/
6 changes: 5 additions & 1 deletion CITATIONS.md
@@ -45,7 +45,7 @@

- [Prodigal](https://github.com/hyattpd/Prodigal)

- [BBmap] (https://sourceforge.net/projects/bbmap/)
- [BBmap](https://sourceforge.net/projects/bbmap/)

- [FeatureCounts](https://subread.sourceforge.net)

@@ -73,8 +73,12 @@
- [EUKulele](https://github.com/AlexanderLabWHOI/EUKulele)

- [CAT](https://github.com/dutilh/CAT)

> von Meijenfeldt FAB, Arkhipova K, Cambuy DD, Coutinho FH, Dutilh BE. Robust taxonomic classification of uncharted microbial sequences and bins with CAT and BAT. Genome Biology. 2019;20:217.

- [transrate](https://hibberdlab.com/transrate/)
> TransRate: reference free quality assessment of de-novo transcriptome assemblies (2016). Richard D Smith-Unna, Chris Boursnell, Rob Patro, Julian M Hibberd, Steven Kelly. Genome Research doi: [http://dx.doi.org/10.1101/gr.196469.115](http://dx.doi.org/10.1101/gr.196469.115)

## Software packaging/containerisation tools

- [Anaconda](https://anaconda.com)
14 changes: 14 additions & 0 deletions assets/multiqc_config.yml
@@ -11,3 +11,17 @@ report_section_order:
order: -1002

export_plots: true

custom_data:
  megahit_assemblies:
    description: "Describes assembly statistics, generated by TransRate."
    plot_type: table
  rnaspades_assemblies:
    description: "Describes assembly statistics, generated by TransRate."
    plot_type: table

custom_plot_config:
  megahit_assemblies-plot:
    col1_header: "File Name"
  rnaspades_assemblies-plot:
    col1_header: "File Name"
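
The keys under `custom_data` appear to line up with the file names that the TRANSRATE module writes (`${prefix}_assemblies_mqc.csv`). The sketch below shows that naming relationship in plain shell. It assumes that `meta.id` is `megahit` (a hypothetical value) and that MultiQC derives the section ID by dropping the `_mqc.csv` suffix from the file name. Neither assumption is confirmed by this diff.

```shell
# Hypothetical value of meta.id; the diff does not show the real one.
prefix="megahit"

# File name produced by the module's final mv step.
report_file="${prefix}_assemblies_mqc.csv"

# Assumption: the MultiQC section ID is the file name minus "_mqc.csv".
section_id="${report_file%_mqc.csv}"

# Key expected under custom_data, and the matching custom_plot_config key.
echo "custom_data key:        ${section_id}"
echo "custom_plot_config key: ${section_id}-plot"
```

If the assembler-specific prefix changes, the keys in `multiqc_config.yml` would need to change with it.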
42 changes: 42 additions & 0 deletions modules/local/transrate.nf
@@ -0,0 +1,42 @@
process TRANSRATE {
    tag "$meta.id"
    label 'process_low'

    conda "bioconda::transrate=1.0.3"
    container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ?
        'https://depot.galaxyproject.org/singularity/transrate:1.0.3--hec16e2b_4':
        'biocontainers/transrate:1.0.3--hec16e2b_4' }"

    input:
    tuple val(meta), path(assembly)

    output:
    tuple val(meta), path("*assemblies_mqc.csv") , emit: assembly_qc
    path "versions.yml"                          , emit: versions

    when:
    task.ext.when == null || task.ext.when

    script:
    def args = task.ext.args ?: ''
    def prefix = task.ext.prefix ?: "${meta.id}"

    // transrate prints a warning about an outdated Ruby gem, so the version is written to a file first and then read into the YAML
    """
    gzip -d -c $assembly > assembly_unzipped.fa

    transrate \\
        --threads $task.cpus \\
        --assembly assembly_unzipped.fa \\
        --output ${prefix}_transrate \\
        $args

    mv ${prefix}_transrate/assemblies.csv ${prefix}_assemblies_mqc.csv

    transrate --version > version.txt
    cat <<-END_VERSIONS > versions.yml
    "${task.process}":
        transrate: \$(cat version.txt)
    END_VERSIONS
    """
}
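
To make the file handling in the script block easier to follow, here is a minimal shell sketch of the same three steps run outside Nextflow. It is an illustration under stated assumptions, not the module itself: the `prefix` value and the tiny FASTA are made up, and the TransRate call is replaced by a stand-in that writes a placeholder `assemblies.csv`, so the sketch runs without TransRate installed.

```shell
set -eu

# Work in a scratch directory so nothing outside it is touched.
workdir="$(mktemp -d)"
cd "$workdir"

# Hypothetical value of meta.id.
prefix="megahit"

# Tiny gzipped FASTA standing in for the real assembly.
printf '>contig_1\nACGTACGT\n' | gzip -c > "${prefix}.fa.gz"

# Step 1: decompress to a new file; -c leaves the compressed input in place.
gzip -d -c "${prefix}.fa.gz" > assembly_unzipped.fa

# Step 2 (stand-in, NOT real TransRate output): the module runs
#   transrate --assembly assembly_unzipped.fa --output ${prefix}_transrate
# here. The placeholder CSV below only mimics the output location.
mkdir -p "${prefix}_transrate"
printf 'assembly,n_seqs\nassembly_unzipped.fa,1\n' > "${prefix}_transrate/assemblies.csv"

# Step 3: rename so the file matches the "*assemblies_mqc.csv" output glob.
mv "${prefix}_transrate/assemblies.csv" "${prefix}_assemblies_mqc.csv"

ls "$workdir"
```

Decompressing to a fixed file name keeps the later command independent of how the input assembly is named.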
11 changes: 10 additions & 1 deletion workflows/metatdenovo.nf
@@ -112,6 +112,7 @@ include { FORMATSPADES } from '../modules/local/formatspades
include { UNPIGZ as UNPIGZ_CONTIGS } from '../modules/local/unpigz'
include { UNPIGZ as UNPIGZ_GFF } from '../modules/local/unpigz'
include { MERGE_TABLES } from '../modules/local/merge_summary_tables'
include { TRANSRATE } from '../modules/local/transrate'

//
// SUBWORKFLOW: Consisting of a mix of local and nf-core/modules
@@ -317,6 +318,13 @@ workflow METATDENOVO {
ch_versions = ch_versions.mix(MEGAHIT_INTERLEAVED.out.versions)
}

//
// MODULE: Use TransRate to judge assembly quality
//
TRANSRATE(ch_assembly_contigs)
ch_versions = ch_versions.mix(TRANSRATE.out.versions)


//
// Call ORFs
//
@@ -385,7 +393,7 @@ workflow METATDENOVO {
BAM_SORT_STATS_SAMTOOLS ( BBMAP_ALIGN.out.bam, ch_assembly_contigs )
ch_versions = ch_versions.mix(BAM_SORT_STATS_SAMTOOLS.out.versions)

// if ( orf_caller ==
// if ( orf_caller ==
BAM_SORT_STATS_SAMTOOLS.out.bam
.combine(ch_gff.map { it[1] } )
.set { ch_featurecounts }
@@ -526,6 +534,7 @@ workflow METATDENOVO {

ch_multiqc_files = ch_multiqc_files.mix(CUSTOM_DUMPSOFTWAREVERSIONS.out.mqc_yml.collect())
ch_multiqc_files = ch_multiqc_files.mix(FASTQC_TRIMGALORE.out.trim_zip.collect{it[1]}.ifEmpty([]))
ch_multiqc_files = ch_multiqc_files.mix(TRANSRATE.out.assembly_qc.collect{it[1]}.ifEmpty([]))
ch_multiqc_files = ch_multiqc_files.mix(BAM_SORT_STATS_SAMTOOLS.out.idxstats.collect{it[1]}.ifEmpty([]))
ch_multiqc_files = ch_multiqc_files.mix(FEATURECOUNTS_CDS.out.summary.collect{it[1]}.ifEmpty([]))
