Sfitz compress readcounts #203

Merged on Aug 30, 2023 (27 commits)
Changes from 3 commits
Commits
7b095bb
compress QC readcount file
sorelfitzgibbon Jul 16, 2023
d1a2055
typos
sorelfitzgibbon Jul 16, 2023
d597a37
gzip --stdout to avoid circular links
sorelfitzgibbon Jul 17, 2023
bdaf8d0
add compress_readcount process
sorelfitzgibbon Aug 11, 2023
0dac513
merge in accidental new branch
sorelfitzgibbon Aug 11, 2023
46b1f4e
Merge branch 'sfitz-update-readme' of github.com:uclahs-cds/pipeline-…
sorelfitzgibbon Aug 14, 2023
cfe24fd
Merge branch 'sfitz-update-readme' of github.com:uclahs-cds/pipeline-…
sorelfitzgibbon Aug 14, 2023
a3e0acd
change to bzip2 in progress
sorelfitzgibbon Aug 15, 2023
1276e98
gzip -> bzip2 still in progress
sorelfitzgibbon Aug 16, 2023
caecb59
fix blarchive docker typo
sorelfitzgibbon Aug 16, 2023
4f3afaf
deference file for bzip2
sorelfitzgibbon Aug 16, 2023
481e185
maf gzip to bzip2
sorelfitzgibbon Aug 16, 2023
b1cd3cd
finish changing maf compression to bzip2
sorelfitzgibbon Aug 16, 2023
d355f92
merge in sfitz-update-readme
sorelfitzgibbon Aug 16, 2023
30463a5
update changelog
sorelfitzgibbon Aug 16, 2023
368c305
move bzip2 process to common in progress
sorelfitzgibbon Aug 18, 2023
f559ae4
move bzip2 process to common still in progress
sorelfitzgibbon Aug 19, 2023
c027a26
Merge branch 'main' of github.com:uclahs-cds/pipeline-call-sSNV into …
sorelfitzgibbon Aug 19, 2023
6ac0ef1
in progress
sorelfitzgibbon Aug 19, 2023
fe6c3ad
bzip2 to common complete, pipeline may be hanging upon completion
sorelfitzgibbon Aug 20, 2023
3c71cbe
final bz2, fix log output dirs, indentation
sorelfitzgibbon Aug 20, 2023
3bca0be
update changelog
sorelfitzgibbon Aug 20, 2023
ee8b4a1
update to blarchive v2.0.0
sorelfitzgibbon Aug 23, 2023
204a08f
rm readcount from intermediate and add compress_file_blarchive to rea…
sorelfitzgibbon Aug 25, 2023
8d66e15
change readcount log folder name
sorelfitzgibbon Aug 28, 2023
c09c3c8
move compressed readcount output to intermediate
sorelfitzgibbon Aug 30, 2023
a1b797e
fix changelog
sorelfitzgibbon Aug 30, 2023
29 changes: 27 additions & 2 deletions module/somaticsniper-processes.nf
@@ -206,9 +206,10 @@ process create_ReadCountPosition_SomaticSniper {
 // Recommend to use the same mapping quality -q setting as SomaticSniper
 process generate_ReadCount_bam_readcount {
     container params.docker_image_bam_readcount
-    publishDir path: "${params.workflow_output_dir}/QC/${task.process.split(':')[-1]}",
+    publishDir path: "${params.workflow_output_dir}/intermediate/${task.process.split(':')[-1]}",
                mode: "copy",
-               pattern: "*.readcount"
+               pattern: "*.readcount",
+               enabled: params.save_intermediate_files
     publishDir path: "${params.workflow_log_output_dir}",
                mode: "copy",
                pattern: ".command.*",
@@ -268,6 +269,30 @@ process filter_FalsePositive_SomaticSniper {
     """
 }
 
+// After running fpfilter.pl above, readcount file can now be compressed
+process compress_readcount_SomaticSniper {
+    container params.docker_image_bam_readcount
+    publishDir path: "${params.workflow_output_dir}/QC/${task.process.split(':')[-1]}",
+               mode: "copy",
+               pattern: "*.readcount.gz"
+    publishDir path: "${params.workflow_log_output_dir}",
+               mode: "copy",
+               pattern: ".command.*",
+               saveAs: { "${task.process.split(':')[-1]}/log${file(it).getName()}" }
+
+    input:
+    path readcount_file
+
+    output:
+    path "*.readcount.gz"
+    path ".command.*"
+
+    """
+    set -euo pipefail
+    gzip --stdout $readcount_file > ${readcount_file}.gz
Contributor

We generally recommend bzip2, based on Yash's recent benchmark (https://github.com/uclahs-cds/tool-archive-data/discussions/25). We would also like to encourage lab members to use this package more. @yashpatel6

Contributor

Generally agree on bzip2!

The package can be used, but it doesn't have the autobuild action yet, so the Docker image is out of date at the moment (this will be fixed soon).

Contributor Author

Converting this to a draft until the bzip2 module is working.

+    """
+}
+
 // To obtain the "high confidence" set based on further filtering of the somatic score and mapping quality
 process call_HighConfidenceSNV_SomaticSniper {
     container params.docker_image_somaticsniper
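
The review thread above weighs gzip against bzip2 for the readcount file. One rough way to get a feel for the trade-off on your own data is to compare compressed sizes directly. The sketch below is only an illustration under stated assumptions: the sample rows are made up, `compressed_size` is a hypothetical helper, and it is neither the benchmark linked in the thread nor the blarchive-based compression that later commits in this PR refer to.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: print the compressed size in bytes of FILE for a given tool.
# Both gzip and bzip2 accept --stdout, so the input file is left untouched.
compressed_size() {
    local tool="$1" file="$2"
    "$tool" --stdout "$file" | wc -c | tr -d ' '
}

# Made-up stand-in for a readcount file (simplified, illustrative rows only).
workdir="$(mktemp -d)"
sample="$workdir/sample.readcount"
for i in $(seq 1 5000); do
    printf 'chr1\t%d\tA\t30\t=:0:0.00\tA:30:60.00\tC:0:0.00\n' "$i"
done > "$sample"

printf 'original\t%s bytes\n' "$(wc -c < "$sample" | tr -d ' ')"

# Report each compressor that is available on this machine.
for tool in gzip bzip2; do
    if command -v "$tool" >/dev/null 2>&1; then
        printf '%s\t%s bytes\n' "$tool" "$(compressed_size "$tool" "$sample")"
    else
        printf '%s\tnot installed, skipped\n' "$tool"
    fi
done
```

Size is only one axis; compression and decompression time also matter, and results on synthetic rows like these may not carry over to real readcount output.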
3 changes: 2 additions & 1 deletion module/somaticsniper.nf
@@ -1,4 +1,4 @@
-include { call_sSNV_SomaticSniper; convert_BAM2Pileup_SAMtools; create_IndelCandidate_SAMtools; apply_NormalIndelFilter_SomaticSniper; apply_TumorIndelFilter_SomaticSniper; create_ReadCountPosition_SomaticSniper; generate_ReadCount_bam_readcount; filter_FalsePositive_SomaticSniper; call_HighConfidenceSNV_SomaticSniper } from './somaticsniper-processes'
+include { call_sSNV_SomaticSniper; convert_BAM2Pileup_SAMtools; create_IndelCandidate_SAMtools; apply_NormalIndelFilter_SomaticSniper; apply_TumorIndelFilter_SomaticSniper; create_ReadCountPosition_SomaticSniper; generate_ReadCount_bam_readcount; filter_FalsePositive_SomaticSniper; compress_readcount_SomaticSniper; call_HighConfidenceSNV_SomaticSniper } from './somaticsniper-processes'
 include { rename_samples_BCFtools; generate_sha512sum } from './common'
 include { compress_index_VCF as compress_index_VCF_hc } from '../external/pipeline-Nextflow-module/modules/common/index_VCF_tabix/main.nf' addParams(
     options: [
@@ -50,6 +50,7 @@ workflow somaticsniper {
         create_ReadCountPosition_SomaticSniper(apply_TumorIndelFilter_SomaticSniper.out.vcf_tumor)
         generate_ReadCount_bam_readcount(params.reference,create_ReadCountPosition_SomaticSniper.out.snp_positions, tumor_bam, tumor_index)
         filter_FalsePositive_SomaticSniper(apply_TumorIndelFilter_SomaticSniper.out.vcf_tumor, generate_ReadCount_bam_readcount.out.readcount)
+        compress_readcount_SomaticSniper(generate_ReadCount_bam_readcount.out.readcount)
         call_HighConfidenceSNV_SomaticSniper(filter_FalsePositive_SomaticSniper.out.fp_pass)
         // rename_samples_BCFtools needs bgzipped input
         compress_index_VCF_hc(call_HighConfidenceSNV_SomaticSniper.out.hc
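
The `compress_readcount_SomaticSniper` step wired into the workflow above receives the readcount file as a staged input and writes its output with `gzip --stdout ... > file.gz` rather than compressing in place. Nextflow stages inputs as symbolic links by default, and gzip generally declines to compress a symlink in place, which may be what commit d597a37 ("gzip --stdout to avoid circular links") is working around; that reading is an assumption, not something the PR states. The sketch below mimics the situation outside Nextflow. The directory layout, file contents, and the `compress_via_stdout` helper are all hypothetical.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: compress FILE to FILE.gz through stdout redirection.
# Reading via --stdout follows a symlinked input and never modifies or removes it.
compress_via_stdout() {
    local input="$1"
    gzip --stdout "$input" > "${input}.gz"
}

# Mimic a staged input: the real file lives elsewhere and the task
# directory only holds a symlink to it (made-up paths and contents).
store="$(mktemp -d)"
task_dir="$(mktemp -d)"
printf 'chr1\t100\tA\t30\n' > "$store/sample.readcount"
ln -s "$store/sample.readcount" "$task_dir/sample.readcount"

compress_via_stdout "$task_dir/sample.readcount"

# The .gz is a regular file in the task directory; the symlink and its target are intact.
ls -l "$task_dir"
gzip --decompress --stdout "$task_dir/sample.readcount.gz"
```

Because the input is only read, a step written this way does not interfere with other processes that consume the same readcount file.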