Sfitz concat vcf #213

sorelfitzgibbon · 2023-07-27T21:52:31Z

Description

Add a BCFtools process to concatenate the consensus variants (those called by 2+ tools) into one VCF. The output header is a de-duplicated concatenation of all input headers. The output fields INFO, FORMAT, NORMAL, and TUMOR are taken from the first listed VCF that contains the variant.
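
To make the merge rule above concrete, here is a minimal Python sketch of the described behaviour. It is an illustration only, not the BCFtools implementation the pipeline uses. The function name `concat_first_wins`, the variant key (CHROM, POS, REF, ALT), and the simplified sort order (lexicographic chromosome, then numeric position) are assumptions made for this example.

```python
def concat_first_wins(vcf_texts):
    """Concatenate VCFs (given as strings) in their listed order.

    Hypothetical helper for illustration: '##' header lines are
    de-duplicated in first-seen order, and for each variant the record
    from the first VCF that contains it is kept.
    """
    header = []          # unique '##' meta lines, first-seen order
    seen_header = set()
    column_line = None   # the '#CHROM ...' line, taken from the first VCF
    records = {}         # (CHROM, POS, REF, ALT) -> full record line

    for text in vcf_texts:
        for line in text.splitlines():
            if not line:
                continue
            if line.startswith("##"):
                if line not in seen_header:
                    seen_header.add(line)
                    header.append(line)
            elif line.startswith("#"):
                if column_line is None:
                    column_line = line
            else:
                fields = line.split("\t")
                key = (fields[0], int(fields[1]), fields[3], fields[4])
                # setdefault keeps the record from the first listed VCF
                records.setdefault(key, line)

    # Simplified ordering (assumption): chromosome as a string, then position
    body = [records[key] for key in sorted(records)]
    columns = [column_line] if column_line is not None else []
    return header + columns + body
```

Calling `concat_first_wins([vcf_a, vcf_b])` with two VCF strings that share a variant returns a list of lines in which that variant carries the INFO and sample columns from `vcf_a`, because `vcf_a` is listed first.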

Testing Results

nftest run a_mini_n2-all-tools-std-input
log: /hot/software/pipeline/pipeline-call-sSNV/Nextflow/development/unreleased/sfitz-concat-vcf/log-nftest-20230810T214722Z.log
output: /hot/software/pipeline/pipeline-call-sSNV/Nextflow/development/unreleased/sfitz-concat-vcf/a_mini_n2-all-tools-std-input

Checklist

I have read the code review guidelines and the code review best practice on GitHub check-list.
I have reviewed the Nextflow pipeline standards.
The name of the branch is meaningful and well formatted following the standards, using [AD_username (or 5 letters of AD if AD is too long)]-[brief_description_of_branch].
I have set up or verified the branch protection rule following the github standards before opening this pull request.
I have added my name to the contributors listings in the manifest block in the nextflow.config as part of this pull request; I am listed already, or do not wish to be listed. (This acknowledgement is optional.)
I have added the changes included in this pull request to the CHANGELOG.md under the next release version or unreleased, and updated the date.
I have updated the version number in the metadata.yaml and manifest block of the nextflow.config file following semver, or the version number has already been updated. (Leave it unchecked if you are unsure about new version number and discuss it with the infrastructure team in this PR.)
I have tested the pipeline on at least one A-mini sample.

sorelfitzgibbon · 2023-07-27T23:28:48Z

Converting to a draft as I realized the output needs to be uncompressed for the next step and some checksums need to be added.

sorelfitzgibbon · 2023-07-28T00:38:25Z

module/intersect-processes.nf

+    publishDir path: "${params.workflow_output_dir}/intermediate/${task.process.split(':')[-1]}",
+        mode: "copy",
+        pattern: "*concat.vcf",
+        enabled: params.save_intermediate_files


This intermediate file will be used by vcf2maf, so it has to be uncompressed.

maotian06

One minor naming suggestion! Otherwise looks good to me!
Anything else, @yashpatel6?

main.nf

sorelfitzgibbon · 2023-08-03T20:17:02Z

config/F16.config

I haven't done any runs with large samples since adding plot_VennDiagram_R or concat_VCFs_BCFtools, so these values are just guesses. These two processes will run together, but only after everything else is done. I doubt they use much memory, so I don't think it matters much. The next PR (add maf) will add one more process and may be the last PR before release. With that in place, I can test with large samples and look at memory usage as well as which processes would use more CPUs.

module/intersect-processes.nf

yashpatel6

A couple of comments:

yashpatel6 · 2023-08-03T21:19:47Z

module/intersect-processes.nf

+    publishDir path: "${params.workflow_output_dir}/output",
+        mode: "copy",
+        pattern: "isec-1-or-more/*.txt"


Any reason this was moved here from the intersect process? Generally, we want to publish files from the process that generated them.

module/intersect-processes.nf

yashpatel6

Couple of minor edits to make but otherwise looks good!

main.nf

r-scripts/plot-venn.R

tyamaguchi-ucla

Looks good! I've added a few comments/questions.

tyamaguchi-ucla · 2023-08-04T01:04:33Z

config/F16.config

@@ -78,4 +78,24 @@ process {
            }
        }
    }
+    withName: plot_VennDiagram_R {
+        cpus = 2


Can VennDiagram take 2 CPUs?

sorelfitzgibbon · 2023-08-05

As I sort of mentioned in the PR description, these are just placeholders until I do a large BAM test run before the release. I will adjust these and the processes added in sfitz-add-maf at that time.

tyamaguchi-ucla · 2023-08-04T01:05:33Z

config/F16.config

+        }
+    }
+    withName: concat_VCFs_BCFtools {
+        cpus = 2


It looks like this process doesn't use 2 CPUs?

module/intersect-processes.nf

r-scripts/plot-venn.R

yashpatel6

Looks good! The resource allocations can be tuned after running the large sample in a future PR. Anything else to add, @tyamaguchi-ucla?

module/intersect.nf

yashpatel6

Looks good! Anything else to add, @tyamaguchi-ucla?

tyamaguchi-ucla

Looks good to me, although we might want to think about the structure under intersect-BCFtools-1.17 before the next release. Excellent work!

sorelfitzgibbon added 3 commits July 26, 2023 12:34

add concat VCF process. untested. need to add chksum

d64ad13

concat vcf works

c11fc68

update changelog

06c4254

sorelfitzgibbon marked this pull request as draft July 27, 2023 23:29

sorelfitzgibbon marked this pull request as ready for review July 28, 2023 00:35

uncompressed concat vcf output and chksums

998703c

sorelfitzgibbon commented Jul 28, 2023

sorelfitzgibbon requested review from yashpatel6 and maotian06 July 28, 2023 00:38

tyamaguchi-ucla assigned maotian06 Aug 1, 2023

maotian06 approved these changes Aug 2, 2023

outfile: Intersect -> Consensus

764d04f

tyamaguchi-ucla assigned yashpatel6 Aug 2, 2023

sorelfitzgibbon changed the base branch from sfitz-plot-intersections to main August 2, 2023 23:14

sorelfitzgibbon added 4 commits August 2, 2023 16:21

merge in main

9879500

fix merge mistake and update resource allocation

515c12a

use file.path() and update changelog

26fcfc9

mv output of isec-1-or-more/*.txt and rm log task numbers

8ba1b41

sorelfitzgibbon commented Aug 3, 2023

sorelfitzgibbon requested a review from tyamaguchi-ucla August 3, 2023 20:19

yashpatel6 reviewed Aug 3, 2023

isec-1-or-more output fix and Consensus -> BCFtools filenames

10817de

yashpatel6 approved these changes Aug 4, 2023

minor edits

9b4bc69

tyamaguchi-ucla reviewed Aug 4, 2023

sorelfitzgibbon added 3 commits August 7, 2023 14:30

sort vcfs into isec and concat

269d016

standardize isec output filenames

db0133c

fix standardized isec output and spacing in rscript

027f02c

sorelfitzgibbon requested a review from yashpatel6 August 8, 2023 17:43

sorelfitzgibbon requested a review from tyamaguchi-ucla August 8, 2023 17:43

fix standard filename output

18fe92b

yashpatel6 approved these changes Aug 9, 2023

sorelfitzgibbon marked this pull request as draft August 9, 2023 17:04

sorelfitzgibbon added 2 commits August 9, 2023 11:52

vcf sort fix in progress

836db28

vcf sort fixed

8816004

sorelfitzgibbon marked this pull request as ready for review August 10, 2023 20:27

sorelfitzgibbon requested a review from yashpatel6 August 10, 2023 20:31

yashpatel6 reviewed Aug 10, 2023

function for vcf sort

3fab4c2

yashpatel6 approved these changes Aug 10, 2023

tyamaguchi-ucla approved these changes Aug 10, 2023

sorelfitzgibbon mentioned this pull request Aug 10, 2023

think about structure under intersect-BCFtools-1.17 #221

Open

sorelfitzgibbon merged commit 0ceb6e8 into main Aug 10, 2023
1 check passed

sorelfitzgibbon deleted the sfitz-concat-vcf branch August 10, 2023 23:17
