feat: update cnvkit pons #1465
Codecov Report
All modified and coverable lines are covered by tests ✅

Additional details and impacted files

Coverage Diff

| | deduplicate_with_umi | #1465 | +/- |
|---|---|---|---|
| Coverage | 99.48% | 99.49% | |
| Files | 40 | 40 | |
| Lines | 1957 | 1976 | +19 |
| Hits | 1947 | 1966 | +19 |
| Misses | 10 | 10 | |
## Added - padding of bed-regions for CNVkit to minimum 100 base
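A minimal sketch of what padding BED regions to a minimum size could look like, assuming 0-based half-open BED coordinates and symmetric extension around the region. The function names `pad_region` and `pad_bed_line` are hypothetical and are not taken from the BALSAMIC code base; the sketch does not cap regions at the chromosome end or merge regions that overlap after padding.

```python
from typing import Tuple

MIN_REGION_SIZE = 100  # minimum region size in bases, as stated in the changelog entry


def pad_region(start: int, end: int, min_size: int = MIN_REGION_SIZE) -> Tuple[int, int]:
    """Extend a 0-based, half-open interval symmetrically to at least min_size bases."""
    size = end - start
    if size >= min_size:
        return start, end
    missing = min_size - size
    left = missing // 2
    right = missing - left
    new_start = start - left
    if new_start < 0:
        # Hit the chromosome start: move the surplus padding to the right side.
        right += -new_start
        new_start = 0
    return new_start, end + right


def pad_bed_line(line: str, min_size: int = MIN_REGION_SIZE) -> str:
    """Pad one tab-separated BED line, leaving any extra columns untouched."""
    fields = line.rstrip("\n").split("\t")
    start, end = pad_region(int(fields[1]), int(fields[2]), min_size)
    fields[1], fields[2] = str(start), str(end)
    return "\t".join(fields)
```

Regions that already meet the minimum size pass through unchanged, so the function can be applied to every line of a target BED file.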
…SAMIC into update_cnvkit_pons
Looks great, neat implementation 🧙♂️ 🪄
BALSAMIC/snakemake_rules/variant_calling/somatic_cnv_tumor_normal_tga.rule
…ALSAMIC into deduplicate_with_umi
Quality Gate passed
This PR adds post-processing steps for CNVkit results from TGA to facilitate upload to GENS, which was previously only possible for WGS via post-processing of the GATK CollectReadCounts output. Because the gnomad VCF is also required to create the BAF visualisation track in GENS, the config and the GENS rule assignment have been modified so that these rules and references can be used in TGA as well. An additional small script was added to convert the CNVkit file tumor.merged.cnr into a GENS-accepted format at different resolutions.

#### Added
- Script to post-process CNVkit output to GENS-format
- DNAscope gnomad calling to TGA for GENS

#### Changed
- Parsing of GENS arguments changed to account for TGA
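The conversion of a CNVkit `.cnr` table to several resolutions could be sketched roughly as below. This is an illustration only, not the script added in the PR: the resolution labels and window sizes in `RESOLUTIONS`, the output line layout (`label_chromosome`, start, end, log2) and all function names are assumptions. The only things relied on from CNVkit are the `chromosome`, `start`, `end` and `log2` columns of the `.cnr` header.

```python
import csv
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# Hypothetical resolution labels and window sizes in bases (not taken from the PR).
RESOLUTIONS: Dict[str, int] = {"o": 100_000, "a": 25_000, "d": 100}


def read_cnr(handle: Iterable[str]) -> List[Tuple[str, int, float]]:
    """Return (chromosome, midpoint, log2) for every bin in a CNVkit .cnr table."""
    reader = csv.DictReader(handle, delimiter="\t")
    rows = []
    for row in reader:
        midpoint = (int(row["start"]) + int(row["end"])) // 2
        rows.append((row["chromosome"], midpoint, float(row["log2"])))
    return rows


def to_resolution(rows: List[Tuple[str, int, float]], label: str, window: int) -> List[str]:
    """Average the log2 values of bins whose midpoints fall in the same window."""
    grouped = defaultdict(list)
    for chrom, midpoint, log2 in rows:
        grouped[(chrom, midpoint // window)].append(log2)
    lines = []
    for (chrom, index), values in grouped.items():  # keeps input order
        centre = index * window + window // 2
        mean_log2 = sum(values) / len(values)
        lines.append(f"{label}_{chrom}\t{centre - 1}\t{centre}\t{mean_log2:.4f}")
    return lines


def cnr_to_gens_like(handle: Iterable[str]) -> List[str]:
    """Produce one block of output lines per assumed resolution."""
    rows = read_cnr(handle)
    out: List[str] = []
    for label, window in RESOLUTIONS.items():
        out.extend(to_resolution(rows, label, window))
    return out
```

A call such as `cnr_to_gens_like(open("tumor.merged.cnr"))` would return the lines for all assumed resolutions. The exact format that GENS accepts should be checked against the GENS documentation before reusing any of this.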
#### Added
- UMI extraction and deduplication to TGA workflow
- Adapter trimming of fastqs to UMI workflow
- Cap base quality in bam for Manta input

#### Changed
- Refactored multi workflow rule-files to separate files to decrease complexity
- Refactored output files to generally comply with the format {sample_type}.{sample_name}
- Replaced Picard QC tools with matching Sentieon QC tools

#### Removed
- UMI specific rules for UMI-extraction and alignment (using new TGA-rules instead)
- Fastq and UMI trimming command-line options

Merged this PR into this one: #1465

#### Added
- Added extension of target bed regions to a minimum size of 100 for CNV analysis
- PON for: Exome comprehensive 10.2
- PON for: GMSsolid 15.2
- PON for: GMCKsolid 4.2

#### Changed
- updated PON for GMCKSolid v4.1
- updated PON for GMSMyeloid v5.3
- updated PON for GMSlymphoid v7.3

Merged this PR into this one: #1448

#### Added
- Script to post-process CNVkit output to GENS-format
- DNAscope gnomad calling to TGA for GENS

#### Changed
- Parsing of GENS arguments changed to account for TGA

Merged this PR: #1475 into this one

#### Changed
- Refactored rules for bcftools filters
- Renamed final UMI bamfile to ensure hsmetrics are collected in multiqc json
- Changed ranked VCF from research to clinical
- Lowered min AF for TGA from 0.007 to 0.005
- Lowered maximal SOR for TNscope in TGA tumor only cases from 3 to 2.7
- Changed filter settings for research TNscope vcf, now either PASS or triallelic_site (fixing this issue: #1293)

#### Added
- TNscope for TGA workflows, merged with VarDict results
- New filter for VarDict for tumor in normal contamination
- Export TMP environment variables to rules that lack them
- Added genmod ranked VCFs to be delivered
- Added family-id to genmod in order to get ranked variants to Scout (solved this: #1045)
- Added DP and AF to INFO-field of TNscope vcfs for ranking model
- Raw TNscope calls and unfiltered research-annotated SNVs to delivery

#### Removed
- ML-model for TNscope is removed due to license issue with new version of Sentieon
- All code associated with TNhaplotyper
- Removed research.filtered.pass VCFs from delivery and storage list
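To make the lowered thresholds in the changelog above concrete, here is a small sketch of the two cut-offs as plain predicates. The changelog states that the real filters are applied with bcftools; the function names below are hypothetical, and treating both bounds as inclusive is an assumption.

```python
MIN_AF_TGA = 0.005                  # lowered from 0.007 according to the changelog
MAX_SOR_TNSCOPE_TUMOR_ONLY = 2.7    # lowered from 3 according to the changelog


def passes_min_af(af: float, min_af: float = MIN_AF_TGA) -> bool:
    """Keep variants whose allele frequency is at least the minimum (inclusive bound assumed)."""
    return af >= min_af


def passes_max_sor(sor: float, max_sor: float = MAX_SOR_TNSCOPE_TUMOR_ONLY) -> bool:
    """Keep TNscope tumor-only variants whose SOR does not exceed the maximum (inclusive bound assumed)."""
    return sor <= max_sor
```

With the previous thresholds, a variant at AF 0.006 would have been dropped and one with SOR 2.9 would have been kept; with the new values the outcomes are reversed.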
Description
This PR was blocked by this: https://github.com/Clinical-Genomics/target_capture_bed/issues/133
Now solved by this: #1469
Because the BAM files in the TGA workflows now use UMIs for removing duplicates, we will need to rebuild the PONs for all TGA workflows.
At the same time, there are reports of noisy results from the current GMSmyeloid PON. This may be because the PON was built using tumor samples, mixed with samples from an earlier version of the panel. While rebuilding the PON we might take the time to choose better samples, if any are available.
For the other panels it would also be good to re-evaluate which samples we have and can use, to make sure they are up to date with our methods.
Finally, we can check whether there are other panels for which we have enough samples to create a PON, such as the one mentioned here: #1460
Tasks:
Added a Google Sheet here: https://docs.google.com/spreadsheets/d/18vs_2MKk-IyByjGMqEptdSlfcn9mB9bbpx6DahCA_a4/edit?gid=1393363517#gid=1393363517
Added
Changed
Documentation
Tests
Feature Tests
Pipeline Integrity Tests
.hk file)
Clinical Genomics Stockholm
Documentation
Panel of Normal specific criteria
User Changes
Infrastructure Changes
Checklist
Important
Ensure that all checkboxes below are ticked before merging.
For Developers
For Reviewers
conditions where applicable, with satisfactory results.