WXS/Targeted - intersect_regions param for call-sSNV in template.config #183
I think it's worth adding a section about WXS/targeted-seq runs to the README as well.
@Faizal-Eeman and I think it's worth considering having an |
I second with @sorelfitzgibbon it's better to have |
For non-WGS samples, what is the set of all pipelines that use intersect regions?
I think that's it, right?
I disagree with having a single parameter to handle intervals. Intervals serve different functions across the pipelines that use them, and the absence of intervals means different things in each. Because of that, users should be able to understand what the intervals mean for each individual pipeline and decide whether to set that parameter per pipeline. Additionally, a single set of intervals doesn't work because the WGS-mode defaults differ across pipelines, so the metapipeline would need extra logic that really belongs within the individual pipelines. I agree that the documentation needs to be updated (and it is in the process of being updated) to provide guidelines for the different metapipeline run modes (WGS, WXS, targeted, single sample, paired sample, multi sample).
@yashpatel6 can you point to a specific issue with using global intervals? From what I can discern, one set of global intervals would be good in the majority of cases (certainly for WGS). There could be pipeline-specific interval override options, if needed, for occasional cases where, e.g., someone wants to calculate coverage on the global interval regions but call variants on larger regions. But would global regions not be good even for most exome runs? Here's what I see for the set of pipelines used:
convert-BAM2FASTQ - we could extract reads using an interval BED but generally not ideal (losing off target reads).
call-mtSNV - Ideally, the pipeline should look for
call-sCNA - FACETS can call CNAs for exomes
One example: WGS mode for call-sSNV vs. WGS mode for call-gSNP and recalibrate-BAM. call-sSNV recommends using non-decoy intervals for WGS, while the others require the intervals to be left empty for default WGS mode. The metapipeline would therefore need more than one parameter to handle these cases.
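To make the mismatch concrete, here is a minimal Python sketch of the per-pipeline WGS defaults described in this comment. The function name, the mode strings, and the lookup table are hypothetical illustrations, not code from the metapipeline or the individual pipelines; only the BED file name comes from this issue.

```python
from typing import Optional

# Hypothetical illustration only: none of these identifiers exist in the pipelines.
NO_DECOY_BED = "Homo_sapiens_assembly38_no-decoy.bed.gz"

# WGS-mode expectations as described in the comment above.
WGS_DEFAULTS = {
    "call-sSNV": NO_DECOY_BED,   # non-decoy intervals recommended
    "call-gSNP": None,           # intervals left empty
    "recalibrate-BAM": None,     # intervals left empty
}


def resolve_intervals(pipeline: str, mode: str,
                      target_bed: Optional[str] = None) -> Optional[str]:
    """Return the intervals a pipeline would receive for a given run mode."""
    if mode == "WGS":
        # A single global value cannot satisfy all three entries at once.
        return WGS_DEFAULTS[pipeline]
    if target_bed is None:
        raise ValueError(f"{mode} runs need a target BED file")
    return target_bed
```

With this table, a WGS run resolves to the no-decoy BED for call-sSNV but to no intervals for call-gSNP and recalibrate-BAM, which is why one shared parameter would need extra per-pipeline logic.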
recalibrate-BAM does accept/use intervals
The pipeline does take intervals, and intervals passed to the metapipeline are passed on to the pipeline when given.
Not exactly; see the example above for the issue in WGS mode.
Agreed, the BAM2FASTQ pipeline isn't intended to work on intervals and I don't think it should either
If this type of logic were implemented, it should happen within call-mtSNV. I disagree with using something like intervals at the metapipeline level to forcefully run a pipeline that wasn't requested: that convolutes the whole pipeline selection process and adds confusion for end users when pipelines run automatically without being requested.
It can, but intervals aren't required for FACETS unless I'm mistaken.
Yup, I totally agree on this.
I believe removing the decoy regions would only be a plus for WGS call-gSNP and BQSR.
FACETS in call-sCNA v3.1.0 isn't set up for exomes but the tool does have an option to input target BED intervals. Something to add at call-sCNA level.
That may be, though I don't think removing decoy regions has actually been assessed in the context of BQSR/call-gSNP. Also, on a conceptual level, even if a read is mapped onto a decoy contig, there's no reason it should be excluded from base quality score recalibration, since base quality relates more to the actual sequencing quality than to mapping.
After some discussion, there will be more options in the metapipeline to control exome/targeted behavior, so I'll look into adding a dedicated params section for it. We'll still want to handle a couple of things before that: matching up WGS behavior between call-sSNV and the other pipelines (likely with an interval extraction process that automatically removes decoy contigs), and checking recalibrate-BAM/call-gSNP for any potential effects of removing decoy contigs.
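As a rough idea of what such an interval extraction step could do, here is a minimal Python sketch that drops decoy contigs from BED lines. It assumes decoy contigs can be recognized by a `_decoy` name suffix (as in contig names like `chrUn_KN707606v1_decoy`); that rule and the function name are assumptions for illustration, not the planned implementation.

```python
from typing import Iterable, Iterator


def drop_decoy_intervals(bed_lines: Iterable[str],
                         decoy_suffix: str = "_decoy") -> Iterator[str]:
    """Yield BED lines whose contig name does not end with decoy_suffix.

    Assumption: decoy contigs are identifiable by their name suffix.
    """
    for line in bed_lines:
        stripped = line.strip()
        # Pass blank lines and BED header/comment lines through unchanged.
        if not stripped or stripped.startswith(("#", "track", "browser")):
            yield line
            continue
        # The contig name is the first whitespace-separated field.
        contig = stripped.split(None, 1)[0]
        if not contig.endswith(decoy_suffix):
            yield line
```

Whether the filtered intervals are safe to use for recalibrate-BAM/call-gSNP is exactly the open question above; the sketch only shows the filtering step.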
For non-WGS samples (WXS or targeted), it is unclear whether the call-sSNV param intersect_regions SHOULD BE changed to the respective target BED file, given that the default is set to Homo_sapiens_assembly38_no-decoy.bed.gz. A comment in this section advising users to change the default path to the respective BED file might be good guidance.
metapipeline-DNA/config/template.config
Lines 90 to 98 in 59b9663
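If the answer turns out to be that non-WGS runs should point intersect_regions at their target BED, a guard along these lines could flag configs that still use the default. This is a minimal Python sketch: the function name and the sequencing-type strings are hypothetical, and it only illustrates the check, not how the metapipeline validates its parameters.

```python
import os
import warnings

# Default file name taken from this issue; everything else is hypothetical.
DEFAULT_BED = "Homo_sapiens_assembly38_no-decoy.bed.gz"


def check_intersect_regions(sequencing_type: str, intersect_regions: str) -> bool:
    """Return True if the setting looks consistent with the run type.

    Warns and returns False when a non-WGS run still uses the WGS default BED.
    """
    uses_default = os.path.basename(intersect_regions) == DEFAULT_BED
    if sequencing_type != "WGS" and uses_default:
        warnings.warn(
            f"{sequencing_type} run uses the WGS default {DEFAULT_BED}; "
            "consider setting intersect_regions to the target BED file."
        )
        return False
    return True
```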