This repository has been archived by the owner on Jun 20, 2024. It is now read-only.

Yf contaminate and estimate bid #384

Draft: yfarjoun wants to merge 13 commits into main from yf_contaminate_and_estimate_bid

@yfarjoun yfarjoun commented Dec 4, 2020

just a quick draft to get comments on my first attempt at using dagr.

@tfenne

/*
* The MIT License
*
* Copyright (c) 2016 Fulcrum Genomics LLC
Member

Suggested change
- * Copyright (c) 2016 Fulcrum Genomics LLC
+ * Copyright (c) 2020 Fulcrum Genomics LLC

* and run VBID with different number of target snps to evaluate
* dependence of VBID on contamination & depth & number of target snps
*/
@clp(description = "Example FASTQ to BAM pipeline.", group = classOf[Pipelines])
Member

Longer description? Also, not a FASTQ to BAM pipeline

Member

What are the output files it creates?

val tmpBam = out.resolve(prefix + ".tmp.bam")
val metricsPrefix: Some[DirPath] = Some(out.resolve(prefix))
Files.createDirectories(out)
val bamYield = new mutable.HashMap[PathToBam, Int]
Member

Is this thread-safe?
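
For context on the question above: `mutable.HashMap` is not safe for concurrent writes. Whether the map is actually touched from more than one thread in this pipeline is not established in this thread, so the following is only a minimal sketch of one alternative under that assumption. `BamYieldSketch`, `recordYield`, and the sample names are hypothetical, and plain `String` keys stand in for `PathToBam`.

```scala
import scala.collection.concurrent.TrieMap
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object BamYieldSketch {
  // TrieMap is a concurrent map from the Scala standard library;
  // individual updates are safe without external locking.
  val bamYield: TrieMap[String, Int] = TrieMap.empty

  // Hypothetical helper: record the yield for one BAM.
  def recordYield(bam: String, yieldValue: Int): Unit =
    bamYield.update(bam, yieldValue)

  def main(args: Array[String]): Unit = {
    // Simulate many concurrent writers, each with a distinct key.
    val writes = (1 to 100).map(i => Future(recordYield(s"sample$i.bam", i)))
    Await.result(Future.sequence(writes), 10.seconds)
    println(bamYield.size)
  }
}
```

If the map is only ever filled on a single thread while the pipeline is being built, the existing `mutable.HashMap` may be fine as it is.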

bams.map(bam => {
val metricPath: Path = out.resolve(prefix + bam.getFileName + ".qualityYieldMetrics")
val hsMetrics = new CollectHsMetrics(in = bam, ref = ref, targets = targets, prefix = Some(out.resolve(prefix)))

Member

extra newline?

val pairsOfBam = for (x <- bams; y <- bams) yield (x, y)

pairsOfBam.foreach { case (x, y) => {
contaminations.foreach(c => {
Member

Suggested change
- contaminations.foreach(c => {
+ contaminations.foreach { c =>

val bamYield = new mutable.HashMap[PathToBam, Int]

bams.map(bam => {
val metricPath: Path = out.resolve(prefix + bam.getFileName + ".qualityYieldMetrics")
Member

Can the suffix be the same one that CollectMultipleMetrics would produce? Then MultiQC can be used too!

Member

Why is this qualityYieldMetrics when the output is presumably from HsMetrics?

Comment on lines 103 to 130
val xyContaminatedBam = out.resolve(prefix + "__" + x.getFileName + "__" + y.getFileName +
"__" + c + ".bam")
Member

Why double underscores? Also, use string interpolation.

Suggested change
- val xyContaminatedBam = out.resolve(prefix + "__" + x.getFileName + "__" + y.getFileName +
- "__" + c + ".bam")
+ val xyContaminatedBam = out.resolve(f"${prefix}__${x.getFileName}__${y.getFileName}__${c}.bam")
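
To illustrate the interpolation point, here is a small standalone sketch. When no printf-style format specifiers are used, the plain `s` interpolator is enough. `ContaminatedBamName` and `contaminatedBamName` are hypothetical names, and treating `c` as a `Double` is an assumption, since its type is not visible in this diff.

```scala
import java.nio.file.{Path, Paths}

object ContaminatedBamName {
  // Hypothetical helper: builds the same file name as the concatenation
  // in the diff, using the `s` interpolator. The type of `c` is assumed.
  def contaminatedBamName(prefix: String, x: Path, y: Path, c: Double): String =
    s"${prefix}__${x.getFileName}__${y.getFileName}__${c}.bam"

  def main(args: Array[String]): Unit = {
    val name = contaminatedBamName(
      "run1", Paths.get("/data/a.bam"), Paths.get("/data/b.bam"), 0.05)
    println(name) // prints run1__a.bam__b.bam__0.05.bam
  }
}
```

Because `getFileName` keeps the extension, each input's `.bam` suffix ends up inside the generated name.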


root ==> downsample ==> mergedownsampled

depths filter (d => d <= resultantDepth) foreach { d => {
Member

Suggested change
- depths filter (d => d <= resultantDepth) foreach { d => {
+ depths.filter(_ <= resultantDepth).foreach { d =>


depths filter (d => d <= resultantDepth) foreach { d => {
val outDownsampled = out.resolve(prefix + "__" + x.getFileName + "__" + y.getFileName +
"__" + c + "__target__" + d + ".bam")
Member

ditto, double underscore

/*
* The MIT License
*
* Copyright (c) 2015 Fulcrum Genomics LLC
Member

license

@yfarjoun yfarjoun force-pushed the yf_contaminate_and_estimate_bid branch from 90e526b to f48ab5c Compare February 14, 2021 19:20