bowtie2 subprocess using more CPU cores than allowed by -j option #365

leowill01 · 2024-01-24T23:00:14Z

I'm running multiple calls to breseq using GNU parallel, but I only allow each breseq call to use 1 CPU core with -j 1.
However, when looking at my process monitor, I see that whenever breseq calls a subprocess step for bowtie2, it uses more than 1 CPU core.

I've logged this as bowtie2 using 200% CPU (i.e. 2 cores) when -j 1 and 300% CPU (3 cores) when -j 2.
Interestingly, the breseq output shows that every call to bowtie2 is made with -p 1, so I'm not sure why it would be trying to use more than 1 core.
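
For anyone wanting to reproduce a figure like "200% CPU" outside a process monitor, here is a minimal sketch. It assumes a Unix system and that the percentage means CPU time divided by wall-clock time; the helper name `cpu_percent` is made up for illustration and is not part of breseq or bowtie2.

```python
import resource
import subprocess
import sys
import time


def cpu_percent(cmd):
    """Run cmd and return its CPU time as a percentage of wall-clock time.

    RUSAGE_CHILDREN covers the launched process plus any descendants it
    waited for, so subprocesses started by the command are included.
    A value near 100 means about one busy core, near 200 about two.
    """
    before = resource.getrusage(resource.RUSAGE_CHILDREN)
    start = time.perf_counter()
    subprocess.run(cmd, check=True)
    wall = time.perf_counter() - start
    after = resource.getrusage(resource.RUSAGE_CHILDREN)
    # User + system CPU seconds consumed while the command ran.
    cpu = (after.ru_utime - before.ru_utime) + (after.ru_stime - before.ru_stime)
    return 100.0 * cpu / wall


# Stand-in workload; swap in the real breseq command line to measure it.
demo_cmd = [sys.executable, "-c", "sum(range(10**6))"]
print(f"{cpu_percent(demo_cmd):.0f}% CPU")
```

A result well above 100% for a run started with -j 1 would match the observation above.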

This is causing problems when trying to efficiently schedule cores per job using parallel with my scripts, because I assume that 1 core = 1 job. When breseq/bowtie2 uses more than 1 core, it causes problems with CPU overhead and clogs up the threads.

Has anyone come across this before?


jeffreybarrick · 2024-01-25T01:34:42Z

I haven't noticed this, but I also haven't paid close attention.

It seems like there might be some discussion of something related over on the bowtie2 issues...

BenLangmead/bowtie2#62

leowill01 · 2024-01-25T02:20:30Z

Thanks for the find! It seems that issue has been open quite a while. Have there ever been any plans to incorporate a choice of aligner (e.g. opting to use bwa-mem2 instead of bowtie2)?
I'll keep testing to see if it's a problem stemming from elsewhere, like within parallel.

jeffreybarrick · 2024-01-25T02:50:48Z

It would be very difficult to substitute another aligner and get full breseq functionality.

In particular, the junction prediction steps require finding split-read matches, and breseq tracks all equivalent locations to which a read aligns. Not all aligners are good at doing these things. Most are optimized for finding the best match and/or randomly assign a read to one of several equivalent locations.

There is an option to use your own aligned SAM files of reads as input to breseq (--aligned-sam), in which case it will skip the alignment steps. But it can't call JC evidence in this mode, so you might as well use any other SNP / small indel calling program. So, I wouldn't recommend going down that road.

$ breseq -h
...
 --aligned-sam                     Input files are aligned SAM files, rather than FASTQ
                                   files. Junction prediction steps will be skipped. Be
                                   aware that breseq assumes: (1) Your SAM file is
                                   sorted such that all alignments for a given read are
                                   on consecutive lines. You can use 'samtools sort -n'
                                   if you are not sure that this is true for the output
                                   of your alignment program. (2) You EITHER have
                                   alignment scores as additional SAM fields with the
                                   form 'AS:i:n', where n is a positive integer and
                                   higher values indicate a better alignment OR it
                                   defaults to calculating an alignment score that is
                                   equal to the number of bases in the read minus the
                                   number of inserted bases, deleted bases, and soft
                                   clipped bases in the alignment to the reference. The
                                   default highly penalizes split-read matches (with
                                   CIGAR strings such as M35D303M65).
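
To make the default score described in the help text concrete, here is a small sketch of the arithmetic. It is a reading of the help text only, not breseq's actual implementation. It assumes standard SAM CIGAR notation (length before operation, e.g. 35M303D65M) and that "bases in the read" means the bases covered by M, I, S, = and X operations. The function name is hypothetical.

```python
import re

# One CIGAR element: a length followed by an operation code (SAM convention).
CIGAR_ELEMENT = re.compile(r"(\d+)([MIDNSHP=X])")
# Operations that consume bases of the read (query) sequence.
READ_CONSUMING = set("MIS=X")


def default_alignment_score(cigar):
    """Read length minus inserted, deleted, and soft-clipped bases."""
    read_length = inserted = deleted = soft_clipped = 0
    for length, op in CIGAR_ELEMENT.findall(cigar):
        n = int(length)
        if op in READ_CONSUMING:
            read_length += n
        if op == "I":
            inserted += n
        elif op == "D":
            deleted += n
        elif op == "S":
            soft_clipped += n
    return read_length - inserted - deleted - soft_clipped


print(default_alignment_score("100M"))        # full-length match
print(default_alignment_score("35M303D65M"))  # split read spanning a 303-base deletion
```

Under these assumptions a 100-base read that matches end to end scores 100, while the same read split across a 303-base deletion scores 100 - 303 = -203, which is why the default heavily penalizes split-read matches.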

I would have thought that disk read/write would be more limiting if you launch many breseq runs that hit the bowtie2 alignment step at the same time.

leowill01 · 2024-02-15T20:33:17Z

Opened an issue for bowtie2 here.

jeffreybarrick closed this as completed Jan 31, 2024
