Long runtime with Circle Realign? #66

Open
hartlama opened this issue Mar 26, 2021 · 5 comments

Comments

@hartlama

Is there any way to speed up Circle Realign? I'm running with 10 processors, 15GB each, on a server, but I still have some iterations taking 1-3 hours. My sorted_read_candidates.bam file is 16GB. Any suggestions?

@iprada
Owner

iprada commented Mar 28, 2021

Can you share the exact command you are using and, if you are on a cluster, how you submit it?

best,

Inigo

@hartlama
Author

hartlama commented Mar 29, 2021

#SBATCH --job-name=realign
#SBATCH --mail-type=BEGIN,END,FAIL,REQUEUE
#SBATCH --nodes=1
#SBATCH --mem-per-cpu=15G
#SBATCH --time=14-00:00:00
#SBATCH --account=mariacas99
#SBATCH --partition=standard
#SBATCH --cpus-per-task=10

Circle-Map Realign -t 10 -v 3 -i sorted_circular_read_candidates_NPAH.bam -sbam NPAH_sorted_coord.bam -qbam NPAH_sorted_qname.bam -fasta /mm10/mm10.fa -o /NPAH_circle.bed

@iprada
Owner

iprada commented Mar 30, 2021

Dear hartlama,

I see. Is your data phi29 amplified?

If so, the first iterations tend to be slow because of the exponential amplification, and then it gets faster.

In the long run, I would like to make Circle-Map a bit faster, and I have already found the lines of code that make it perform poorly. But I am afraid this could take a very long time given the amount of work I have now.

best,

Inigo

@hartlama
Author

hartlama commented Mar 30, 2021

Hi Inigo,

It is amplified. Would it be safe to down-sample at one of the previous steps?

Thanks,
Molly
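
Editorial note: the thread does not answer whether down-sampling is safe for Circle-Map, so the following is only a sketch of how a subsample could be produced, not a recommendation. It assumes samtools is installed and uses its `view -s SEED.FRACTION` option, which applies the same keep-or-drop decision to both mates of a read pair. The output file name, the seed (42), and the fraction (25%) are illustrative assumptions. The script only prints the command so it can be inspected before running.

```shell
#!/usr/bin/env bash
# Sketch only: build a samtools subsampling command for a BAM file.
# Output name, seed and percentage are illustrative assumptions.

# Build the SEED.FRACTION value that `samtools view -s` expects,
# e.g. seed 42 and 25 percent -> 42.25 (percent is an integer from 1 to 99).
subsample_arg() {
    local seed=$1 percent=$2
    printf '%d.%02d' "$seed" "$percent"
}

# Print (do not run) the samtools command for an input BAM, an output BAM,
# a percentage to keep, and an optional seed (default 42).
subsample_cmd() {
    local in_bam=$1 out_bam=$2 percent=$3 seed=${4:-42}
    echo "samtools view -b -s $(subsample_arg "$seed" "$percent") -o $out_bam $in_bam"
}

# Example: keep roughly 25% of the read pairs from the query-name sorted BAM.
subsample_cmd NPAH_sorted_qname.bam NPAH_sub_qname.bam 25
```

If a subsample were used, the coordinate-sorted BAM and the read candidates would presumably need to be regenerated from the same subsampled file so that all inputs to Realign stay consistent. That is an assumption about the workflow, not something confirmed in this thread.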

@njaupan

njaupan commented Jun 28, 2021

Hi,
I ran into the same problem with Circle Realign. I tested it on a plant genome with a genome size of 17G. I ran the following command on a cluster with 96 CPUs and 496 GB of memory:

Circle-Map Realign -t 80 -i $out.bwa.qname.Circle-Map.sort.bam -qbam $out.bwa.qname.bam -sbam $out.bwa.bam -fasta $reference -o $out.bwa.qname.Circle-Map.out.bed

However, it seems to run forever once the progress bar reaches 100%. Any suggestions?

2021-06-27 23:28:37: Splitting clusters to to processors

32%|███▎ | 1300/4000 [04:35<08:22, 5.37it/s]
33%|███▎ | 1303/4000 [04:36<08:45, 5.13it/s]
43%|████▎ | 1713/4000 [06:02<07:24, 5.14it/s]
4000it [17:36, 3.78it/s]0 [17:36<00:00, 2.75it/s]
100%|██████████| 4000/4000 [17:36<00:00, 3.78it/s]

the best,
Panpan
