Foldmason not running on large set of proteins #11

rrw1007 · 2024-09-04T21:28:35Z

Expected Behavior

I am running foldmason with the command below:
easy-msa /workspace/protein/structs /workspace/results_foldmason/protein/result /workspace/results_foldmason/protein/tmpFolder --report-mode 1 --precluster --max-seq-len 4000

I have about 2000 proteins of approx length 280 amino acids.

Current Behavior

The run fails with what looks like a memory error: structuremsa dies with a segmentation fault (see the output below).

Steps to Reproduce (for bugs)

Just run easy-msa on a large set of sequences.

Foldseek Output (for bugs)

I get the output below (last few lines):

Size of the sequence database: 3588
Size of the alignment database: 3588
Number of clusters: 1487

Writing results 0h 0m 0s 0ms
Time for merging to clu: 0h 0m 0s 428ms
Time for processing: 0h 0m 36s 725ms
Error: structuremsa died
Segmentation fault (core dumped)

Context

The --max-seq-len parameter doesn't seem to make a difference. I'm still getting the memory error.

Your Environment

I've been running foldmason via the Docker image built from the Dockerfile in the repository. I am running on a Kubernetes cluster with 64 GB of RAM and 6 CPUs.
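To check whether a run like this is actually approaching the 64 GB limit, the peak memory of the child process can be recorded. The sketch below is a hypothetical helper, not part of FoldMason; the function name is made up for illustration, and it assumes a Linux host, where `ru_maxrss` is reported in kilobytes. As a general point, a process that exceeds a container memory limit on Linux is normally killed with SIGKILL, so a segmentation fault together with a low peak would hint at a crash rather than memory exhaustion.

```python
import resource
import subprocess
import sys


def run_and_report_peak(cmd):
    """Run cmd and return (returncode, peak resident memory in MiB).

    The peak is the largest resident set size among child processes that
    have been waited for so far, as reported by getrusage.
    """
    proc = subprocess.run(cmd)
    usage = resource.getrusage(resource.RUSAGE_CHILDREN)
    # Assumption: Linux, where ru_maxrss is in kilobytes (macOS uses bytes).
    peak_mib = usage.ru_maxrss / 1024
    return proc.returncode, peak_mib


if __name__ == "__main__":
    # Pass the full command after the script name, for example the
    # easy-msa command from this thread.
    code, peak = run_and_report_peak(sys.argv[1:])
    print(f"return code {code}, peak memory about {peak:.0f} MiB")
```

Because the figure covers every child waited for so far, it is most meaningful when the script launches only one command per invocation.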

gamcil · 2024-09-05T01:44:04Z

Do you also get the same behaviour with pre-clustering disabled (--precluster 0)?

rrw1007 · 2024-09-05T01:49:06Z

Yes, I see the same behavior with precluster disabled.

milot-mirdita · 2024-09-05T02:09:11Z

How did you build the container? I just realized that we have not been automatically building containers.

milot-mirdita · 2024-09-05T14:31:18Z

@gamcil committed a fix earlier today. Could you check if the issue is still happening for you? You can download precompiled binaries at https://mmseqs.com/foldmason.

rrw1007 · 2024-09-05T15:23:18Z

> How did you build the container? I just realized that we have not been automatically building containers.

I used the Dockerfile provided in the repository

rrw1007 · 2024-09-05T15:42:10Z

> @gamcil committed a fix earlier today. Could you check if the issue is still happening for you? You can download precompiled binaries at https://mmseqs.com/foldmason.

Running this now. It seems to have gone further than before. I removed --report-mode 1. Could that be causing the issue?

rrw1007 · 2024-09-05T15:56:36Z

> @gamcil committed a fix earlier today. Could you check if the issue is still happening for you? You can download precompiled binaries at https://mmseqs.com/foldmason.

> Running this now. It seems to have gone further than before. I removed --report-mode 1. Could that be causing the issue?

Confirmed: running the precompiled binary with the command
./foldmason easy-msa /workspace/protein/structs /workspace/results_foldmason/protein/result /workspace/results_foldmason/protein/tmpFolder --precluster
completed without any errors. So the issue might be the --report-mode 1 parameter.

gamcil · 2024-09-06T01:31:14Z

Were you getting segfaults also with --report-mode 1?

rrw1007 · 2024-09-06T15:33:01Z

> Were you getting segfaults also with --report-mode 1?

It seems like the problem is --report-mode 1. If I remove that and run the command, I don't get segfaults. If I include it, I get segfaults.
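One way to isolate a suspect flag like this is to run the same command with and without it and compare how each run ended. The sketch below is a hypothetical helper, not part of FoldMason: the function names and the `ab_test` output directory are assumptions made for illustration, while the binary, input path and flags are the ones quoted in this thread. It assumes a POSIX system, where Python reports a child killed by signal N as return code -N.

```python
import os
import signal
import subprocess


def describe_exit(returncode):
    """Turn a subprocess return code into a short human-readable label."""
    if returncode == 0:
        return "ok"
    if returncode < 0:
        # On POSIX, subprocess reports death by signal N as return code -N.
        try:
            return f"killed by {signal.Signals(-returncode).name}"
        except ValueError:
            return f"killed by signal {-returncode}"
    return f"exit code {returncode}"


def build_variants(binary, input_dir, out_root, common_flags, suspect_flag):
    """Build one command with the suspect flag and one without.

    Each variant gets its own result prefix and tmp folder (hypothetical
    layout) so the two runs cannot reuse each other's intermediate files.
    """
    variants = {}
    for label, extra in (("with", suspect_flag), ("without", [])):
        run_dir = os.path.join(out_root, label)
        variants[label] = [
            binary, "easy-msa", input_dir,
            os.path.join(run_dir, "result"),
            os.path.join(run_dir, "tmpFolder"),
            *common_flags, *extra,
        ]
    return variants


def compare(variants):
    """Run each variant in turn and record how it ended."""
    outcomes = {}
    for label, cmd in variants.items():
        # cmd[3] is the result prefix; make sure its directory exists.
        os.makedirs(os.path.dirname(cmd[3]), exist_ok=True)
        outcomes[label] = describe_exit(subprocess.run(cmd).returncode)
    return outcomes


if __name__ == "__main__":
    variants = build_variants(
        "./foldmason",
        "/workspace/protein/structs",
        "/workspace/results_foldmason/protein/ab_test",  # hypothetical path
        ["--precluster"],
        ["--report-mode", "1"],
    )
    for label, outcome in compare(variants).items():
        print(f"{label} --report-mode 1: {outcome}")
```

If the tool runs its steps as child processes, the top-level return code may be an ordinary non-zero value rather than a signal, so the log output is still worth reading alongside this summary.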

milot-mirdita · 2024-09-06T16:53:31Z

Would it be possible to share the inputs so that we can try to reproduce the new issue?

rrw1007 · 2024-09-06T22:10:34Z

> Would it be possible to share the inputs so that we can try to reproduce the new issue?

I tried it again with --report-mode 1 and it seems to be working. Thank you for the assistance. I'll reach out in case I face any problems.

rrw1007 changed the title ~~Foldmason not running on set of 2076 proteins~~ Foldmason not running on large set of proteins Sep 4, 2024
