Foldmason not running on large set of proteins #11
Comments
Do you also get the same behaviour with pre-clustering disabled (--precluster 0)? |
Yes, I see the same behavior with precluster disabled. |
How did you build the container? I just realized that we have not been automatically building containers. |
@gamcil committed a fix earlier today. Could you check if the issue is still happening for you? You can download precompiled binaries at https://mmseqs.com/foldmason. |
I used the Dockerfile provided in the repository |
Running this now. It seems to have gone further than before. I removed |
Confirmed that running with the precompiled binaries and using the command: |
Were you getting segfaults also with |
It seems like the problem is |
Would it be possible to share the inputs so that we can try to reproduce the new issue? |
I tried it again with --report-mode 1 and it seems to be working. Thank you for the assistance. I'll reach out in case I face any problems. |
Expected Behavior
I am running foldmason with the command below:
easy-msa /workspace/protein/structs /workspace/results_foldmason/protein/result /workspace/results_foldmason/protein/tmpFolder --report-mode 1 --precluster --max-seq-len 4000
I have about 2000 proteins, each approximately 280 amino acids long.
Current Behavior
I am getting memory errors.
Steps to Reproduce (for bugs)
Just run easy-msa on a large set of sequences.
Foldseek Output (for bugs)
I get the output below (last few lines):
Size of the sequence database: 3588
Size of the alignment database: 3588
Number of clusters: 1487
Writing results 0h 0m 0s 0ms
Time for merging to clu: 0h 0m 0s 428ms
Time for processing: 0h 0m 36s 725ms
Error: structuremsa died
Segmentation fault (core dumped)
Context
The --max-seq-len parameter doesn't seem to make a difference. I'm still getting the memory error.
Your Environment
I've been running foldmason via the Docker image built from the Dockerfile in the repository. I am running on a Kubernetes cluster and provide 64 GB of RAM and 6 CPUs.
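For anyone trying to reproduce this, here is a minimal Python sketch of a wrapper that runs easy-msa and reports whether the process died from a signal such as the segmentation fault above. It uses only flags quoted in this thread (--report-mode 1 and --precluster 0). The executable name `foldmason` on the PATH and the helper names `build_easy_msa_cmd`, `describe_exit`, and `run_easy_msa` are assumptions made for illustration; they are not part of the project.

```python
import signal
import subprocess


def build_easy_msa_cmd(structs_dir, result_prefix, tmp_dir):
    """Assemble an easy-msa command line with pre-clustering disabled.

    Assumption: the binary is reachable as `foldmason` on the PATH.
    """
    return [
        "foldmason", "easy-msa", structs_dir, result_prefix, tmp_dir,
        "--report-mode", "1",
        "--precluster", "0",
    ]


def describe_exit(returncode):
    """Turn a process return code into a short human-readable description."""
    if returncode == 0:
        return "ok"
    # subprocess reports death by signal N as -N; a shell reports it as 128 + N.
    if returncode < 0:
        signum = -returncode
    elif returncode > 128:
        signum = returncode - 128
    else:
        return f"exited with status {returncode}"
    try:
        name = signal.Signals(signum).name
    except ValueError:
        name = f"signal {signum}"
    return f"killed by {name}"


def run_easy_msa(structs_dir, result_prefix, tmp_dir):
    """Run easy-msa and return a description of how the process ended."""
    cmd = build_easy_msa_cmd(structs_dir, result_prefix, tmp_dir)
    completed = subprocess.run(cmd)
    return describe_exit(completed.returncode)
```

Calling `run_easy_msa` with the three paths from the command above would return "killed by SIGSEGV" if the crash reported here recurs, which makes it easier to tell a segmentation fault apart from an ordinary non-zero exit when the job runs unattended on a cluster.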