Container size effects? #12

Open
mittimithai opened this issue Sep 26, 2022 · 0 comments

I think there is some unfortunate silent behavior with different default container sizes. I assumed that, by now, most Docker installations have basesize set to 100GB, but after installing on my Mac through brew I think it is 10GB (querying the basesize differs across platforms and seems to be a bit challenging). When building the database for homo sapiens I get a no-space-left error:

/bin/cat: write error: No space left on device
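
One way to turn this into a loud failure would be a free-space check before the database build starts. This is only a sketch, not something the project currently does as far as I know: `ensure_free_space` is a hypothetical helper, and the threshold is a guess that would need tuning for the actual database size.

```python
import shutil


def ensure_free_space(path, min_free_gb):
    """Raise if the filesystem holding `path` has less than `min_free_gb` GiB free.

    Hypothetical helper: neither the name nor the threshold comes from the project.
    """
    free_gb = shutil.disk_usage(path).free / 1024**3
    if free_gb < min_free_gb:
        raise RuntimeError(
            f"only {free_gb:.1f} GiB free on {path!r}, "
            f"need at least {min_free_gb} GiB"
        )
    return free_gb
```

Calling something like `ensure_free_space("/", 50)` inside the container before the build would stop early with a clear message instead of a partial database (the `50` is an assumed value).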

That failure is fairly obvious, but when running end_to_end I've also seen different success rates with the 10GB and 100GB sizes (assuming I am querying basesize correctly):

-Container size 100G
Number of designs excluded because the maximum of designs per exon was exceeded = 7103
Number of designs excluded because their nucleotide composition was too invariable or contained TTTTT = 2611081
Number of designs excluded because they did not hit any exon = 24333814
Number of designs excluded because they did not hit any gene = 955244
Number of designs excluded because they hit multiple targets or none = 113366
Number of designs excluded because they were not located in a coding sequence = 855712
Number of designs that hit a specific target = 1231650
Number of successful designs = 1224547
Number of total possible designs = 30100867

130 genes are missing because of to harsh design criteria or because they were not found by CLD in the data base.

-Container size 10G (Success rate 44%)
Number of designs excluded because the maximum of designs per exon was exceeded = 2726
Number of designs excluded because their nucleotide composition was too invariable or contained TTTTT = 2611081
Number of designs excluded because they did not hit any exon = 24333814
Number of designs excluded because they did not hit any gene = 955244
Number of designs excluded because they hit multiple targets or none = 48413
Number of designs excluded because they were not located in a coding sequence = 855712
Number of designs that hit a specific target = 516467
Number of successful designs = 513741
Number of total possible designs = 30100867

3558 genes are missing because of to harsh design criteria or because they were not found by CLD in the data base.
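
To see exactly which counters differ between the two runs, the stats blocks above can be parsed and compared. This is a minimal sketch that assumes every counter is printed as `name = integer` on its own line, as in the output pasted here; `parse_stats` and `compare_stats` are hypothetical names, not part of the tool.

```python
import re

# Matches lines such as "Number of successful designs = 1224547".
STAT_LINE = re.compile(r"^(?P<name>.+?)\s*=\s*(?P<value>\d+)\s*$")


def parse_stats(text):
    """Collect every 'name = integer' line of a stats block into a dict."""
    stats = {}
    for line in text.splitlines():
        match = STAT_LINE.match(line.strip())
        if match:
            stats[match.group("name")] = int(match.group("value"))
    return stats


def compare_stats(a, b):
    """Return {name: (a_value, b_value)} for every counter that differs."""
    names = sorted(set(a) | set(b))
    return {n: (a.get(n), b.get(n)) for n in names if a.get(n) != b.get(n)}


if __name__ == "__main__":
    # Two counters from each run above, just to show the call pattern.
    run_100g = parse_stats(
        "Number of successful designs = 1224547\n"
        "Number of total possible designs = 30100867\n"
    )
    run_10g = parse_stats(
        "Number of successful designs = 513741\n"
        "Number of total possible designs = 30100867\n"
    )
    for name, (big, small) in compare_stats(run_100g, run_10g).items():
        print(f"{name}: {big} vs {small}")
```

Counters that are identical in both runs (such as the total number of possible designs) are dropped, so only the ones affected by the container size remain.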

This seems to be a silent failure that only shows up in the final stats?
