Container size effects? #12

mittimithai · 2022-09-26T15:35:34Z

I think there is some unfortunate silent behavior that depends on the default container size. I had assumed that, by now, most Docker installations have basesize set to 100GB, but with Docker installed on my Mac through brew I think it is 10GB (querying the basesize works differently across platforms and seems to be a bit challenging). When building the database for homo sapiens I get a "no space left" error:

/bin/cat: write error: No space left on device

That failure is fairly obvious, but when running end_to_end I have also seen different success rates with the 10GB and 100GB sizes (assuming I am querying basesize correctly):

-Container size 100G
Number of designs excluded because the maximum of designs per exon was exceeded = 7103
Number of designs excluded because their nucleotide composition was too invariable or contained TTTTT = 2611081
Number of designs excluded because they did not hit any exon = 24333814
Number of designs excluded because they did not hit any gene = 955244
Number of designs excluded because they hit multiple targets or none = 113366
Number of designs excluded because they were not located in a coding sequence = 855712
Number of designs that hit a specific target = 1231650
Number of successful designs = 1224547
Number of total possible designs = 30100867
…
130 genes are missing because of to harsh design criteria or because they were not found by CLD in the data base.

-Container size 10G (Success rate 44%)
Number of designs excluded because the maximum of designs per exon was exceeded = 2726
Number of designs excluded because their nucleotide composition was too invariable or contained TTTTT = 2611081
Number of designs excluded because they did not hit any exon = 24333814
Number of designs excluded because they did not hit any gene = 955244
Number of designs excluded because they hit multiple targets or none = 48413
Number of designs excluded because they were not located in a coding sequence = 855712
Number of designs that hit a specific target = 516467
Number of successful designs = 513741
Number of total possible designs = 30100867
…
3558 genes are missing because of to harsh design criteria or because they were not found by CLD in the data base.

This seems to be a silent failure that only shows up in the final stats?
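One way to turn this kind of failure into a loud one would be to check free disk space inside the container before starting a run. The following is only a minimal sketch, not part of CLD: the function names `free_gib` and `require_free_space` and the `min_gib` threshold are hypothetical, and the amount of space a run actually needs is not known from the numbers above.

```python
import shutil


def free_gib(path="."):
    """Return the free space, in GiB, on the filesystem that contains `path`."""
    return shutil.disk_usage(path).free / 2**30


def require_free_space(path=".", min_gib=50.0):
    """Raise RuntimeError if fewer than `min_gib` GiB are free at `path`.

    `min_gib` is a hypothetical threshold chosen for illustration only.
    """
    available = free_gib(path)
    if available < min_gib:
        raise RuntimeError(
            f"Only {available:.1f} GiB free at {path!r}; "
            f"at least {min_gib:.1f} GiB requested."
        )
    return available


if __name__ == "__main__":
    # Report the free space in the current working directory without enforcing a limit.
    print(f"{require_free_space('.', min_gib=0.0):.1f} GiB free")
```

Run from the working directory inside the container, this reports the space available to the job, which sidesteps the platform-specific question of how to query basesize from the host.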
