Seeing some issues when working with very large data sets (~1000 samples or more, over 1M ASVs), where the simple text outputs take a long time. This primarily happens when prepping for QIIME2:

TADA/templates/GenerateSeqTables.R, line 29 in d0ee9aa:
# Generate OTU table output (rows = samples, cols = ASV)

or when generating a new seq table with the modified IDs:

TADA/templates/GenerateSeqTables.R, line 22 in d0ee9aa

One workaround is to simply generate the default outputs (seq tables and tax tables for phyloseq) and time out on the other outputs, but this will require splitting out those steps, which currently live in GenerateSeqTables.R and GenerateTaxTables.R.
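A minimal sketch of one way to speed up that export, assuming the slow step is a base write.table() call on the full samples x ASVs count matrix; the helper name, the `seqtab` object, and the output path below are illustrative, not the actual template code:

```r
library(data.table)

# Hypothetical drop-in for the slow plain-text export of the seq table.
# `seqtab` stands in for the samples x ASVs count matrix from dada2.
write_seqtab_fast <- function(seqtab, path) {
  dt <- as.data.table(seqtab, keep.rownames = TRUE)  # row names become column "rn"
  setnames(dt, "rn", "SampleID")
  # fwrite() is multithreaded and generally much faster than write.table()
  # on very wide tables (here on the order of 1M+ ASV columns).
  fwrite(dt, path, sep = "\t", quote = FALSE)
}

# write_seqtab_fast(seqtab, "seqtab_final.simple.txt")
```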
The main culprit is really the seq table and the number of samples. With the current run we have a matrix of 960 sample IDs x 1.3M ASVs (with counts). The tax table with 1.3M ASVs and seven ranks (KPCOFGS) is relatively fast.
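For a sense of scale, rough arithmetic on the numbers quoted above (assumed, not measured from the run):

```r
# Back-of-the-envelope size of the in-memory seq table described above.
n_samples <- 960
n_asvs    <- 1.3e6
bytes_int <- n_samples * n_asvs * 4  # dada2 count matrices are typically integer (4 bytes/cell)
bytes_dbl <- n_samples * n_asvs * 8  # roughly double that if coerced to numeric on the way out
sprintf("~%.0f GB as integer, ~%.0f GB as double", bytes_int / 1e9, bytes_dbl / 1e9)
#> "~5 GB as integer, ~10 GB as double"
```

so any per-cell formatting in the text export is touching on the order of a billion cells.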
There is a bit of redundancy in GenerateSeqTables.R that should also be addressed, namely that seqtab_final.txt and seqtab_final.simple.txt are the same file; this likely stems from some code rework we did when renaming ASVs. We can wait to address this until the split_denoise branch lands.
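A possible shape for the de-duplicated output, assuming the intent is that seqtab_final.txt keeps the full sequences as column headers while seqtab_final.simple.txt uses the renamed ASV IDs; this is an assumption about the intended behavior, and `seqtab` / `asv_ids` are placeholders for the template's objects:

```r
library(data.table)

# `seqtab` (samples x ASVs counts) and `asv_ids` ("ASV1", "ASV2", ...) are placeholders.
dt <- as.data.table(seqtab, keep.rownames = TRUE)
setnames(dt, "rn", "SampleID")
fwrite(dt, "seqtab_final.txt", sep = "\t", quote = FALSE)         # sequences as headers

setnames(dt, old = colnames(seqtab), new = asv_ids)               # rename columns in place
fwrite(dt, "seqtab_final.simple.txt", sep = "\t", quote = FALSE)  # ASV IDs as headers
```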