Create GTDB DNA Taxonomy database in mmseq2 #884

Open
feixiang1209 opened this issue Aug 30, 2024 · 1 comment

Comments

@feixiang1209

I would like to create a GTDB DNA taxonomy database in MMseqs2. However, the manual only describes how to create the GTDB amino acid database. Could you please advise how I can create the DNA taxonomy database? Which files from GTDB should I download?

Thanks a lot

@feixiang1209
Author

Here is what I did:
(1) Downloaded gtdb_genomes_reps.tar.gz and combined all the fa files into one fa file (combined.fa).
(2) Downloaded ar53_taxonomy.tsv and bac120_taxonomy.tsv and combined them into one tsv (combined.tsv).
(3) mmseqs createdb combined.fa gtdb_seqs
(4) mmseqs createtaxdb gtdb_seqs tmp --tax-mapping-file combined.tsv --threads 50
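
Steps (1) and (2) amount to plain file concatenation. A minimal Python sketch of that is below. It assumes the archive has already been extracted and that the input files are uncompressed text. The function name `concatenate_files` and the paths in the usage note are hypothetical, not taken from the original post.

```python
from pathlib import Path


def concatenate_files(inputs, output):
    """Concatenate files into `output`, one after another.

    A newline is appended to any input that does not end with one, so the
    last line of one file never runs into the first line of the next.
    Empty inputs are skipped. Returns the number of files written.
    """
    written = 0
    with open(output, "wb") as out:
        for path in inputs:
            data = Path(path).read_bytes()  # one file at a time
            if not data:
                continue
            out.write(data)
            if not data.endswith(b"\n"):
                out.write(b"\n")
            written += 1
    return written
```

For example, `concatenate_files(["ar53_taxonomy.tsv", "bac120_taxonomy.tsv"], "combined.tsv")` would produce the combined table from step (2), and `concatenate_files(sorted(Path("gtdb_genomes_reps").rglob("*.fa")), "combined.fa")` the combined FASTA from step (1), where the directory name `gtdb_genomes_reps` is an assumption about where the archive was extracted.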

No errors were reported during the whole process. However, when I used "mmseqs taxonomy" to search a contig.fa file, all of the contigs were unclassified. Could you please advise what I did wrong?
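
One thing that may be worth ruling out is whether the sequence identifiers in combined.fa actually occur in the first column of combined.tsv. As far as I know, the GTDB taxonomy tables are keyed by genome accession and carry a lineage string in the second column, while the genome FASTA files contain one record per contig, and the MMseqs2 documentation describes the `--tax-mapping-file` input as sequence identifier plus numeric taxon ID. The sketch below only counts overlapping identifiers. It assumes the identifier is the first whitespace-delimited token of each FASTA header, which may differ from how MMseqs2 itself parses headers, and the function names are hypothetical.

```python
def fasta_ids(path):
    """Yield the first whitespace-delimited token of every FASTA header."""
    with open(path) as handle:
        for line in handle:
            if line.startswith(">"):
                fields = line[1:].split()
                if fields:
                    yield fields[0]


def mapping_ids(path):
    """Return the set of values in the first tab-separated column."""
    ids = set()
    with open(path) as handle:
        for line in handle:
            line = line.rstrip("\n")
            if line:
                ids.add(line.split("\t", 1)[0])
    return ids


def count_mapped(fasta_path, mapping_path):
    """Return (matched, total): FASTA records whose ID is in the mapping."""
    known = mapping_ids(mapping_path)
    matched = total = 0
    for seq_id in fasta_ids(fasta_path):
        total += 1
        if seq_id in known:
            matched += 1
    return matched, total
```

Calling `count_mapped("combined.fa", "combined.tsv")` and getting a `matched` count of zero would suggest that no sequence could be assigned a taxon from the mapping file, which would be consistent with every contig coming back unclassified. A non-zero count would point the search elsewhere.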

Thanks a lot
