gzip: user_assembly.prodigal/mets_full/diamond/*.out: No such file or directory #302
Comments
I think those are warnings that we also see and that can be ignored. (I think they refer to tools that it looks for but does not use when they are not found.)
My guess is that the download or the subsequent creation of the gtdb database failed. The default for the pipeline is to download to a directory called
[...]
I tried to download tax-table.txt manually and it worked. The resulting file is 8 MB.
[lucia-zifcakova@deigo-login2 eukulele]$ wget https://www.dropbox.com/s/3vwo1r7cbm3tn35/tax-table.txt?dl=1Time finished was 2024-10-28 15:22:00.853667
tax-table.txt?dl=1Time.1 100%[============================================================================>] 7.96M 28.3MB/s in 0.3s
2024-10-31 11:35:45 (28.3 MB/s) - ‘tax-table.txt?dl=1Time.1’ saved [8346567/8346567]
--2024-10-31 11:35:45-- http://finished/
I found links for the reference.pep.fa.gz file in the work directory where the EUKulele temporary files were written, and I used the first of the links listed there. The resulting .fa.gz file was 38 GB, and 97 GB once I unzipped it. The file that the pipeline downloaded and "unzipped" was 38 GB after unzipping, so it seems that the unzipping does not work: only the file name is changed, and the file itself stays compressed. I still have no idea where to get prot-map.json and the DIAMOND files from. Can you advise me on that?
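One way to check whether a downloaded file is still gzip-compressed, whatever its name says, is to look at its first two bytes: gzip files start with the bytes 1f 8b. The following is a minimal sketch, not part of the pipeline; `is_gzip` is a hypothetical helper name, and the path in the commented example is the one that appears in the log in this issue.

```shell
# Hypothetical helper: succeed if the file starts with the gzip magic bytes (1f 8b).
is_gzip() {
    # Read the first two bytes and print them as hex without separators.
    magic=$(head -c 2 "$1" | od -An -tx1 | tr -d ' \n')
    [ "$magic" = "1f8b" ]
}

# Example use (path from the log in this issue; adjust to your work directory):
# if is_gzip eukulele/gtdb/reference.pep.fa; then
#     echo "file is still gzip-compressed"
# fi
```

Running `gzip -t` on the file is an alternative check, but it reads the whole file, which is slow for a file of this size.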
I ran EUKulele within the Singularity container provided by the pipeline, and it seems that URI::Escape is missing from its Perl installation, which is causing problems with TransDecoder. This is how I ran EUKulele from inside the container:
EUKulele
This is the error message I received:
2024-11-01 09:04:08 (20.9 MB/s) - ‘TransDecoder-v5.5.0.tar.gz’ saved [15748671/15748671]
Can't locate URI/Escape.pm in @INC (you may need to install the URI::Escape module) (@INC contains: /flash/MillerU/Vibrio_first_paper_data/references_bins/TransDecoder/PerlLib /usr/lib64/perl5/lib /usr/local/lib/site_perl/5.26.2/x86_64-linux-thread-multi /usr/local/lib/site_perl/5.26.2 /usr/local/lib/5.26.2/x86_64-linux-thread-multi /usr/local/lib/5.26.2 .) at /flash/MillerU/Vibrio_first_paper_data/references_bins/TransDecoder/PerlLib/Gene_obj.pm line 15.
This is how I checked for the Perl module URI::Escape:
perl -MURI::Escape -e 'print "URI::Escape is installed\n"'
Can't locate URI/Escape.pm in @INC (you may need to install the URI::Escape module) (@INC contains: /usr/lib64/perl5/lib /usr/local/lib/site_perl/5.26.2/x86_64-linux-thread-multi /usr/local/lib/site_perl/5.26.2 /usr/local/lib/5.26.2/x86_64-linux-thread-multi /usr/local/lib/5.26.2 .).
See my reply in the other issue you opened. I'm closing this as both regard the same problem.
Description of the bug
Even though I run the pipeline from a writable location on an HPC cluster, it still complains about problems with installing EUKulele dependencies. I have checked, and DIAMOND is in the container depot.galaxyproject.org-singularity-eukulele-2.0.5--pyh723bec7_0.img (see attached picture).
How can I set the "MPLCONFIGDIR environment variable to a writable directory"? Will this solve the issue?
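As a minimal sketch of what the Matplotlib message asks for, the variable can be pointed at a freshly created writable directory before the run is launched. The use of `mktemp -d` is an assumption made here for illustration; any directory you can write to would do. The message is only a recommendation from Matplotlib, so this may silence it without affecting the failing EUKulele step, and whether the variable reaches the containerised task depends on how the cluster and Singularity are configured.

```shell
# Create a private, writable directory for Matplotlib's config and cache files.
# mktemp -d uses TMPDIR if it is set and /tmp otherwise (an assumption about
# where you are allowed to write; substitute your own path if needed).
MPLCONFIGDIR=$(mktemp -d)
export MPLCONFIGDIR

# Show the chosen location so it can be checked or reused.
echo "MPLCONFIGDIR=$MPLCONFIGDIR"
```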
Command used and terminal output
nextflow run nf-core/metatdenovo -r dev -c /flash/MillerU/Vibrio_first_paper_data/nextflow.config -resume '[soggy_northcutt]' -profile oist --save_trimmed true --assembly /flash/MillerU/Vibrio_first_paper_data/results/spades/transcripts.fasta --skip_eggnog --eukulele_db gtdb --outdir /flash/MillerU/Vibrio_first_paper_data/results/ --input /flash/MillerU/Vibrio_first_paper_data/samples_for_first_vibrio_paper.csv
Error executing process > 'NFCORE_METATDENOVO:METATDENOVO:SUB_EUKULELE:EUKULELE_SEARCH (user_assembly.prodigal)'
Caused by:
Process NFCORE_METATDENOVO:METATDENOVO:SUB_EUKULELE:EUKULELE_SEARCH (user_assembly.prodigal) terminated with an error exit status (1)
Command executed:
rc=0
mkdir contigs
gunzip -c user_assembly.prodigal.faa.gz > ./contigs/proteins.faa
EUKulele \
    -m mets \
    --database gtdb \
    --protein_extension .faa \
    --reference_dir eukulele \
    -o user_assembly.prodigal \
    --CPUs 12 \
    -s \
    contigs || rc=$?
gzip user_assembly.prodigal/mets_full/diamond/*.out
gzip user_assembly.prodigal/taxonomy_counts/*.csv
gzip user_assembly.prodigal/taxonomy_estimation/*.out
cat <<-END_VERSIONS > versions.yml
"NFCORE_METATDENOVO:METATDENOVO:SUB_EUKULELE:EUKULELE_SEARCH":
    eukulele: $(echo $(EUKulele --version 2>&1) | sed -n 's/.* \([0-9]\+\.[0-9]\+\.[0-9]\+\).*/\1/p')
END_VERSIONS
if [ $rc -le 1 ]; then
exit 0
else
exit $rc;
fi
Command exit status:
1
Command output:
All reference files for GTDB downloaded to eukulele/gtdb
Running EUKulele with command line arguments, as no valid configuration file was provided.
Setting things up...
Could not successfully install all external dependent software.
Check DIAMOND, BLAST, BUSCO, and TransDecoder installation.
['proteins']
Specified reference directory, reference FASTA, and protein map/taxonomy table not found. Using database in location: eukulele/gtdb.
Automatically downloading database gtdb . If you intended to use an existing database folder, be sure a reference FASTA, protein map, and taxonomy table are provided. Check the documentation for details.
Command error:
5900K .......... .......... .......... .......... .......... 72% 24.5M 0s
5950K .......... .......... .......... .......... .......... 73% 34.4M 0s
6000K .......... .......... .......... .......... .......... 74% 18.1M 0s
6050K .......... .......... .......... .......... .......... 74% 35.1M 0s
6100K .......... .......... .......... .......... .......... 75% 772K 0s
6150K .......... .......... .......... .......... .......... 76% 29.2M 0s
6200K .......... .......... .......... .......... .......... 76% 43.1M 0s
6250K .......... .......... .......... .......... .......... 77% 60.4M 0s
6300K .......... .......... .......... .......... .......... 77% 31.3M 0s
6350K .......... .......... .......... .......... .......... 78% 41.1M 0s
6400K .......... .......... .......... .......... .......... 79% 30.4M 0s
6450K .......... .......... .......... .......... .......... 79% 35.7M 0s
6500K .......... .......... .......... .......... .......... 80% 39.3M 0s
6550K .......... .......... .......... .......... .......... 80% 80.5M 0s
6600K .......... .......... .......... .......... .......... 81% 31.4M 0s
6650K .......... .......... .......... .......... .......... 82% 35.4M 0s
6700K .......... .......... .......... .......... .......... 82% 51.5M 0s
6750K .......... .......... .......... .......... .......... 83% 39.6M 0s
6800K .......... .......... .......... .......... .......... 84% 33.2M 0s
6850K .......... .......... .......... .......... .......... 84% 58.2M 0s
6900K .......... .......... .......... .......... .......... 85% 33.3M 0s
6950K .......... .......... .......... .......... .......... 85% 51.1M 0s
7000K .......... .......... .......... .......... .......... 86% 66.1M 0s
7050K .......... .......... .......... .......... .......... 87% 34.5M 0s
7100K .......... .......... .......... .......... .......... 87% 34.8M 0s
7150K .......... .......... .......... .......... .......... 88% 42.5M 0s
7200K .......... .......... .......... .......... .......... 88% 50.6M 0s
7250K .......... .......... .......... .......... .......... 89% 64.2M 0s
7300K .......... .......... .......... .......... .......... 90% 35.7M 0s
7350K .......... .......... .......... .......... .......... 90% 37.8M 0s
7400K .......... .......... .......... .......... .......... 91% 40.2M 0s
7450K .......... .......... .......... .......... .......... 92% 43.3M 0s
7500K .......... .......... .......... .......... .......... 92% 93.6M 0s
7550K .......... .......... .......... .......... .......... 93% 35.3M 0s
7600K .......... .......... .......... .......... .......... 93% 36.4M 0s
7650K .......... .......... .......... .......... .......... 94% 42.4M 0s
7700K .......... .......... .......... .......... .......... 95% 47.2M 0s
7750K .......... .......... .......... .......... .......... 95% 60.5M 0s
7800K .......... .......... .......... .......... .......... 96% 56.4M 0s
7850K .......... .......... .......... .......... .......... 96% 36.5M 0s
7900K .......... .......... .......... .......... .......... 97% 43.9M 0s
7950K .......... .......... .......... .......... .......... 98% 50.7M 0s
8000K .......... .......... .......... .......... .......... 98% 68.0M 0s
8050K .......... .......... .......... .......... .......... 99% 44.1M 0s
8100K .......... .......... .......... .......... .......... 99% 24.5M 0s
8150K 100% 1.76T=1.4s
2024-10-30 16:16:54 (5.74 MB/s) - ‘eukulele/gtdb/taxonomy-table.txt’ saved [8346567/8346567]
gzip: user_assembly.prodigal/mets_full/diamond/*.out: No such file or directory
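The gzip message above is what the shell produces when a glob matches nothing: the pattern is passed to gzip literally, and gzip reports that no file of that name exists. It therefore indicates that EUKulele wrote no DIAMOND output, not that gzip itself is broken. The following is a minimal sketch of a guard that skips unmatched patterns; `gzip_if_present` is a hypothetical helper and not something the pipeline provides.

```shell
# Hypothetical helper: gzip each argument only if it exists as a file.
gzip_if_present() {
    for f in "$@"; do
        # An unmatched glob arrives here as the literal pattern, which does not
        # exist on disk, so it is skipped instead of producing an error.
        if [ -e "$f" ]; then
            gzip "$f"
        fi
    done
}

# Example use with the path from this issue:
# gzip_if_present user_assembly.prodigal/mets_full/diamond/*.out
```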
Work dir:
/flash/MillerU/Vibrio_first_paper_data/work/dd/fdbeda7dbd16acff65d2300e339a5d
Container:
/flash/MillerU/Vibrio_first_paper_data/work/singularity/depot.galaxyproject.org-singularity-eukulele-2.0.5--pyh723bec7_0.img
Tip: when you have fixed the problem you can continue the execution adding the option -resume to the run command line
Relevant files
.command.err file:
Matplotlib created a temporary config/cache directory at /scratch/matplotlib-m8atu3f8 because the default path (/home/l/lucia-zifcakova/.config/matplotlib) is not a writable directory; it is highly recommended to set the MPLCONFIGDIR environment variable to a writable directory, in particular to speed up the import of Matplotlib and to better support multiprocessing.
--2024-10-30 15:50:07-- https://www.dropbox.com/s/dh839ah2hu0m2r4/reference.pep.fa.gz?dl=1
Resolving www.dropbox.com (www.dropbox.com)... 162.125.80.18, 2620:100:6035:18::a27d:5512
Connecting to www.dropbox.com (www.dropbox.com)|162.125.80.18|:443... connected.
HTTP request sent, awaiting response... 302 Found
Location: https://www.dropbox.com/scl/fi/qg9klsgyas9q7goc816n7/reference.pep.fa.gz?rlkey=v1l0emh7u68afz5apd0yw74wj&dl=1 [following]
--2024-10-30 15:50:07-- https://www.dropbox.com/scl/fi/qg9klsgyas9q7goc816n7/reference.pep.fa.gz?rlkey=v1l0emh7u68afz5apd0yw74wj&dl=1
Reusing existing connection to www.dropbox.com:443.
HTTP request sent, awaiting response... 302 Found
Location: https://ucf741995026801e6bf2d677002c.dl.dropboxusercontent.com/cd/0/inline/CdYlIGQAsYZIuMKiX3Rxd5O7vj6jDHu4WeUhTcXg9_wiemd1IogI53uFWRV6RAzldGm9Ro2OaNz_EAb9O9MX4i6mq2Qw1C0MyxbhsuGpNNmySpy3tYsgHi7lm6vG-bSoreHRyIY_6ve8AzvHHdvPNDU7/file?dl=1# [following]
--2024-10-30 15:50:08-- https://ucf741995026801e6bf2d677002c.dl.dropboxusercontent.com/cd/0/inline/CdYlIGQAsYZIuMKiX3Rxd5O7vj6jDHu4WeUhTcXg9_wiemd1IogI53uFWRV6RAzldGm9Ro2OaNz_EAb9O9MX4i6mq2Qw1C0MyxbhsuGpNNmySpy3tYsgHi7lm6vG-bSoreHRyIY_6ve8AzvHHdvPNDU7/file?dl=1
Resolving ucf741995026801e6bf2d677002c.dl.dropboxusercontent.com (ucf741995026801e6bf2d677002c.dl.dropboxusercontent.com)... 162.125.80.15, 2620:100:6035:15::a27d:550f
Connecting to ucf741995026801e6bf2d677002c.dl.dropboxusercontent.com (ucf741995026801e6bf2d677002c.dl.dropboxusercontent.com)|162.125.80.15|:443... connected.
HTTP request sent, awaiting response... 302 Found
Location: /cd/0/inline2/CdY63h-sUe-n6-gdA3xNrO0_dNRqWPRdGKaMqbW6x9Og961Z9mtFv2dgSCt-d2LLiYw1TRwhzK69k5l4cuYZV4eckCkpxSHt5g5z-M29wOP64Liqr9CIqkbzbL4vnGFn7OksrLkmksY_FX815U8lw-FiljbhSRy6qQVsv5EOaC8XZR6oPzl8_9ggdsafA4vsR7QXNs6PCsUvjl7_8_T7aock4dQyIS8HAw92bW2yplcU5Yu88gDxBYUjvroWa_B2r-5eZ6MF60OF2S5awuxj7tpgt6bruv8zIFzXyl6VSpLjPqZuSpKbKvTSRyPsDMkGpVPC-qbDKlPXas2sI7fiE8RADrAuRNnKcevjQREPk9DqKGh4wawEV8T6TJWHg_Xk7Zo/file?dl=1 [following]
--2024-10-30 15:50:10-- https://ucf741995026801e6bf2d677002c.dl.dropboxusercontent.com/cd/0/inline2/CdY63h-sUe-n6-gdA3xNrO0_dNRqWPRdGKaMqbW6x9Og961Z9mtFv2dgSCt-d2LLiYw1TRwhzK69k5l4cuYZV4eckCkpxSHt5g5z-M29wOP64Liqr9CIqkbzbL4vnGFn7OksrLkmksY_FX815U8lw-FiljbhSRy6qQVsv5EOaC8XZR6oPzl8_9ggdsafA4vsR7QXNs6PCsUvjl7_8_T7aock4dQyIS8HAw92bW2yplcU5Yu88gDxBYUjvroWa_B2r-5eZ6MF60OF2S5awuxj7tpgt6bruv8zIFzXyl6VSpLjPqZuSpKbKvTSRyPsDMkGpVPC-qbDKlPXas2sI7fiE8RADrAuRNnKcevjQREPk9DqKGh4wawEV8T6TJWHg_Xk7Zo/file?dl=1
Reusing existing connection to ucf741995026801e6bf2d677002c.dl.dropboxusercontent.com:443.
HTTP request sent, awaiting response... 200 OK
Length: 40167558722 (37G) [application/binary]
Saving to: ‘eukulele/gtdb/reference.pep.fa’
System information
N E X T F L O W ~ version 24.10.0
HPC
slurm
Singularity
CentOS Linux
metatdenovo dev