Errors running decode.sh #1662

Open
samjenks opened this issue Nov 20, 2024 · 5 comments

Comments

@samjenks

samjenks commented Nov 20, 2024

I'm trying to follow the steps at https://alphacephei.com/vosk/lm to update the vocabulary dictionary with some new words. compile-graph.sh ran without errors. However, when I run decode.sh I keep getting "run.pl: job failed" messages followed by "bash: line 1: 1324776 Killed" messages. How do I determine whether this succeeded or something went wrong?

Full output:
./decode.sh
steps/make_mfcc.sh --nj 10 data/test_tedlium exp/make_mfcc/test mfcc
utils/validate_data_dir.sh: Successfully validated data-directory data/test_tedlium
steps/make_mfcc.sh [info]: segments file exists: using that.
steps/make_mfcc.sh: Succeeded creating MFCC features for test_tedlium
steps/compute_cmvn_stats.sh data/test_tedlium exp/make_mfcc/test mfcc
Succeeded creating CMVN stats for test_tedlium
fix_data_dir.sh: kept all 1155 utterances.
fix_data_dir.sh: old files are kept in data/test_tedlium/.backup
steps/online/nnet2/extract_ivectors_online.sh --nj 4 --use-vad false data/test_tedlium exp/chain/extractor exp/chain/ivectors_test_tedlium
filter_scps.pl: warning: some input lines were output to multiple files [OK if splitting per utt]
filter_scps.pl: warning: some input lines were output to multiple files [OK if splitting per utt]
steps/online/nnet2/extract_ivectors_online.sh: extracting iVectors
steps/online/nnet2/extract_ivectors_online.sh: combining iVectors across jobs
steps/online/nnet2/extract_ivectors_online.sh: done extracting (online) iVectors to exp/chain/ivectors_test_tedlium using the extractor in exp/chain/extractor.
steps/nnet3/decode.sh --cmd run.pl --num-threads 1 --nj 10 --online-ivector-dir exp/chain/ivectors_test_tedlium --acwt 1.0 --post-decode-acwt 10.0 exp/chain/tdnn/graph data/test_tedlium exp/chain/tdnn/decode_test_tedlium
steps/nnet2/check_ivectors_compatible.sh: WARNING: One of the directories do not contain iVector ID.
steps/nnet2/check_ivectors_compatible.sh: WARNING: That means it's you who's reponsible for keeping
steps/nnet2/check_ivectors_compatible.sh: WARNING: the directories compatible
filter_scps.pl: warning: some input lines were output to multiple files [OK if splitting per utt]
filter_scps.pl: warning: some input lines were output to multiple files [OK if splitting per utt]
steps/nnet3/decode.sh: feature type is raw
steps/diagnostic/analyze_lats.sh --cmd run.pl --iter final exp/chain/tdnn/graph exp/chain/tdnn/decode_test_tedlium
run.pl: job failed, log is in exp/chain/tdnn/decode_test_tedlium/log/analyze_alignments.log
score best paths
local/score.sh --cmd run.pl data/test_tedlium exp/chain/tdnn/graph exp/chain/tdnn/decode_test_tedlium
local/score.sh: scoring with word insertion penalty=0.0,0.5,1.0
score confidence and timing with sclite
Decoding done.
steps/lmrescore_const_arpa.sh data/lang_test data/lang_test_rescore data/test_tedlium exp/chain/tdnn/decode_test_tedlium exp/chain/tdnn/decode_test_tedlium_rescore
steps/lmrescore_const_arpa.sh: Missing file data/lang_test_rescore/G.carpa
rnnlm/lmrescore_pruned.sh --lattice-prune-beam 4 --max-ngram-order 4 data/lang_test exp/rnnlm_out data/test_tedlium exp/chain/tdnn/decode_test_tedlium_rescore exp/chain/tdnn/decode_test_tedlium_rnnlm
bash: line 1: 1324776 Killed ( lattice-lmrescore-kaldi-rnnlm-pruned --lm-scale=0.5 --bos-symbol=312334 --eos-symbol=312335 --brk-symbol=312336 --lattice-compose-beam=4 --acoustic-scale=0.1 --max-ngram-order=4 data/lang_test/G.fst exp/rnnlm_out/word_embedding.final.mat exp/rnnlm_out/final.raw "ark:gunzip -c exp/chain/tdnn/decode_test_tedlium_rescore/lat.3.gz|" "ark,t:|gzip -c>exp/chain/tdnn/decode_test_tedlium_rnnlm/lat.3.gz" ) 2>> exp/chain/tdnn/decode_test_tedlium_rnnlm/log/rescorelm.3.log >> exp/chain/tdnn/decode_test_tedlium_rnnlm/log/rescorelm.3.log
run.pl: 3 / 10 failed, log is in exp/chain/tdnn/decode_test_tedlium_rnnlm/log/rescorelm.*.log
steps/nnet3/decode_lookahead.sh --cmd run.pl --nj 10 --online-ivector-dir exp/chain/ivectors_test_tedlium --acwt 1.0 --post-decode-acwt 10.0 exp/chain/tdnn/lgraph data/test_tedlium exp/chain/tdnn/decode_test_tedlium_l
steps/nnet2/check_ivectors_compatible.sh: WARNING: One of the directories do not contain iVector ID.
steps/nnet2/check_ivectors_compatible.sh: WARNING: That means it's you who's reponsible for keeping
steps/nnet2/check_ivectors_compatible.sh: WARNING: the directories compatible
steps/nnet3/decode_lookahead.sh: feature type is raw
steps/diagnostic/analyze_lats.sh --cmd run.pl --iter final exp/chain/tdnn/lgraph exp/chain/tdnn/decode_test_tedlium_l
run.pl: job failed, log is in exp/chain/tdnn/decode_test_tedlium_l/log/analyze_alignments.log
score best paths
local/score.sh --cmd run.pl data/test_tedlium exp/chain/tdnn/lgraph exp/chain/tdnn/decode_test_tedlium_l
local/score.sh: scoring with word insertion penalty=0.0,0.5,1.0
score confidence and timing with sclite
Decoding done.

@nshmyrev
Collaborator

Killed means the process ran out of memory. We usually use around 64 GB of RAM to compile.
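
One way to confirm that a "Killed" line came from the kernel's out-of-memory killer is to look for its message in the kernel log. This is a minimal sketch, assuming a Linux system where `dmesg` is readable (it may need root on some distributions). The function name `oom_victims` is made up for illustration, and the exact message wording varies between kernel versions.

```shell
#!/usr/bin/env bash
# oom_victims: read kernel log text on stdin and print the lines that
# report a process killed by the OOM killer.
# Assumption: the kernel words the message as "Out of memory: Killed process"
# or "Out of memory: Kill process"; other kernels may phrase it differently.
oom_victims() {
    # "|| true" keeps the exit status at 0 when nothing matches.
    grep -E 'Out of memory: Kill(ed)? process' || true
}

# Usage (not run automatically):
#   dmesg | oom_victims
#   free -g    # total and available memory in GiB
```

If the PID printed there matches the one in the "Killed" line from bash, the job was stopped for lack of memory rather than failing on its own.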

@samjenks
Author

I'm definitely below that. Before I go buy more RAM, are there any compute requirements as well?

samjenks reopened this Nov 20, 2024

@nshmyrev
Collaborator

You can use fewer jobs for decoding: set --nj 1 and it will probably fit. How much RAM do you have?

RNNLM rescoring is not critical either, you can skip it.
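
The number of parallel jobs multiplies peak memory, since each job loads its own copy of the models. A rough sketch of picking a job count from available memory follows. The helper names `max_jobs` and `available_gb` are made up for illustration, and the per-job figure is a placeholder that has to be measured on your own setup; nothing here comes from the Kaldi or Vosk scripts themselves.

```shell
#!/usr/bin/env bash
# max_jobs AVAILABLE_GB PER_JOB_GB
# Print how many jobs fit into the available memory, never less than 1.
# PER_JOB_GB must be a positive integer.
max_jobs() {
    local avail_gb=$1 per_job_gb=$2
    local n=$(( avail_gb / per_job_gb ))   # integer division rounds down
    if [ "$n" -lt 1 ]; then n=1; fi
    echo "$n"
}

# available_gb: whole GiB currently available, read from /proc/meminfo
# (Linux only; the MemAvailable value is reported in kB).
available_gb() {
    awk '/^MemAvailable:/ { print int($2 / 1024 / 1024) }' /proc/meminfo
}

# Usage (8 is a placeholder for one job's peak memory in GiB):
#   nj=$(max_jobs "$(available_gb)" 8)
#   echo "would pass --nj $nj to the decoding scripts"
```

With one job the run takes longer, but the peak memory is that of a single process.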

@samjenks
Author

16 GB. What flag turns off RNNLM?

@samjenks
Author

samjenks commented Nov 21, 2024

I commented out the RNNLM rescoring and set --nj 1 in the decode.sh script. That got rid of the "Killed" messages but not the "run.pl: job failed" messages.

Full output below:

steps/make_mfcc.sh --nj 1 data/test_tedlium exp/make_mfcc/test mfcc
steps/make_mfcc.sh: moving data/test_tedlium/feats.scp to data/test_tedlium/.backup
utils/validate_data_dir.sh: Successfully validated data-directory data/test_tedlium
steps/make_mfcc.sh [info]: segments file exists: using that.
steps/make_mfcc.sh: Succeeded creating MFCC features for test_tedlium
steps/compute_cmvn_stats.sh data/test_tedlium exp/make_mfcc/test mfcc
Succeeded creating CMVN stats for test_tedlium
fix_data_dir.sh: kept all 1155 utterances.
fix_data_dir.sh: old files are kept in data/test_tedlium/.backup
steps/online/nnet2/extract_ivectors_online.sh --nj 1 --use-vad false data/test_tedlium exp/chain/extractor exp/chain/ivectors_test_tedlium
steps/online/nnet2/extract_ivectors_online.sh: extracting iVectors
steps/online/nnet2/extract_ivectors_online.sh: combining iVectors across jobs
steps/online/nnet2/extract_ivectors_online.sh: done extracting (online) iVectors to exp/chain/ivectors_test_tedlium using the extractor in exp/chain/extractor.
steps/nnet3/decode.sh --cmd run.pl --num-threads 1 --nj 1 --online-ivector-dir exp/chain/ivectors_test_tedlium --acwt 1.0 --post-decode-acwt 10.0 exp/chain/tdnn/graph data/test_tedlium exp/chain/tdnn/decode_test_tedlium
steps/nnet2/check_ivectors_compatible.sh: WARNING: One of the directories do not contain iVector ID.
steps/nnet2/check_ivectors_compatible.sh: WARNING: That means it's you who's reponsible for keeping
steps/nnet2/check_ivectors_compatible.sh: WARNING: the directories compatible
steps/nnet3/decode.sh: feature type is raw
steps/diagnostic/analyze_lats.sh --cmd run.pl --iter final exp/chain/tdnn/graph exp/chain/tdnn/decode_test_tedlium
run.pl: job failed, log is in exp/chain/tdnn/decode_test_tedlium/log/analyze_alignments.log
score best paths
local/score.sh --cmd run.pl data/test_tedlium exp/chain/tdnn/graph exp/chain/tdnn/decode_test_tedlium
local/score.sh: scoring with word insertion penalty=0.0,0.5,1.0
score confidence and timing with sclite
Decoding done.
steps/lmrescore_const_arpa.sh data/lang_test data/lang_test_rescore data/test_tedlium exp/chain/tdnn/decode_test_tedlium exp/chain/tdnn/decode_test_tedlium_rescore
steps/lmrescore_const_arpa.sh: Missing file data/lang_test_rescore/G.carpa
steps/nnet3/decode_lookahead.sh --cmd run.pl --nj 1 --online-ivector-dir exp/chain/ivectors_test_tedlium --acwt 1.0 --post-decode-acwt 10.0 exp/chain/tdnn/lgraph data/test_tedlium exp/chain/tdnn/decode_test_tedlium_l
steps/nnet2/check_ivectors_compatible.sh: WARNING: One of the directories do not contain iVector ID.
steps/nnet2/check_ivectors_compatible.sh: WARNING: That means it's you who's reponsible for keeping
steps/nnet2/check_ivectors_compatible.sh: WARNING: the directories compatible
steps/nnet3/decode_lookahead.sh: feature type is raw
steps/diagnostic/analyze_lats.sh --cmd run.pl --iter final exp/chain/tdnn/lgraph exp/chain/tdnn/decode_test_tedlium_l
run.pl: job failed, log is in exp/chain/tdnn/decode_test_tedlium_l/log/analyze_alignments.log
score best paths
local/score.sh --cmd run.pl data/test_tedlium exp/chain/tdnn/lgraph exp/chain/tdnn/decode_test_tedlium_l
local/score.sh: scoring with word insertion penalty=0.0,0.5,1.0
score confidence and timing with sclite
Decoding done.

I dug into the log files mentioned. Both are trying to use a 'python' in /usr/bin/env, which is not where my system Python is. How do I point it to a different path?
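
For context, /usr/bin/env is not where the interpreter lives: a script that starts with `#!/usr/bin/env python` asks env to run the first executable named `python` found on PATH. One possible workaround, sketched below, is to put a directory containing a `python` symlink at the front of PATH before running decode.sh. The function name `make_python_shim` and the shim directory are made up for illustration, and whether the scripts in question work with the interpreter you point at still has to be verified.

```shell
#!/usr/bin/env bash
# make_python_shim INTERPRETER SHIM_DIR
# Create SHIM_DIR/python as a symlink to INTERPRETER and print SHIM_DIR.
# INTERPRETER should be an absolute path to an executable.
make_python_shim() {
    local interpreter=$1 shim_dir=$2
    if [ ! -x "$interpreter" ]; then
        echo "not executable: $interpreter" >&2
        return 1
    fi
    mkdir -p "$shim_dir"
    ln -sf "$interpreter" "$shim_dir/python"
    echo "$shim_dir"
}

# Usage (paths are examples, not taken from the thread):
#   shim=$(make_python_shim "$(command -v python3)" "$HOME/.local/python-shim")
#   export PATH="$shim:$PATH"
#   /usr/bin/env python --version   # should now report the chosen interpreter
#   ./decode.sh
```

Because the change is made through PATH, it only affects the shell session that exports it and leaves the system-wide setup untouched.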
