Errors running decode.sh #1662

samjenks · 2024-11-20T18:33:44Z

I'm trying to follow the steps at https://alphacephei.com/vosk/lm to update the vocabulary dictionary with some new words. My compile-graph.sh ran without errors. However, when running decode.sh I keep getting "run.pl: job failed" messages followed by "bash: line 1: 1324776 Killed" messages. How do I determine whether this succeeded or something went wrong?

Full output:
./decode.sh
steps/make_mfcc.sh --nj 10 data/test_tedlium exp/make_mfcc/test mfcc
utils/validate_data_dir.sh: Successfully validated data-directory data/test_tedlium
steps/make_mfcc.sh [info]: segments file exists: using that.
steps/make_mfcc.sh: Succeeded creating MFCC features for test_tedlium
steps/compute_cmvn_stats.sh data/test_tedlium exp/make_mfcc/test mfcc
Succeeded creating CMVN stats for test_tedlium
fix_data_dir.sh: kept all 1155 utterances.
fix_data_dir.sh: old files are kept in data/test_tedlium/.backup
steps/online/nnet2/extract_ivectors_online.sh --nj 4 --use-vad false data/test_tedlium exp/chain/extractor exp/chain/ivectors_test_tedlium
filter_scps.pl: warning: some input lines were output to multiple files [OK if splitting per utt]
filter_scps.pl: warning: some input lines were output to multiple files [OK if splitting per utt]
steps/online/nnet2/extract_ivectors_online.sh: extracting iVectors
steps/online/nnet2/extract_ivectors_online.sh: combining iVectors across jobs
steps/online/nnet2/extract_ivectors_online.sh: done extracting (online) iVectors to exp/chain/ivectors_test_tedlium using the extractor in exp/chain/extractor.
steps/nnet3/decode.sh --cmd run.pl --num-threads 1 --nj 10 --online-ivector-dir exp/chain/ivectors_test_tedlium --acwt 1.0 --post-decode-acwt 10.0 exp/chain/tdnn/graph data/test_tedlium exp/chain/tdnn/decode_test_tedlium
steps/nnet2/check_ivectors_compatible.sh: WARNING: One of the directories do not contain iVector ID.
steps/nnet2/check_ivectors_compatible.sh: WARNING: That means it's you who's reponsible for keeping
steps/nnet2/check_ivectors_compatible.sh: WARNING: the directories compatible
filter_scps.pl: warning: some input lines were output to multiple files [OK if splitting per utt]
filter_scps.pl: warning: some input lines were output to multiple files [OK if splitting per utt]
steps/nnet3/decode.sh: feature type is raw
steps/diagnostic/analyze_lats.sh --cmd run.pl --iter final exp/chain/tdnn/graph exp/chain/tdnn/decode_test_tedlium
run.pl: job failed, log is in exp/chain/tdnn/decode_test_tedlium/log/analyze_alignments.log
score best paths
local/score.sh --cmd run.pl data/test_tedlium exp/chain/tdnn/graph exp/chain/tdnn/decode_test_tedlium
local/score.sh: scoring with word insertion penalty=0.0,0.5,1.0
score confidence and timing with sclite
Decoding done.
steps/lmrescore_const_arpa.sh data/lang_test data/lang_test_rescore data/test_tedlium exp/chain/tdnn/decode_test_tedlium exp/chain/tdnn/decode_test_tedlium_rescore
steps/lmrescore_const_arpa.sh: Missing file data/lang_test_rescore/G.carpa
rnnlm/lmrescore_pruned.sh --lattice-prune-beam 4 --max-ngram-order 4 data/lang_test exp/rnnlm_out data/test_tedlium exp/chain/tdnn/decode_test_tedlium_rescore exp/chain/tdnn/decode_test_tedlium_rnnlm
bash: line 1: 1324776 Killed ( lattice-lmrescore-kaldi-rnnlm-pruned --lm-scale=0.5 --bos-symbol=312334 --eos-symbol=312335 --brk-symbol=312336 --lattice-compose-beam=4 --acoustic-scale=0.1 --max-ngram-order=4 data/lang_test/G.fst exp/rnnlm_out/word_embedding.final.mat exp/rnnlm_out/final.raw "ark:gunzip -c exp/chain/tdnn/decode_test_tedlium_rescore/lat.3.gz|" "ark,t:|gzip -c>exp/chain/tdnn/decode_test_tedlium_rnnlm/lat.3.gz" ) 2>> exp/chain/tdnn/decode_test_tedlium_rnnlm/log/rescorelm.3.log >> exp/chain/tdnn/decode_test_tedlium_rnnlm/log/rescorelm.3.log
run.pl: 3 / 10 failed, log is in exp/chain/tdnn/decode_test_tedlium_rnnlm/log/rescorelm.*.log
steps/nnet3/decode_lookahead.sh --cmd run.pl --nj 10 --online-ivector-dir exp/chain/ivectors_test_tedlium --acwt 1.0 --post-decode-acwt 10.0 exp/chain/tdnn/lgraph data/test_tedlium exp/chain/tdnn/decode_test_tedlium_l
steps/nnet2/check_ivectors_compatible.sh: WARNING: One of the directories do not contain iVector ID.
steps/nnet2/check_ivectors_compatible.sh: WARNING: That means it's you who's reponsible for keeping
steps/nnet2/check_ivectors_compatible.sh: WARNING: the directories compatible
steps/nnet3/decode_lookahead.sh: feature type is raw
steps/diagnostic/analyze_lats.sh --cmd run.pl --iter final exp/chain/tdnn/lgraph exp/chain/tdnn/decode_test_tedlium_l
run.pl: job failed, log is in exp/chain/tdnn/decode_test_tedlium_l/log/analyze_alignments.log
score best paths
local/score.sh --cmd run.pl data/test_tedlium exp/chain/tdnn/lgraph exp/chain/tdnn/decode_test_tedlium_l
local/score.sh: scoring with word insertion penalty=0.0,0.5,1.0
score confidence and timing with sclite
Decoding done.


nshmyrev · 2024-11-20T19:14:20Z

"Killed" means the process ran out of memory. We usually use about 64 GB to compile.
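To confirm that a "Killed" job was a memory problem, a minimal sketch along these lines can help, assuming a Linux machine. The helper name `mem_total_gib` is made up for illustration, and reading the kernel log with `dmesg` may need root on some systems.

```shell
# Total physical memory in whole GiB, read from /proc/meminfo (Linux only).
# mem_total_gib is a hypothetical helper name, not part of Kaldi or Vosk.
mem_total_gib() {
  awk '/^MemTotal:/ { printf "%d\n", $2 / 1024 / 1024 }' /proc/meminfo
}

echo "Total RAM: $(mem_total_gib) GiB"

# Look for traces of the kernel OOM killer; reading the kernel log may need root.
dmesg 2>/dev/null | grep -i -E 'out of memory|killed process' \
  || echo "no OOM entries visible (try again with sudo)"
```

If the kernel log names the killed process, for example lattice-lmrescore-kaldi-rnnlm-pruned, that step ran out of memory rather than failing for some other reason.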

samjenks · 2024-11-20T19:40:53Z

I'm definitely below that. Before I go buy more RAM, are there any compute requirements as well?

nshmyrev · 2024-11-20T21:59:44Z

You can use fewer jobs for decoding: set --nj 1 and it will probably fit. How much RAM do you have?

RNNLM rescoring is not critical either, you can skip it.
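As a rough sketch of the job-count change, assuming the count appears literally as `--nj <number>` inside decode.sh (as the logged commands suggest), something like the following rewrites every occurrence and keeps a backup. The function name `set_nj_one` is hypothetical.

```shell
# Rewrite every "--nj <number>" in the given script to "--nj 1".
# A backup of the original is kept next to it with a .bak suffix.
# set_nj_one is a hypothetical helper name, not part of Kaldi or Vosk.
set_nj_one() {
  sed -i.bak -E 's/--nj[[:space:]]+[0-9]+/--nj 1/g' "$1"
}

# Usage (run from the recipe directory):
#   set_nj_one decode.sh
#   diff decode.sh.bak decode.sh
```

Running fewer jobs in parallel lowers peak memory use at the cost of a longer decode.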

samjenks · 2024-11-21T14:46:35Z

16 GB. What flag turns off RNNLM?

samjenks · 2024-11-21T19:09:56Z

I commented out the RNNLM rescoring and set --nj 1 in the decode.sh script. That got rid of the "Killed" messages from bash but not the "run.pl: job failed" messages.

Full Output Below:

steps/make_mfcc.sh --nj 1 data/test_tedlium exp/make_mfcc/test mfcc
steps/make_mfcc.sh: moving data/test_tedlium/feats.scp to data/test_tedlium/.backup
utils/validate_data_dir.sh: Successfully validated data-directory data/test_tedlium
steps/make_mfcc.sh [info]: segments file exists: using that.
steps/make_mfcc.sh: Succeeded creating MFCC features for test_tedlium
steps/compute_cmvn_stats.sh data/test_tedlium exp/make_mfcc/test mfcc
Succeeded creating CMVN stats for test_tedlium
fix_data_dir.sh: kept all 1155 utterances.
fix_data_dir.sh: old files are kept in data/test_tedlium/.backup
steps/online/nnet2/extract_ivectors_online.sh --nj 1 --use-vad false data/test_tedlium exp/chain/extractor exp/chain/ivectors_test_tedlium
steps/online/nnet2/extract_ivectors_online.sh: extracting iVectors
steps/online/nnet2/extract_ivectors_online.sh: combining iVectors across jobs
steps/online/nnet2/extract_ivectors_online.sh: done extracting (online) iVectors to exp/chain/ivectors_test_tedlium using the extractor in exp/chain/extractor.
steps/nnet3/decode.sh --cmd run.pl --num-threads 1 --nj 1 --online-ivector-dir exp/chain/ivectors_test_tedlium --acwt 1.0 --post-decode-acwt 10.0 exp/chain/tdnn/graph data/test_tedlium exp/chain/tdnn/decode_test_tedlium
steps/nnet2/check_ivectors_compatible.sh: WARNING: One of the directories do not contain iVector ID.
steps/nnet2/check_ivectors_compatible.sh: WARNING: That means it's you who's reponsible for keeping
steps/nnet2/check_ivectors_compatible.sh: WARNING: the directories compatible
steps/nnet3/decode.sh: feature type is raw
steps/diagnostic/analyze_lats.sh --cmd run.pl --iter final exp/chain/tdnn/graph exp/chain/tdnn/decode_test_tedlium
run.pl: job failed, log is in exp/chain/tdnn/decode_test_tedlium/log/analyze_alignments.log
score best paths
local/score.sh --cmd run.pl data/test_tedlium exp/chain/tdnn/graph exp/chain/tdnn/decode_test_tedlium
local/score.sh: scoring with word insertion penalty=0.0,0.5,1.0
score confidence and timing with sclite
Decoding done.
steps/lmrescore_const_arpa.sh data/lang_test data/lang_test_rescore data/test_tedlium exp/chain/tdnn/decode_test_tedlium exp/chain/tdnn/decode_test_tedlium_rescore
steps/lmrescore_const_arpa.sh: Missing file data/lang_test_rescore/G.carpa
steps/nnet3/decode_lookahead.sh --cmd run.pl --nj 1 --online-ivector-dir exp/chain/ivectors_test_tedlium --acwt 1.0 --post-decode-acwt 10.0 exp/chain/tdnn/lgraph data/test_tedlium exp/chain/tdnn/decode_test_tedlium_l
steps/nnet2/check_ivectors_compatible.sh: WARNING: One of the directories do not contain iVector ID.
steps/nnet2/check_ivectors_compatible.sh: WARNING: That means it's you who's reponsible for keeping
steps/nnet2/check_ivectors_compatible.sh: WARNING: the directories compatible
steps/nnet3/decode_lookahead.sh: feature type is raw
steps/diagnostic/analyze_lats.sh --cmd run.pl --iter final exp/chain/tdnn/lgraph exp/chain/tdnn/decode_test_tedlium_l
run.pl: job failed, log is in exp/chain/tdnn/decode_test_tedlium_l/log/analyze_alignments.log
score best paths
local/score.sh --cmd run.pl data/test_tedlium exp/chain/tdnn/lgraph exp/chain/tdnn/decode_test_tedlium_l
local/score.sh: scoring with word insertion penalty=0.0,0.5,1.0
score confidence and timing with sclite
Decoding done.

I dug into the log files mentioned. Both are trying to run 'python' through /usr/bin/env, which is not where my system Python is. How do I point it to a different path?
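For context, `/usr/bin/env python` does not use a fixed location: it runs the first executable named `python` found on `PATH`. One possible workaround, sketched here under the assumptions that `python3` is installed and that the diagnostic scripts run under Python 3, is to put a `python` symlink in a directory at the front of `PATH`. The shim directory and the `SHIM_DIR` variable are hypothetical names chosen for this example.

```shell
# Directory that will hold the shim; the default location is an arbitrary choice.
shim_dir="${SHIM_DIR:-$HOME/.local/shims}"
mkdir -p "$shim_dir"

# Make the name "python" point at whichever python3 is already on PATH.
py3="$(command -v python3)"
ln -sf "$py3" "$shim_dir/python"

# Put the shim directory first so /usr/bin/env finds it before anything else.
export PATH="$shim_dir:$PATH"

# Show what "python" now resolves to.
command -v python
```

The `export` only affects the current shell session, so decode.sh has to be run from the same shell, or the line has to be added to the recipe's path.sh or a shell profile.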

samjenks closed this as completed Nov 20, 2024

samjenks reopened this Nov 20, 2024
