P9: install qsub
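It is possible that AWS doesn't support qsub: on my Amazon Linux 2 instance, yum could not find the GridEngine packages (see the error messages at the bottom).
My solution to the problem was to run the decoding jobs locally with run.pl instead of submitting them through queue.pl: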
decode_cmd="run.pl --max-jobs-run 5"
steps/nnet3/decode.sh --acwt 1.0 --post-decode-acwt 10.0 \
--nj $decode_nj --cmd "$decode_cmd" $iter_opts \
--online-ivector-dir exp/nnet3${nnet3_affix}/ivectors_${decode_set}_hires \
$graph_dir data/${decode_set}_hires $dir/decode_${decode_set}${decode_iter:+_$decode_iter}_tgsmall || exit 1
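The value 5 for --max-jobs-run is simply what worked on my instance. A minimal sketch, assuming you would rather cap the number of parallel jobs at the number of CPUs when the machine has fewer than 5 (the variable names ncpu and max_jobs are my own, not part of Kaldi):

```shell
# Number of online CPUs; fall back to getconf if nproc is not installed.
ncpu=$(nproc 2>/dev/null || getconf _NPROCESSORS_ONLN)

# Never run more than 5 jobs at once, and never more than there are CPUs.
max_jobs=$(( ncpu < 5 ? ncpu : 5 ))

decode_cmd="run.pl --max-jobs-run $max_jobs"
echo "$decode_cmd"
```

This only limits CPU usage. Memory is a separate concern, so it may still be necessary to lower the number by hand if decoding runs out of RAM.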
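The above is the new, replaced version of the decode call.
The same change is needed in cmd.sh: we need to replace queue.pl with run.pl.
Original version of cmd.sh: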
# you can change cmd.sh depending on what type of queue you are using.
# If you have no queueing system and want to run on a local machine, you
# can change all instances 'queue.pl' to run.pl (but be careful and run
# commands one by one: most recipes will exhaust the memory on your
# machine). queue.pl works with GridEngine (qsub). slurm.pl works
# with slurm. Different queues are configured differently, with different
# queue names and different ways of specifying things like memory;
# to account for these differences you can create and edit the file
# conf/queue.conf to match your queue's configuration. Search for
# conf/queue.conf in http://kaldi-asr.org/doc/queue.html for more information,
# or search for the string 'default_config' in utils/queue.pl or utils/slurm.pl.
export train_cmd="queue.pl --mem 2G"
export decode_cmd="queue.pl --mem 4G"
export mkgraph_cmd="queue.pl --mem 8G"
New version of cmd.sh (only decode_cmd is active; train_cmd and mkgraph_cmd are commented out):
# you can change cmd.sh depending on what type of queue you are using.
# If you have no queueing system and want to run on a local machine, you
# can change all instances 'queue.pl' to run.pl (but be careful and run
# commands one by one: most recipes will exhaust the memory on your
# machine). queue.pl works with GridEngine (qsub). slurm.pl works
# with slurm. Different queues are configured differently, with different
# queue names and different ways of specifying things like memory;
# to account for these differences you can create and edit the file
# conf/queue.conf to match your queue's configuration. Search for
# conf/queue.conf in http://kaldi-asr.org/doc/queue.html for more information,
# or search for the string 'default_config' in utils/queue.pl or utils/slurm.pl.
# export train_cmd="queue.pl --mem 2G"
export decode_cmd="run.pl --max-jobs-run 5"
# export mkgraph_cmd="queue.pl --mem 8G"
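As an alternative to editing cmd.sh by hand, here is a minimal sketch that checks whether qsub is on the PATH and falls back to run.pl when it is not. The helper name pick_cmd is hypothetical and not part of Kaldi; the --mem values come from the original cmd.sh and --max-jobs-run 5 from my fix.

```shell
# Hypothetical helper (not part of Kaldi): print "queue.pl <options>" when
# qsub is available, otherwise fall back to running locally with run.pl.
pick_cmd() {
  if command -v qsub >/dev/null 2>&1; then
    echo "queue.pl $*"
  else
    echo "run.pl --max-jobs-run 5"
  fi
}

export train_cmd="$(pick_cmd --mem 2G)"
export decode_cmd="$(pick_cmd --mem 4G)"
export mkgraph_cmd="$(pick_cmd --mem 8G)"

echo "decode_cmd=$decode_cmd"
```

On a machine without GridEngine, such as my EC2 instance, all three variables end up as run.pl, so train_cmd and mkgraph_cmd stay defined instead of being commented out.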
------------------------------------------------------------------------------------------------------------------------------
Below are just the error messages that I encountered. The solution is above.
Error when trying to install GridEngine with yum:
[ec2-user@ip-172-31-6-113 s5]$ sudo yum install gridengine-master gridengine-client
Loaded plugins: extras_suggestions, langpacks, priorities, update-motd
amzn2-core | 3.7 kB 00:00:00
No package gridengine-master available.
No package gridengine-client available.
Error: Nothing to do
[ec2-user@ip-172-31-6-113 s5]$
Error when running the decode script with queue.pl:
(libri_env) [ec2-user@ip-172-31-6-113 s5]$ ./local/decode_example_v4.sh
steps/online/nnet2/decode.sh --cmd queue.pl --mem 4G --nj 1 exp/chain_cleaned/tdnn_1d_sp/graph_test data/test_clean exp/chain_cleaned/tdnn_1d_sp/decode_test_clean_test
queue.pl: Error submitting jobs to queue (return status was 32512)
queue log file is exp/chain_cleaned/tdnn_1d_sp/decode_test_clean_test/q/decode.log, command was qsub -v PATH -cwd -S /bin/bash -j y -l arch=*64* -o exp/chain_cleaned/tdnn_1d_sp/decode_test_clean_test/q/decode.log -l mem_free=4G,ram_free=4G -t 1:1 /home/ec2-user/kaldi/egs/librispeech/s5/exp/chain_cleaned/tdnn_1d_sp/decode_test_clean_test/q/decode.sh >>exp/chain_cleaned/tdnn_1d_sp/decode_test_clean_test/q/decode.log 2>&1
Output of qsub was: sh: qsub: command not found
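The return status 32512 in the message above looks odd, but it is consistent with "command not found". As far as I can tell, queue.pl prints the raw wait status of the qsub call (this is my assumption about where the number comes from), and in a wait status the exit code sits in the high byte. A short sketch of the arithmetic:

```shell
status=32512

# Shift away the low byte to get the exit code.
exit_code=$(( status >> 8 ))
echo "$exit_code"   # 127

# 127 is the code a shell returns when a command does not exist.
sh -c 'no_such_command_for_demo' 2>/dev/null
echo "$?"           # 127
```

So the number says the same thing as the last line of the log: qsub is not installed.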
Contents of the queue log file:
nano exp/chain_cleaned/tdnn_1d_sp/decode_test_clean_test/q/decode.log
sh: qsub: command not found