prune_lm.sh

 

When the LM is built with the train_lm.sh script using the -include_heldout flag, prune it with prune_lm.sh, which takes advantage of the way the LM was built.

The alternative is to prune with SRILM, but if the LM was built with train_lm.sh, prune_lm.sh gives the best results.

Pruning with SRILM:

 ngram -lm $lm_dir/lm_4gram.arpa.gz -prune 1.1e-8 -write-lm $lm_dir/lm_pruned_11e8.gz

Pruning with prune_lm.sh, the recommended Kaldi way:

  prune_lm.sh --arpa 120.0 $lm_dir/4gram-mincount
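
The choice between the two commands can be sketched as a small dry-run helper. This is only an illustration: choose_prune_cmd is a hypothetical function, not part of Kaldi or SRILM, and it assumes that the presence of a 4gram-mincount directory under $lm_dir means the LM was built with train_lm.sh. It prints the command instead of running it, and uses only the paths, flags, and thresholds shown above.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Kaldi or SRILM): print, but do not run,
# the pruning command that matches how the LM appears to have been built.
choose_prune_cmd() {
  local lm_dir=$1
  if [ -d "$lm_dir/4gram-mincount" ]; then
    # Assumption: this directory exists only when train_lm.sh built the LM,
    # so the recommended prune_lm.sh is used.
    echo "prune_lm.sh --arpa 120.0 $lm_dir/4gram-mincount"
  elif [ -f "$lm_dir/lm_4gram.arpa.gz" ]; then
    # Only an ARPA file is available: fall back to SRILM's ngram.
    echo "ngram -lm $lm_dir/lm_4gram.arpa.gz -prune 1.1e-8 -write-lm $lm_dir/lm_pruned_11e8.gz"
  else
    echo "no LM found in $lm_dir" >&2
    return 1
  fi
}
```

Calling choose_prune_cmd "$lm_dir" lets you inspect the printed command before running it by hand.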
