SYNOPSIS

pocketsphinx_batch -hmm hmmdir -dict dictfile [ options ]...

DESCRIPTION

Run speech recognition over a list of utterances in batch mode. A list of arguments follows:

-adcdev

Device name for audio input (platform-specific)

-adchdr

Size of audio file header in bytes (headers are ignored)

-adcin

Input is raw audio data

-agc

Automatic gain control for c0 ('max', 'emax', 'noise', or 'none')

-agcthresh

Initial threshold for automatic gain control

-allphone

Do phoneme recognition

-alpha

Preemphasis parameter

-backtrace

Print back trace of recognition results

-beam

Beam width applied to every frame in Viterbi search (smaller values mean wider beam)

-bestpath

Run bestpath (Dijkstra) search over word lattice (3rd pass)

-bestpathlw

Language model probability weight for bestpath search

-cachesen

Cache senone scores from first pass search

-cep2spec

Input is cepstral files, output is log spectral files

-cepdir

Input files directory (prefixed to filespecs in control file)

-cepext

Input files extension (suffixed to filespecs in control file)

-ceplen

Number of components in the input feature vector

-cmn

Cepstral mean normalization scheme ('current', 'prior', or 'none')

-cmninit

Initial values (comma-separated) for cepstral mean when 'prior' is used

-compallsen

Compute all senone scores in every frame (can be faster when there are many senones)

-ctl

Control file listing utterances to be processed

-ctlcount

No. of utterances to be processed (after skipping -ctloffset entries)

-ctlincr

Do every Nth line in the control file

-ctloffset

No. of utterances at the beginning of -ctl file to be skipped

-dict

Main pronunciation dictionary (lexicon) input file

-dither

Add 1/2-bit noise

-doublebw

Use double bandwidth filters (same center freq)

-dsratio

Frame GMM computation downsampling ratio

-fbtype

Filter bank type: mel_scale or log_linear

-fdict

Noise word pronunciation dictionary input file

-feat

Feature stream type, depends on the acoustic model

-fillpen

Filler word transition penalty

-frate

Frame rate

-fsg

Finite state grammar

-fsgbfs

Force backtrace from FSG final state

-fsgctlfn

Finite state grammar control file

-fsgusealtpron

Use alternative pronunciations for FSG

-fsgusefiller

(FSG Mode (Mode 2) only) Insert filler words at each state.

-fwd3g

Use trigrams in first pass search

-fwdflat

Run forward flat-lexicon search over word lattice (2nd pass)

-fwdflatbeam

Beam width applied to every frame in second-pass flat search

-fwdflatefwid

Minimum number of end frames for a word to be searched in fwdflat search

-fwdflatlw

Language model probability weight for flat lexicon (2nd pass) decoding

-fwdflatsfwin

Window of frames in lattice to search for successor words in fwdflat search

-fwdflatwbeam

Beam width applied to word exits in second-pass flat search

-fwdtree

Run forward lexicon-tree search (1st pass)

-hmm

Directory containing acoustic model files

-hyp

Recognition output file name

-hypseg

Recognition output with segmentation file name

-input_endian

Endianness of input data, big or little, ignored if NIST or MS Wav

-kdmaxbbi

Maximum number of Gaussians per leaf node in kd-Trees

-kdmaxdepth

Maximum depth of kd-Trees to use

-kdtree

kd-Tree file for Gaussian selection

-latsize

Lattice size

-lifter

Length of sin-curve for liftering, or 0 for no liftering.

-live

Get input from audio hardware

-lm

Word trigram language model input file

-lmctl

Specify a set of language models
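
The note on -beam above ("smaller values mean wider beam") can look backwards at first. The following Python sketch illustrates it under one assumption that this page does not state: that the beam is a probability ratio relative to the best-scoring hypothesis in a frame. The function name and the scores are hypothetical, and the real decoder uses its own log base and integer scores.

```python
import math


def prune(log_scores, beam):
    """Keep hypotheses whose log score is within log(beam) of the best one.

    Assumption: `beam` is a probability ratio in (0, 1], so log(beam) <= 0.
    """
    threshold = max(log_scores.values()) + math.log(beam)
    return {name: score for name, score in log_scores.items() if score >= threshold}


# Hypothetical natural-log scores for three competing hypotheses.
scores = {"a": -10.0, "b": -60.0, "c": -150.0}

narrow = prune(scores, 1e-20)  # larger beam value, tighter threshold
wide = prune(scores, 1e-48)    # smaller beam value, looser threshold

print(sorted(narrow))
print(sorted(wide))
```

Under this reading, a smaller beam value lowers the pruning threshold, so more hypotheses survive each frame and the search is slower but less likely to discard the correct path.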

The -hmm and -dict arguments are always required. Either -lm or -fsg is required, depending on whether you are using a statistical language model or a finite-state grammar. To do batch-mode recognition, you will need to specify a control file using -ctl. This is a simple text file containing one entry per line. Each entry is the name of an input file relative to the -cepdir directory, without the filename extension (which is given in the -cepext argument).
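
As an illustration of the control file rules above, here is a minimal Python sketch that derives a control-file entry from an input path and assembles an argument list. It uses only options listed on this page. All paths and file names are hypothetical, and passing the extension with its leading dot is an assumption.

```python
from pathlib import PurePosixPath


def control_entry(path, cepdir, cepext):
    """Return the control-file entry for `path`: its name relative to
    `cepdir`, with the `cepext` extension removed if present."""
    rel = PurePosixPath(path).relative_to(cepdir).as_posix()
    if cepext and rel.endswith(cepext):
        rel = rel[: -len(cepext)]
    return rel


def batch_command(hmm, dictfile, lm, ctl, cepdir, cepext, hyp):
    """Assemble an argument list (not executed here) for a statistical
    language model run; use -fsg instead of -lm for a grammar."""
    return [
        "pocketsphinx_batch",
        "-hmm", hmm,
        "-dict", dictfile,
        "-lm", lm,
        "-ctl", ctl,
        "-cepdir", cepdir,
        "-cepext", cepext,
        "-hyp", hyp,
    ]


# Hypothetical layout: feature files under /data/mfc with a .mfc extension.
entry = control_entry("/data/mfc/spk1/utt1.mfc", "/data/mfc", ".mfc")
command = batch_command(
    "model/hmm", "model/words.dict", "model/words.lm",
    "test.ctl", "/data/mfc", ".mfc", "test.hyp",
)
print(entry)
print(" ".join(command))
```

Writing one such entry per line to the file named by -ctl yields a control file in the form described above.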

If you are using acoustic feature files as input (see sphinx_fe(1) for information on how to generate these), you can also specify a subpart of a file, using the following format:

FILENAME START-FRAME END-FRAME UTTERANCE-ID
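
The two entry forms described above (a bare file name, or a file name followed by a frame range and an utterance ID) can be read with a short Python sketch like the one below. The type and function names are hypothetical. The sketch accepts only these two forms, which may be stricter than the actual parser in pocketsphinx_batch.

```python
from typing import NamedTuple, Optional


class ControlEntry(NamedTuple):
    filename: str
    start_frame: Optional[int]
    end_frame: Optional[int]
    utterance_id: Optional[str]


def parse_control_line(line):
    """Parse 'FILENAME' or 'FILENAME START-FRAME END-FRAME UTTERANCE-ID'."""
    fields = line.split()
    if len(fields) == 1:
        # Whole-file entry: no frame range or utterance ID given.
        return ControlEntry(fields[0], None, None, None)
    if len(fields) == 4:
        return ControlEntry(fields[0], int(fields[1]), int(fields[2]), fields[3])
    raise ValueError("unsupported control file entry: %r" % line)


# Hypothetical entries.
whole = parse_control_line("spk1/utt1")
part = parse_control_line("spk1/utt1 0 250 utt1_a")
print(whole)
print(part)
```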

AUTHOR

Written by numerous people at CMU from 1994 onwards. This manual page by David Huggins-Daines <[email protected]>

COPYRIGHT

Copyright © 1994-2007 Carnegie Mellon University. See the file COPYING included with this package for more information.
