edu.cmu.sphinx.linguist.language.ngram
Class SimpleNGramModel

java.lang.Object
  extended by edu.cmu.sphinx.linguist.language.ngram.SimpleNGramModel
All Implemented Interfaces:
LanguageModel, Configurable

public class SimpleNGramModel
extends java.lang.Object
implements LanguageModel

An ASCII ARPA language model loader. This loader makes no attempt to optimize storage, so it can only load very small language models.
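
For orientation, the ARPA text layout that such a loader reads can be sketched as below. This is a minimal, standalone illustration and not the code of SimpleNGramModel: the class ArpaSketch, its Entry type, and the parse method are hypothetical names. It assumes the common ARPA conventions of a \data\ header, \N-grams: section markers, entry lines of the form "log10prob word1 ... wordN [log10backoff]", and a closing \end\ marker.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of reading ARPA-format text into an unoptimized map. */
final class ArpaSketch {

    /** One n-gram entry: log10 probability and log10 backoff (0 when absent). */
    static final class Entry {
        final float log10Prob;
        final float log10Backoff;

        Entry(float log10Prob, float log10Backoff) {
            this.log10Prob = log10Prob;
            this.log10Backoff = log10Backoff;
        }
    }

    private ArpaSketch() {}

    static Map<List<String>, Entry> parse(String arpaText) {
        Map<List<String>, Entry> entries = new LinkedHashMap<>();
        int order = 0; // stays 0 until the first "\N-grams:" marker is seen
        for (String raw : arpaText.split("\\r?\\n")) {
            String line = raw.trim();
            if (line.isEmpty() || line.equals("\\data\\") || line.startsWith("ngram ")) {
                continue; // header and count lines carry no entries
            }
            if (line.equals("\\end\\")) {
                break;
            }
            if (line.startsWith("\\") && line.endsWith("-grams:")) {
                order = Integer.parseInt(line.substring(1, line.indexOf('-')));
                continue;
            }
            if (order == 0) {
                continue; // free text before the first section
            }
            String[] tok = line.split("\\s+");
            if (tok.length < order + 1) {
                throw new IllegalArgumentException("Malformed ARPA line: " + line);
            }
            float prob = Float.parseFloat(tok[0]);
            List<String> words = new ArrayList<>(Arrays.asList(tok).subList(1, order + 1));
            // The backoff weight is optional and follows the words when present.
            float backoff = tok.length > order + 1 ? Float.parseFloat(tok[order + 1]) : 0.0f;
            entries.put(words, new Entry(prob, backoff));
        }
        return entries;
    }

    public static void main(String[] args) {
        String arpa = "\\data\\\nngram 1=1\n\n\\1-grams:\n-1.0 hello -0.5\n\n\\end\\\n";
        System.out.println(parse(arpa).size() + " entry loaded");
    }
}
```

Keeping every n-gram as its own list key in a map is simple but memory hungry, which is consistent with the note that only very small models are practical with an unoptimized loader.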

Note that all probabilities in the grammar are stored in LogMath log base format. Language probabilities in the language model file are stored in log base 10.
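
The conversion between the two bases follows the usual change-of-base identity, log_b(x) = log10(x) / log10(b). The sketch below shows only that arithmetic. The class LogBaseSketch and the method log10ToBase are hypothetical names, and the base 1.0001 used in main is an assumed value for illustration; in practice the configured LogMath component determines the base and performs the conversion.

```java
/** Hypothetical sketch: re-expressing a log10 value in another log base. */
final class LogBaseSketch {

    private LogBaseSketch() {}

    /**
     * Applies log_b(x) = log10(x) / log10(b).
     *
     * @param log10Value a probability expressed in log base 10
     * @param targetBase the target log base; must be greater than 1
     * @return the same probability expressed in the target base
     */
    static float log10ToBase(float log10Value, double targetBase) {
        if (targetBase <= 1.0) {
            throw new IllegalArgumentException("targetBase must be > 1");
        }
        return (float) (log10Value / Math.log10(targetBase));
    }

    public static void main(String[] args) {
        double assumedBase = 1.0001; // assumption, not read from any configuration
        float converted = log10ToBase(-1.0f, assumedBase);
        // Raising the base to the converted value recovers roughly 0.1.
        System.out.println(converted + " -> " + Math.pow(assumedBase, converted));
    }
}
```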


Field Summary
static java.lang.String PROP_LOG_MATH
          Sphinx property that defines the logMath component.
 
Fields inherited from interface edu.cmu.sphinx.linguist.language.ngram.LanguageModel
PROP_DICTIONARY, PROP_FORMAT, PROP_FORMAT_DEFAULT, PROP_LOCATION, PROP_LOCATION_DEFAULT, PROP_MAX_DEPTH, PROP_MAX_DEPTH_DEFAULT, PROP_UNIGRAM_WEIGHT, PROP_UNIGRAM_WEIGHT_DEFAULT
 
Constructor Summary
SimpleNGramModel()
           
 
Method Summary
 void allocate()
          Create the language model
 void deallocate()
          Deallocate resources allocated to this language model
 void dump()
          Dumps the language model
 float getBackoff(WordSequence wordSequence)
          Returns the backoff probability for the given sequence of words
 java.util.logging.Logger getLogger()
          Used for reporting errors and warnings during loading
 int getMaxDepth()
          Returns the maximum depth of the language model
 java.lang.String getName()
           
 float getProbability(WordSequence wordSequence)
          Gets the ngram probability of the word sequence represented by the word list
 float getSmear(WordSequence wordSequence)
          Gets the smear term for the given wordSequence
 java.util.Set getVocabulary()
          Returns the set of words in the language model.
 void newProperties(PropertySheet ps)
          This method is called when this configurable component needs to be reconfigured.
 void start()
          Called before a recognition
 void stop()
          Called after a recognition
 
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

PROP_LOG_MATH

@S4Component(type=LogMath.class)
public static final java.lang.String PROP_LOG_MATH
Sphinx property that defines the logMath component.

See Also:
Constant Field Values
Constructor Detail

SimpleNGramModel

public SimpleNGramModel()
Method Detail

getLogger

public java.util.logging.Logger getLogger()
Description copied from interface: LanguageModel
Used for reporting errors and warnings during loading

Specified by:
getLogger in interface LanguageModel
Returns:
the logger used by the LanguageModel

newProperties

public void newProperties(PropertySheet ps)
                   throws PropertyException
Description copied from interface: Configurable
This method is called when this configurable component needs to be reconfigured.

Specified by:
newProperties in interface Configurable
Parameters:
ps - a property sheet holding the new data
Throws:
PropertyException - if there is a problem with the properties.

allocate

public void allocate()
              throws java.io.IOException
Description copied from interface: LanguageModel
Create the language model

Specified by:
allocate in interface LanguageModel
Throws:
java.io.IOException

deallocate

public void deallocate()
Description copied from interface: LanguageModel
Deallocate resources allocated to this language model

Specified by:
deallocate in interface LanguageModel

getName

public java.lang.String getName()

start

public void start()
Called before a recognition

Specified by:
start in interface LanguageModel

stop

public void stop()
Called after a recognition

Specified by:
stop in interface LanguageModel

getProbability

public float getProbability(WordSequence wordSequence)
Gets the ngram probability of the word sequence represented by the word list

Specified by:
getProbability in interface LanguageModel
Parameters:
wordSequence - the word sequence
Returns:
the probability of the word sequence. Probability is in logMath log base

getSmear

public float getSmear(WordSequence wordSequence)
Gets the smear term for the given wordSequence

Specified by:
getSmear in interface LanguageModel
Parameters:
wordSequence - the word sequence
Returns:
the smear term associated with this word sequence

getBackoff

public float getBackoff(WordSequence wordSequence)
Returns the backoff probability for the given sequence of words

Parameters:
wordSequence - the sequence of words
Returns:
the backoff probability in LogMath log base
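
To show how a backoff weight and an n-gram probability are typically combined in a backoff model, here is a minimal standalone sketch. It is not the implementation of this class: BackoffSketch and its methods are hypothetical, plain word lists stand in for WordSequence, and the recursion is the standard scheme for ARPA-style models, where a missing n-gram falls back to the backoff weight of its history plus the probability of the sequence with its first word dropped (a sum, since all values are logs). Returning negative infinity for an unknown single word is an assumption of this sketch.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of log-domain backoff lookup over plain word lists. */
final class BackoffSketch {
    private final Map<List<String>, Float> logProb = new HashMap<>();
    private final Map<List<String>, Float> logBackoff = new HashMap<>();

    /** Stores one n-gram with its log probability and log backoff weight. */
    void put(List<String> ngram, float prob, float backoff) {
        List<String> key = new ArrayList<>(ngram); // defensive copy
        logProb.put(key, prob);
        logBackoff.put(key, backoff);
    }

    /** Returns the log backoff weight, or 0 (that is, log 1) when none is stored. */
    float getBackoff(List<String> history) {
        Float b = logBackoff.get(history);
        return b == null ? 0.0f : b;
    }

    /** Returns the log probability of the last word given the preceding words. */
    float getProbability(List<String> words) {
        if (words.isEmpty()) {
            throw new IllegalArgumentException("empty word sequence");
        }
        Float p = logProb.get(words);
        if (p != null) {
            return p; // the n-gram is stored explicitly
        }
        if (words.size() == 1) {
            return Float.NEGATIVE_INFINITY; // assumption: unknown word has zero probability
        }
        List<String> history = words.subList(0, words.size() - 1);
        List<String> shorter = words.subList(1, words.size());
        return getBackoff(history) + getProbability(shorter);
    }
}
```

For example, with unigrams "a" (probability -1.0, backoff -0.5) and "b" (probability -2.0) stored but no bigram "a b", looking up "a b" yields the backoff of "a" plus the unigram probability of "b".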

getMaxDepth

public int getMaxDepth()
Returns the maximum depth of the language model

Specified by:
getMaxDepth in interface LanguageModel
Returns:
the maximum depth of the language model

getVocabulary

public java.util.Set getVocabulary()
Returns the set of words in the language model. The set is unmodifiable.

Specified by:
getVocabulary in interface LanguageModel
Returns:
the unmodifiable set of words
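
What "unmodifiable" means for a caller can be illustrated with the standard java.util.Collections wrapper. The sketch below is an assumption about how such a contract is commonly satisfied, not a statement about this class's internals; VocabularySketch is a hypothetical name.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical sketch of exposing a vocabulary as a read-only view. */
final class VocabularySketch {
    private final Set<String> words = new HashSet<>();

    void addWord(String word) {
        words.add(word);
    }

    /** Returns a read-only view; mutating calls on it throw UnsupportedOperationException. */
    Set<String> getVocabulary() {
        return Collections.unmodifiableSet(words);
    }

    public static void main(String[] args) {
        VocabularySketch model = new VocabularySketch();
        model.addWord("hello");
        Set<String> vocabulary = model.getVocabulary();
        try {
            vocabulary.add("world"); // rejected by the read-only view
        } catch (UnsupportedOperationException e) {
            System.out.println("vocabulary is read-only");
        }
    }
}
```

Callers that need a mutable collection can copy the returned set, for example with new HashSet<>(vocabulary).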

dump

public void dump()
Dumps the language model