edu.cmu.sphinx.linguist.dictionary
Class FastDictionary

java.lang.Object
  extended by edu.cmu.sphinx.linguist.dictionary.FastDictionary
All Implemented Interfaces:
Dictionary, Configurable

public class FastDictionary
extends java.lang.Object
implements Dictionary

Creates a dictionary by quickly reading in an ASCII-based Sphinx-3 format dictionary. It is called the FastDictionary because loading is fast: at load time, each line of the dictionary file is simply stored in a hash table, on the assumption that most words will never be used. Only when a word is actually used are its pronunciations parsed into an array of pronunciations.
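
The deferred parsing described above can be illustrated with a minimal sketch. It is not the FastDictionary implementation; the class name LazyDictionarySketch and its methods load, getPhones, and parsedCount are hypothetical and assume one whitespace-separated entry per line.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of lazy pronunciation parsing; not the Sphinx-4 code. */
class LazyDictionarySketch {
    // Raw, unparsed pronunciation text keyed by word; filled at load time.
    private final Map<String, String> rawLines = new HashMap<>();
    // Parsed phone arrays; an entry appears only after the first lookup.
    private final Map<String, String[]> parsed = new HashMap<>();

    /** Load step: split the word from the rest of the line and store it as-is. */
    void load(String line) {
        String trimmed = line.trim();
        if (trimmed.isEmpty()) {
            return;
        }
        String[] parts = trimmed.split("\\s+", 2);
        rawLines.put(parts[0], parts.length > 1 ? parts[1] : "");
    }

    /** Lookup step: parse the phones only when the word is first requested. */
    String[] getPhones(String word) {
        String[] phones = parsed.get(word);
        if (phones == null) {
            String raw = rawLines.get(word);
            if (raw == null) {
                return null; // unknown word
            }
            phones = raw.isEmpty() ? new String[0] : raw.split("\\s+");
            parsed.put(word, phones);
        }
        return phones;
    }

    /** Number of words whose pronunciations have been parsed so far. */
    int parsedCount() {
        return parsed.size();
    }
}
```

With this layout, loading costs one split and one hash insertion per line, and the per-phone parsing cost is paid only for words that are actually looked up.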

The format of the ASCII dictionary that it expects is the same as that of the FullDictionary, i.e., the word, followed by spaces or a tab, followed by the pronunciation(s). For example, a digits dictionary will look like:

  ONE HH W AH N
  ONE(2) W AH N
  TWO T UW
  THREE TH R IY
  FOUR F AO R
  FIVE F AY V
  SIX S IH K S
  SEVEN S EH V AH N
  EIGHT EY T
  NINE N AY N
  ZERO Z IH R OW
  ZERO(2) Z IY R OW
  OH OW
 

In the above example, the words "one" and "zero" have two pronunciations each.
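
As an illustration of the format, the following sketch groups pronunciations by word, treating a trailing "(n)" as an alternate-pronunciation marker. The class DictionaryFormatSketch and its methods baseWord and parse are hypothetical helpers written for this example, not part of the Sphinx-4 API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical parser for the dictionary format shown above. */
class DictionaryFormatSketch {

    /** Strips a trailing "(n)" variant marker, e.g. "ONE(2)" becomes "ONE". */
    static String baseWord(String entry) {
        int paren = entry.indexOf('(');
        return (paren > 0 && entry.endsWith(")")) ? entry.substring(0, paren) : entry;
    }

    /** Groups pronunciation strings by base word, in file order. */
    static Map<String, List<String>> parse(List<String> lines) {
        Map<String, List<String>> result = new LinkedHashMap<>();
        for (String line : lines) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                continue; // skip blank lines
            }
            // The word is everything up to the first run of spaces or tabs.
            String[] parts = trimmed.split("\\s+", 2);
            String pronunciation = parts.length > 1 ? parts[1] : "";
            result.computeIfAbsent(baseWord(parts[0]), k -> new ArrayList<>())
                  .add(pronunciation);
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("ONE HH W AH N", "ONE(2) W AH N", "TWO T UW");
        // prints {ONE=[HH W AH N, W AH N], TWO=[T UW]}
        System.out.println(parse(lines));
    }
}
```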


Field Summary
static java.lang.String PROP_ADDENDA
          The name of the SphinxProperty for the custom dictionary file paths.
 
Fields inherited from interface edu.cmu.sphinx.linguist.dictionary.Dictionary
PROP_ADD_SIL_ENDING_PRONUNCIATION, PROP_ALLOW_MISSING_WORDS, PROP_CREATE_MISSING_WORDS, PROP_DICTIONARY, PROP_FILLER_DICTIONARY, PROP_UNIT_MANAGER, PROP_WORD_REPLACEMENT, SENTENCE_END_SPELLING, SENTENCE_START_SPELLING, SILENCE_SPELLING
 
Constructor Summary
FastDictionary()
           
 
Method Summary
 void allocate()
          Allocates the dictionary
 void deallocate()
          Deallocates the dictionary
 void dump()
          Dumps this FastDictionary to System.out.
 java.net.URL getFillerDictionaryFile()
          Gets the filler dictionary file.
 Word[] getFillerWords()
          Gets the set of all filler words in the dictionary
 WordClassification[] getPossibleWordClassifications()
          Returns the set of all possible word classifications for this dictionary.
 Word getSentenceEndWord()
          Returns the sentence end word.
 Word getSentenceStartWord()
          Returns the sentence start word.
 Word getSilenceWord()
          Returns the silence word.
 Word getWord(java.lang.String text)
          Returns a Word object based on the spelling and its classification.
 java.net.URL getWordDictionaryFile()
          Gets the word dictionary file.
 void newProperties(PropertySheet ps)
          This method is called when this configurable component needs to be reconfigured.
 java.lang.String toString()
          Returns a string representation of this FastDictionary in alphabetical order.
 
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Field Detail

PROP_ADDENDA

@S4String(mandatory=false)
public static final java.lang.String PROP_ADDENDA
The name of the SphinxProperty for the custom dictionary file paths. This addenda property points to a possibly empty list of URLs to dictionary addenda. Each addendum should contain word pronunciations in the same Sphinx-3 dictionary format as the main dictionary. Words in an addendum are added after the words in the main dictionary and will override previously specified pronunciations. If you wish to extend the set of pronunciations for a particular word, add a new pronunciation by number. For example, in the following addendum, given that the aforementioned main dictionary is specified, the pronunciation for 'EIGHT' will be overridden by the addendum, while the pronunciations for 'SIX' and 'ZERO' will be augmented and a new pronunciation for 'ELEVEN' will be added.
          EIGHT   OW T
          SIX(2)  Z IH K S
          ZERO(3)  Z IY Rl AH
          ELEVEN   EH L EH V AH N
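
One way to picture the override and augment behavior is a map keyed by the full entry token, variant number included, into which addendum lines are inserted after the main dictionary. This is a sketch under that assumption, not the FastDictionary code; AddendaMergeSketch and loadInto are hypothetical names.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of applying an addendum on top of a main dictionary. */
class AddendaMergeSketch {

    /**
     * Stores each line under its full entry token (e.g. "SIX(2)").
     * A later line with the same token replaces the earlier one;
     * a line with a new token or a new variant number is added.
     */
    static void loadInto(Map<String, String> entries, List<String> lines) {
        for (String line : lines) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                continue; // skip blank lines
            }
            String[] parts = trimmed.split("\\s+", 2);
            entries.put(parts[0], parts.length > 1 ? parts[1] : "");
        }
    }

    public static void main(String[] args) {
        Map<String, String> entries = new LinkedHashMap<>();
        // Main dictionary first, then the addendum.
        loadInto(entries, Arrays.asList("SIX S IH K S", "EIGHT EY T"));
        loadInto(entries, Arrays.asList(
                "EIGHT   OW T",            // same token: overrides EIGHT
                "SIX(2)  Z IH K S",        // new variant: augments SIX
                "ELEVEN   EH L EH V AH N"  // new word: added
        ));
        System.out.println(entries);
    }
}
```

Under this assumption, an unnumbered addendum entry replaces the first pronunciation of a word already in the main dictionary, while an entry with an unused variant number leaves the existing pronunciations in place.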
 

See Also:
Constant Field Values
Constructor Detail

FastDictionary

public FastDictionary()
Method Detail

newProperties

public void newProperties(PropertySheet ps)
                   throws PropertyException
Description copied from interface: Configurable
This method is called when this configurable component needs to be reconfigured.

Specified by:
newProperties in interface Configurable
Parameters:
ps - a property sheet holding the new data
Throws:
PropertyException - if there is a problem with the properties.

getWordDictionaryFile

public java.net.URL getWordDictionaryFile()
Gets the word dictionary file.

Returns:
the URL of the word dictionary file

getFillerDictionaryFile

public java.net.URL getFillerDictionaryFile()
Gets the filler dictionary file.

Returns:
the URL of the filler dictionary file

allocate

public void allocate()
              throws java.io.IOException
Description copied from interface: Dictionary
Allocates the dictionary

Specified by:
allocate in interface Dictionary
Throws:
java.io.IOException - if there is trouble loading the dictionary

deallocate

public void deallocate()
Description copied from interface: Dictionary
Deallocates the dictionary

Specified by:
deallocate in interface Dictionary

getSentenceStartWord

public Word getSentenceStartWord()
Returns the sentence start word.

Specified by:
getSentenceStartWord in interface Dictionary
Returns:
the sentence start word

getSentenceEndWord

public Word getSentenceEndWord()
Returns the sentence end word.

Specified by:
getSentenceEndWord in interface Dictionary
Returns:
the sentence end word

getSilenceWord

public Word getSilenceWord()
Returns the silence word.

Specified by:
getSilenceWord in interface Dictionary
Returns:
the silence word

getWord

public Word getWord(java.lang.String text)
Returns a Word object based on the spelling and its classification. The behavior of this method is also affected by the properties wordReplacement, allowMissingWords, and createMissingWords.

Specified by:
getWord in interface Dictionary
Parameters:
text - the spelling of the word of interest.
Returns:
a Word object
See Also:
Word

getPossibleWordClassifications

public WordClassification[] getPossibleWordClassifications()
Returns the set of all possible word classifications for this dictionary.

Specified by:
getPossibleWordClassifications in interface Dictionary
Returns:
the set of all possible word classifications

toString

public java.lang.String toString()
Returns a string representation of this FastDictionary in alphabetical order.

Overrides:
toString in class java.lang.Object
Returns:
a string representation of this FastDictionary

getFillerWords

public Word[] getFillerWords()
Gets the set of all filler words in the dictionary

Specified by:
getFillerWords in interface Dictionary
Returns:
an array (possibly empty) of all filler words

dump

public void dump()
Dumps this FastDictionary to System.out.