java.lang.Object
  edu.cmu.sphinx.frontend.BaseDataProcessor
    edu.cmu.sphinx.frontend.endpoint.SpeechMarker
public class SpeechMarker
Converts a stream of SpeechClassifiedData objects, each marked as speech or non-speech, into a stream in which the regions considered speech are marked out. This is done by inserting SPEECH_START and SPEECH_END signals into the stream.
The algorithm for inserting the two signals is as follows.
The algorithm is always in one of two states: 'in-speech' or 'out-of-speech'. In the 'out-of-speech' state, it reads audio until it encounters audio that is speech. Once it has read more than 'startSpeech' of continuous speech, it considers that speech has started and inserts a SPEECH_START signal at 'speechLeader' time before the point where speech first started. The algorithm then changes to the 'in-speech' state.
In the 'in-speech' state, audio that is speech is output as it is read. If the audio is non-speech, the algorithm reads ahead until it has 'endSilence' of continuous non-speech. At that point it considers that speech has ended, and inserts a SPEECH_END signal at 'speechTrailer' time after the first non-speech audio. The algorithm then returns to the 'out-of-speech' state. If any speech audio is encountered in between, the accounting starts all over again.
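The two-state logic described above can be illustrated with a small, self-contained sketch. It is not the SpeechMarker source. It assumes fixed-length frames that carry only a speech/non-speech flag, and the class name SpeechMarkerSketch, the method mark, and the "SIGNAL@milliseconds" event strings are all hypothetical. The threshold values mirror the documented defaults of the four properties.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class SpeechMarkerSketch {
    // Thresholds in milliseconds, mirroring the documented property defaults.
    static final int START_SPEECH = 200;
    static final int END_SILENCE = 500;
    static final int SPEECH_LEADER = 100;
    static final int SPEECH_TRAILER = 100;

    /**
     * isSpeech[i] is true if frame i was classified as speech; every frame
     * is assumed to last frameMs milliseconds. Returns the inserted signals
     * as "SPEECH_START@t" / "SPEECH_END@t" strings, t in milliseconds.
     */
    static List<String> mark(boolean[] isSpeech, int frameMs) {
        List<String> events = new ArrayList<>();
        boolean inSpeech = false;
        // Start time of the current run of frames that could trigger a state
        // change: speech while out-of-speech, non-speech while in-speech.
        int runStart = -1;
        for (int i = 0; i < isSpeech.length; i++) {
            int frameStart = i * frameMs;
            boolean extendsRun = inSpeech ? !isSpeech[i] : isSpeech[i];
            if (!extendsRun) {
                runStart = -1; // run interrupted: the accounting starts over
                continue;
            }
            if (runStart < 0) {
                runStart = frameStart;
            }
            int runLength = frameStart + frameMs - runStart;
            if (!inSpeech && runLength > START_SPEECH) {
                // Speech has started: place the signal speechLeader earlier.
                events.add("SPEECH_START@" + Math.max(0, runStart - SPEECH_LEADER));
                inSpeech = true;
                runStart = -1;
            } else if (inSpeech && runLength >= END_SILENCE) {
                // Speech has ended: place the signal speechTrailer after the
                // first non-speech frame of the run.
                events.add("SPEECH_END@" + (runStart + SPEECH_TRAILER));
                inSpeech = false;
                runStart = -1;
            }
        }
        return events;
    }

    public static void main(String[] args) {
        // 300 ms of silence, 400 ms of speech, 600 ms of silence, 10 ms frames.
        boolean[] frames = new boolean[130];
        Arrays.fill(frames, 30, 70, true);
        System.out.println(mark(frames, 10)); // prints [SPEECH_START@200, SPEECH_END@800]
    }
}
```

In this example speech begins at 300 ms, so the start signal lands at 200 ms (300 minus the 100 ms leader), and the end signal lands at 800 ms (the first non-speech frame at 700 ms plus the 100 ms trailer). The sketch only computes signal positions; a real processor would also have to buffer data so that a signal can be placed before frames it has already read.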
Field Summary | |
---|---|
static java.lang.String | PROP_END_SILENCE: The SphinxProperty for the amount of time in silence (in milliseconds) to be considered as utterance end. |
static java.lang.String | PROP_SPEECH_LEADER: The SphinxProperty for the amount of time (in milliseconds) before speech start to be included as speech data. |
static java.lang.String | PROP_SPEECH_TRAILER: The SphinxProperty for the amount of time (in milliseconds) after speech ends to be included as speech data. |
static java.lang.String | PROP_START_SPEECH: The SphinxProperty for the minimum amount of time in speech (in milliseconds) to be considered as utterance start. |
Constructor Summary | |
---|---|
SpeechMarker() | |
Method Summary | |
---|---|
int | getAudioTime(edu.cmu.sphinx.frontend.endpoint.SpeechClassifiedData audio): Returns the amount of audio data in milliseconds in the given SpeechClassifiedData object. |
Data | getData(): Returns the next Data object. |
void | initialize(): Initializes this SpeechMarker. |
void | newProperties(PropertySheet ps): This method is called when this configurable component needs to be reconfigured. |
Methods inherited from class edu.cmu.sphinx.frontend.BaseDataProcessor |
---|
getPredecessor, getTimer, setPredecessor, toString |
Methods inherited from class java.lang.Object |
---|
equals, getClass, hashCode, notify, notifyAll, wait, wait, wait |
Field Detail |
---|
@S4Integer(defaultValue=200) public static final java.lang.String PROP_START_SPEECH
@S4Integer(defaultValue=500) public static final java.lang.String PROP_END_SILENCE
@S4Integer(defaultValue=100) public static final java.lang.String PROP_SPEECH_LEADER
@S4Integer(defaultValue=100) public static final java.lang.String PROP_SPEECH_TRAILER
Constructor Detail |
---|
public SpeechMarker()
Method Detail |
---|
public void newProperties(PropertySheet ps) throws PropertyException
Specified by: newProperties in interface Configurable
Overrides: newProperties in class BaseDataProcessor
Parameters: ps - a property sheet holding the new data
Throws: PropertyException - if there is a problem with the properties.

public void initialize()
Specified by: initialize in interface DataProcessor
Overrides: initialize in class BaseDataProcessor

public Data getData() throws DataProcessingException
Specified by: getData in interface DataProcessor
Overrides: getData in class BaseDataProcessor
Throws: DataProcessingException - if a data processing error occurs

public int getAudioTime(edu.cmu.sphinx.frontend.endpoint.SpeechClassifiedData audio)
Parameters: audio - the SpeechClassifiedData object
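
As a rough illustration of what a duration in milliseconds means for a block of audio, the sketch below derives it from a sample count and a sample rate. This is an assumption about how such a value could be computed, not the actual body of getAudioTime. The class name AudioTimeSketch, the method audioTimeMs, and both parameters are hypothetical.

```java
public class AudioTimeSketch {
    /**
     * Duration, in whole milliseconds, of sampleCount samples recorded at
     * sampleRateHz samples per second. Fractions of a millisecond are dropped.
     */
    static int audioTimeMs(int sampleCount, int sampleRateHz) {
        if (sampleRateHz <= 0) {
            throw new IllegalArgumentException("sampleRateHz must be positive");
        }
        // Multiply in long arithmetic so large sample counts do not overflow.
        return (int) (sampleCount * 1000L / sampleRateHz);
    }

    public static void main(String[] args) {
        System.out.println(audioTimeMs(160, 16000)); // prints 10
    }
}
```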