RapidEars ignoring secondsOfSilenceToDetect

    #1020739
    morchella
    Participant

    It appears that when setFinalizeHypothesis = FALSE, RapidEars ignores the setting for secondsOfSilenceToDetect. Is this a bug? I don’t need the finalized hypothesis, but I do need to control how long pocketsphinxController waits before considering the utterance ended.

    #1020742
    Halle Winkler
    Politepix

    Hello,

    RapidEars doesn’t use secondsOfSilenceToDetect. Both live and finalized hypotheses are delivered as soon as they are available.

    #1020743
    morchella
    Participant

    Sorry if I was unclear. There is a rapidEarsDidDetectEndOfSpeech delegate method which appears to always be called 50-300ms after pocketsphinxDidDetectFinishedSpeech. I have observed that these end of speech callbacks are sensitive to secondsOfSilenceToDetect, but only when setFinalizeHypothesis = TRUE.

    Is this the intended behavior?

    #1020744
    Halle Winkler
    Politepix

    That’s correct. Finalizing the hypothesis means that the end of speech is detected and reported. Using live recognition only means that there is no wait for a pause, and no consequent callback when a pause is detected, because a pause at the end of an utterance isn’t used by the engine for determining the hypothesis.

    #1020745
    morchella
    Participant

    Yes, but even when setFinalizeHypothesis = FALSE, the end of speech is still detected and reported. It’s just that the config option (secondsOfSilenceToDetect) is now ignored.

    I can work around it, but it seems like incorrect behavior.

    #1020749
    Halle Winkler
    Politepix

    I hear what you’re saying, but it is by design. Live mode does not do any kind of waiting for a pause in order to derive state, so if the engine is only operating in live mode, it uses its own logic to determine when is a good time to call an utterance over so that continuous recognition is able to proceed without notable pauses or skips. secondsOfSilenceToDetect and rapidEarsDidDetectEndOfSpeech both refer to pause detection, which isn’t a feature of live mode. It could be documented better, I agree.

    I think that if I wanted to use live mode hypotheses and non-live mode utterance logic I’d probably turn finalize on and just ignore its hyp output. The overhead isn’t that heavy.
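
    A rough sketch of that pattern, written as a plain Python model rather than actual OpenEars code: the class and method names below are hypothetical stand-ins for the live-hypothesis, finalized-hypothesis and end-of-speech callbacks, not the plugin's API. It assumes finalize is switched on purely so that pause detection stays active, and that the app simply drops the finalized text.

```python
class LiveOnlyListener:
    """Uses live hypotheses, ignores finalized ones, keeps end-of-speech events.

    All names here are hypothetical; they only model the callbacks
    discussed in this thread.
    """

    def __init__(self):
        self.live_hypotheses = []  # live hypotheses for the current utterance
        self.utterances = []       # last live hypothesis of each ended utterance

    def on_live_hypothesis(self, text):
        # Live hypotheses are what the app actually consumes.
        self.live_hypotheses.append(text)

    def on_finished_hypothesis(self, text):
        # Finalize is on only for its pause detection; discard its output.
        pass

    def on_end_of_speech(self):
        # Pause detected: close the utterance using the latest live hypothesis.
        if self.live_hypotheses:
            self.utterances.append(self.live_hypotheses[-1])
        self.live_hypotheses = []


# Simulated callback sequence for one utterance.
listener = LiveOnlyListener()
listener.on_live_hypothesis("go")
listener.on_live_hypothesis("go left")
listener.on_finished_hypothesis("go left now")  # ignored
listener.on_end_of_speech()
```

    The end-of-speech handler doesn’t depend on whether the finalized hypothesis arrives before or after it, since that hypothesis is never read.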

    #1020764
    morchella
    Participant

    Okay, thanks Halle, I appreciate the advice!
