Reply To: Pocketsphinx 'heard' different to received hypothesis



Thanks for the reply.

From my understanding of OpenEars, I think that when you speak a sentence the loop detects speech, waits until the end of speech is detected, and then passes the hypothesis to the receiving method. I also think you can only apply one vocabulary to one phrase at a time, switching only after receiving the hypothesis and then checking the hypothesis for words that determine which vocabulary change needs to happen.

So it would be like:

phrase is spoken,
hypothesis is received,
vocab to switch to is determined based on hypothesis,
vocab is switched,
user speaks another phrase to be worked on by new vocabulary,
new words are given in hypothesis,
world keeps spinning.
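The steps above could be sketched like this (a hedged, language-agnostic sketch in Python, not OpenEars API code; `next_vocab` and the vocab names are my own stand-ins for whatever logic runs inside `pocketsphinxDidReceiveHypothesis:`):

```python
# Sketch of "switch vocab based on the last hypothesis".
# The vocab lists mirror the example later in this post.
COMMANDS = ["locate", "find", "bread", "wine", "fruit"]
SUB_VOCABS = {
    "bread": ["brown", "white", "wholegrain"],
    "wine":  ["red", "white", "pink"],
    "fruit": ["bananas", "kiwis", "pineapples"],
}

def next_vocab(hypothesis):
    """Decide which vocabulary the *next* spoken phrase should use."""
    for word in hypothesis.split():
        if word in SUB_VOCABS:        # a category word was heard
            return SUB_VOCABS[word]   # switch to that sub-vocabulary
    return COMMANDS                   # otherwise stay on the command vocab
```

The key point this illustrates: the switch only takes effect for the *next* phrase, which is why the user has to speak twice.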

I've built a test app that works great this way, but the user is required to speak more than once to get the desired result. What I am trying to do is have the user say only one phrase, which is then used to generate multiple hypotheses using several different vocabularies.

pseudo example:

commands = [locate, find, bread, wine, fruit]
breadvocab = [brown, white, wholegrain]
fruitvocab = [bananas, kiwis, pineapples]
winevocab = [red, white, pink]

user says – “locate white wine”

phrase is recorded into wav at wavPath

commands vocab is used for first hypothesis generated from [runRecOnWavAt:wavPath with:commandVocabPath]
hypothesis = “locate wine”

vocab is now changed to winevocab and we [runRecOnWavAt:wavPath with:wineVocabPath]
hypothesis = “white” (or pink white white as that phrase usually generates)

From the one spoken phrase, two vocabs were used to work out that the user wanted to locate the white wine aisle. In my head this can't be done with the one spoken phrase that the listening loop picks up, because only one vocab is used per phrase/hypothesis? So by using a recorded WAV I can check it multiple times with multiple grammars.
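The two-pass idea could be sketched like this (again a hedged Python sketch; `run_rec` stands in for the Objective-C `runRecOnWavAt:with:`-style call, and `fake_run_rec_on_wav` is a fake decoder I made up so the routing logic itself can be shown):

```python
# Two passes over the same saved WAV, one per vocabulary.
COMMANDS = ["locate", "find", "bread", "wine", "fruit"]
SUB_VOCABS = {
    "bread": ["brown", "white", "wholegrain"],
    "wine":  ["red", "white", "pink"],
    "fruit": ["bananas", "kiwis", "pineapples"],
}

def fake_run_rec_on_wav(wav_path, vocab):
    # Placeholder decoder: pretend it returns only in-vocabulary words.
    spoken = "locate white wine".split()
    return " ".join(w for w in spoken if w in vocab)

def two_pass(wav_path, run_rec=fake_run_rec_on_wav):
    first = run_rec(wav_path, COMMANDS)            # e.g. "locate wine"
    for word in first.split():
        if word in SUB_VOCABS:                     # found a category word
            second = run_rec(wav_path, SUB_VOCABS[word])
            return first, second                   # e.g. ("locate wine", "white")
    return first, None                             # no sub-vocab needed
```

The design point: because the audio is persisted to a WAV, the decoder can be pointed at it repeatedly with a different grammar each time, instead of getting exactly one vocab per live utterance.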

Am I missing something? Can the listening loop check against multiple grammars without having to save the spoken phrase to a WAV?