force speech analysis to begin??

This topic has 9 replies, 2 voices, and was last updated 11 years, 7 months ago by sysco.

Author

Posts
September 2, 2012 at 2:18 am #10906

sysco
Participant

Is there a way I can force the speech analysis to begin? I would like to ignore the silence level and analyze the audio over a given period of time (from when speech begins until my NSTimer fires).
By setting secondsOfSilenceToDetect to zero? Or by adjusting the audio session?

thanks!

September 2, 2012 at 9:18 am #10907

Halle Winkler
Politepix

It’s only possible to do voice audio detection recognition with OpenEars on recordings using its driver. What you could try is to make a WAV recording of the speech and then submit it at the end to the method runRecognitionOnWavFileAtPath:usingLanguageModelAtPath:dictionaryAtPath:languageModelIsJSGF:

September 2, 2012 at 2:35 pm #10908

sysco
Participant

Thanks for your fast reply…

So it’s NOT possible to do something like this?

-startListeningWithLanguageModelAtPath:…..

- (void)pocketsphinxDidStartListening {
    // start an NSTimer that calls timesUp: when the interval ends
}

- (void)timesUp:(NSTimer *)timer {
    pocketsphinxController.secondsOfSilenceToDetect = 0;
    // OR
    // mute the audio here…
}

September 2, 2012 at 2:41 pm #10909

Halle Winkler
Politepix

You can’t do that with secondsOfSilenceToDetect. You can fake the first part by immediately suspending listening when listening begins (using the relevant OpenEarsEventsObserver callbacks) and then unsuspending it when you want to begin your arbitrary interval. But there is no way to force recognition/avoid voice audio detection submitting recognition in its own time.

My first suggestion would probably work very similarly to your wish though — instead of starting up recognition and then starting a timer, start a timer that starts an AVAudioRecorder and when your timer runs out, submit the PCM audio to runRecognitionOnWavFileAtPath. It should be functionally the same as what you want as far as I can tell.
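For illustration, the record-then-submit approach could look roughly like the sketch below. It is only a sketch under assumptions: the class name TimedRecognizer, its properties, the temporary file name, the header import path, and the 16 kHz mono 16-bit PCM settings are all hypothetical choices that do not come from this thread, and the app's audio session is assumed to already allow recording. AVAudioRecorder's recordForDuration: stands in for the separate NSTimer.

```objc
#import <Foundation/Foundation.h>
#import <AVFoundation/AVFoundation.h>
#import <OpenEars/PocketsphinxController.h> // assumed header path; adjust to your OpenEars install

// Hypothetical helper: records a fixed interval to a WAV file, then submits it.
@interface TimedRecognizer : NSObject <AVAudioRecorderDelegate>
@property (nonatomic, strong) AVAudioRecorder *recorder;
@property (nonatomic, strong) PocketsphinxController *pocketsphinxController;
@property (nonatomic, copy) NSString *languageModelPath; // supplied by the caller
@property (nonatomic, copy) NSString *dictionaryPath;    // supplied by the caller
@end

@implementation TimedRecognizer

// Start recording; the recorder stops by itself after `seconds`.
- (BOOL)recordForSeconds:(NSTimeInterval)seconds {
    NSString *wavPath = [NSTemporaryDirectory()
        stringByAppendingPathComponent:@"utterance.wav"]; // hypothetical file name

    // Assumed format: 16 kHz, mono, 16-bit little-endian integer PCM.
    NSDictionary *settings = @{
        AVFormatIDKey:            @(kAudioFormatLinearPCM),
        AVSampleRateKey:          @16000.0,
        AVNumberOfChannelsKey:    @1,
        AVLinearPCMBitDepthKey:   @16,
        AVLinearPCMIsBigEndianKey: @NO,
        AVLinearPCMIsFloatKey:    @NO
    };

    NSError *error = nil;
    self.recorder = [[AVAudioRecorder alloc] initWithURL:[NSURL fileURLWithPath:wavPath]
                                                settings:settings
                                                   error:&error];
    if (!self.recorder) {
        NSLog(@"Could not create recorder: %@", error);
        return NO;
    }
    self.recorder.delegate = self;
    return [self.recorder recordForDuration:seconds];
}

// Called by AVFoundation once the fixed interval has elapsed.
- (void)audioRecorderDidFinishRecording:(AVAudioRecorder *)recorder
                           successfully:(BOOL)flag {
    if (!flag) {
        return; // recording failed; nothing to submit
    }
    // Submit the finished recording using the method named earlier in this thread.
    [self.pocketsphinxController runRecognitionOnWavFileAtPath:recorder.url.path
                                      usingLanguageModelAtPath:self.languageModelPath
                                              dictionaryAtPath:self.dictionaryPath
                                           languageModelIsJSGF:NO];
}

@end
```

With something like this, calling recordForSeconds: with a value under 5 would give the fixed listening window, and the recognition result would presumably arrive through whatever OpenEarsEventsObserver callbacks the app already uses.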

September 2, 2012 at 2:42 pm #10910

Halle Winkler
Politepix

(I should say: I don’t expect that there is any way, which doesn’t mean that it’s impossible, just that my educated guess is any workaround will lead to more problems down the road than it solves right now).

September 2, 2012 at 3:09 pm #10911

sysco
Participant

I will experiment with both strategies. I looked at the source briefly and it seems disk I/O might be an issue… but other than that, the performance would be the same. Right?
Of course, in the end RapidEars looks great too. Thanks again…

September 2, 2012 at 3:16 pm #10912

Halle Winkler
Politepix

What specifically do you think would be an I/O issue?

September 2, 2012 at 4:21 pm #10913

sysco
Participant

I was thinking if I called -runRecognitionOnWavFileAtPath: I might have performance issues creating NSData from the file path.

//from ContinuousModel.mm
NSData *originalWavData = [NSData dataWithContentsOfFile:wavPath];

I’m considering using a timer for <5 seconds of audio.

Simply, I want to require the user to say something in a given time interval. I don't want to wait for silence to occur.

Thoughts?

September 2, 2012 at 4:36 pm #10914

Halle Winkler
Politepix

I don’t see that as a big performance issue for audio of that length.

September 2, 2012 at 4:47 pm #10915

sysco
Participant

Great! thanks for your help.