Reply To: OpenEars detects multiple words during silence


#1019301
Halle Winkler
Politepix

Ah, OK, I think this shouldn’t be too hard to solve. Your suspicion that it is about when to suspend and resume is correct: this is just a matter of finding the right event to trigger each, and currently your speech and sounds are being picked up by the recognition.

The way this normally works structurally is that you have control over the moment that you begin playback of some other sound, so you suspend right before sending that message. That is, before you send the message [self.myAudioPlayer play], you first send the message [self.pocketsphinxController suspendRecognition] so that you know listening isn’t happening when your audio object starts playing back. This is pretty straightforward because both of these events are under your active control.
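To make that concrete, here is a minimal sketch of that ordering (the playSound method name and the myAudioPlayer property are just examples, not anything from your code):

- (void)playSound {
    // Stop listening first so the recognizer doesn't hear the playback.
    [self.pocketsphinxController suspendRecognition];
    // Recognition is now suspended, so it's safe to start the audio.
    [self.myAudioPlayer play];
}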

Resume is only a bit more complex. Unlike the first case, where you are actively saying, or more specifically messaging, “play the audio now” or “start the speech now”, here you have to wait for a callback of some kind to tell you that the audio or speech is complete, because you don’t control the moment of its completion; the audio or speech object controls that.

When we talk about callbacks, we’re basically talking about code that we passively wait on to inform us that something important happened, rather than making it happen ourselves with an active message to an object. You can imagine that you made a phone call to the audio player object and left a message saying “play the audio file, and please call me back once you’ve finished playing it so I can decide what to do next”, and that is what happens: instead of you making a second phone call to the audio player and saying “be done with the audio file now”, you receive a phone call that says “I’m done with the audio file”.

In Objective-C callbacks take the form of delegate methods: http://stackoverflow.com/questions/1045803/how-does-a-delegate-work-in-objective-c

Very briefly, a delegate method is a method that is called on behalf of some other object. In this case, specifically, it is a method that you can implement in your view controller, by making your view controller a delegate of your audio object, in order to know when the audio object is done playing back. Here is a Stack Overflow question that is specifically about implementing the AVAudioPlayer delegate method - (void)audioPlayerDidFinishPlaying:(AVAudioPlayer *)player successfully:(BOOL)flag, which tells you when playback is complete:

http://stackoverflow.com/questions/8343402/avaudioplayer-delegate-wont-get-called-in-a-class-method
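Putting the two halves together, a sketch of how this can look (the class and property names are assumptions; the important parts are setting the delegate, suspending before play, and resuming in the callback):

// In your view controller's header, declare that you implement the protocol:
// @interface MyViewController : UIViewController <AVAudioPlayerDelegate>

- (void)playSound {
    self.myAudioPlayer.delegate = self; // so the finished callback reaches us
    [self.pocketsphinxController suspendRecognition]; // stop listening first
    [self.myAudioPlayer play];
}

// AVAudioPlayer sends this delegate message when playback ends, so this is
// the safe place to start listening again.
- (void)audioPlayerDidFinishPlaying:(AVAudioPlayer *)player successfully:(BOOL)flag {
    [self.pocketsphinxController resumeRecognition];
}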

This will be structurally similar for the TTS methods. Once you have a working callback for knowing when speech or audio is finished, you also have a place where you can confidently put the resumeRecognition call.
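For the TTS side, if you are using OpenEars’ FliteController, the finished-speaking callback arrives through OpenEarsEventsObserver rather than through AVAudioPlayer. A minimal sketch, assuming your view controller is already set up as an OpenEarsEventsObserver delegate the way the sample app does it:

// OpenEarsEventsObserver delegate callback: sent when Flite finishes
// speaking, so it is the analogous place to resume listening.
- (void)fliteDidFinishSpeaking {
    [self.pocketsphinxController resumeRecognition];
}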