Button to enable/disable recording

    #1019882
    andrew
    Participant

    Hi Halle,

    Thank you for your awesome work on OpenEars. It is turning out to be quite a powerful library.

    For our app, we need to allow the user to enable and disable recording by tapping a button. Is there an easy way to achieve this, without having to re-initialize pocketsphinx each time they tap record?

    We want the recognition to start immediately after they tap “Record”, and to stop immediately when they tap “Stop”.

    Cheers!

    #1019888
    Halle Winkler
    Politepix

    Welcome Andrew,

    There is no built-in support for this. However, I think a very easy way to simulate it would be to handle your own start/stop recording with the audio recording API of your choice (that has certainly gotten easier lately), and then simply submit a WAV file to:

    - (void) runRecognitionOnWavFileAtPath:(NSString *)wavPath usingLanguageModelAtPath:(NSString *)languageModelPath dictionaryAtPath:(NSString *)dictionaryPath acousticModelAtPath:(NSString *)acousticModelPath languageModelIsJSGF:(BOOL)languageModelIsJSGF

    #1019890
    andrew
    Participant

    Thank you for the fast response!

    That would certainly work. We would like to be able to immediately start live recognition as soon as the user taps the record button. In other words, we would like to replace the voice activity detector with a button.

    Is this possible?
    Any suggestions?

    #1019891
    Halle Winkler
    Politepix

    You’re welcome! That seems possible; see my suggestion above. I think that by using AVAudioRecorder and its prepareToRecord method you could handle all of your own start/stop and WAV-packaging requirements in around 20 lines of code. Give its docs a look.

    #1019900
    andrew
    Participant

    The problem with that is the speed. Our user wants to tap and immediately speak and get text immediately. If we record to a file, we can’t start transcription until the file is closed.

    We want live recognition (like RapidEars) while the user speaks, but with a button to start and stop the microphone.

    #1019901
    andrew
    Participant

    I guess we can short-circuit the render callback so that it returns early if the user hasn’t tapped Record. Does that seem like a good option?
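For readers following along, the short-circuit idea can be sketched in plain C (which also compiles as Objective-C). This is only an illustration of the gating logic, not OpenEars code: `MicGate`, `gate_set_recording` and `gate_process` are hypothetical names, and a real Core Audio render callback has a different signature.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical gate state shared between the UI and the audio thread. */
typedef struct {
    atomic_bool recording; /* true while the user has "Record" active */
} MicGate;

/* Called from the button handler (UI thread). */
void gate_set_recording(MicGate *gate, bool on)
{
    assert(gate != NULL);
    atomic_store(&gate->recording, on);
}

/* Stand-in for the body of a render callback (audio thread).
   Copies `frames` samples from `in` to `out` only while the gate is open.
   Returns the number of samples passed on; 0 means "short-circuited". */
size_t gate_process(MicGate *gate, const int16_t *in, size_t frames,
                    int16_t *out)
{
    assert(gate != NULL && in != NULL && out != NULL);
    if (!atomic_load(&gate->recording)) {
        return 0; /* user has not tapped Record: drop this buffer */
    }
    memcpy(out, in, frames * sizeof *in);
    return frames;
}
```

The flag is atomic because the button handler and the audio callback typically run on different threads; the callback itself does no locking or allocation.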

    #1019904
    Halle Winkler
    Politepix

    Hmm. If you want to do this with RapidEars, I guess the following is possible:

    1. Suspend immediately once listening has started. The user “start” interaction will cause recognition to resume. This has the same effect as your short-circuiting idea but uses the API.

    2. The user “stop” interaction causes buffers of prerecorded non-speech (it has to be a real, quiet, low-noise non-speech recording and not just zeroes, which the VAD will correctly ignore) to be written over the render callback’s buffer for a number of callbacks equal to 0.7 seconds (that’s about 6 callbacks). This should result in a natural exit from the silence detection loop as the silence makes its way into the VAD. This delay shouldn’t be a big deal for you, since live recognition will already have been running and displaying results from the moment the mic stream began, so this is just a formality to get the final hypothesis.

    This should work to start and stop in both PocketsphinxController and PocketsphinxController+RapidEars.
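The buffer-overwriting part of step 2 could look roughly like the following plain C sketch. It is not taken from OpenEars: `SilenceTail`, `tail_begin`, `tail_apply` and `callbacks_for_tail` are hypothetical names, and the 16 kHz sample rate and 2048 frames per callback used in the usage note are assumptions chosen only because they reproduce the "about 6 callbacks" figure above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Number of callbacks needed to cover `tail_ms` of audio, rounded up. */
unsigned callbacks_for_tail(unsigned sample_rate_hz,
                            unsigned frames_per_callback, unsigned tail_ms)
{
    assert(frames_per_callback > 0);
    unsigned long tail_frames =
        ((unsigned long)sample_rate_hz * tail_ms + 999UL) / 1000UL;
    return (unsigned)((tail_frames + frames_per_callback - 1UL) /
                      frames_per_callback);
}

/* Hypothetical state for overwriting rendered buffers with quiet audio. */
typedef struct {
    const int16_t *quiet;    /* prerecorded low-noise non-speech, not zeroes */
    size_t quiet_len;        /* number of samples in `quiet`, must be > 0 */
    size_t quiet_pos;        /* read position, wraps around */
    unsigned callbacks_left; /* callbacks still to be overwritten */
} SilenceTail;

/* Called when the user taps "Stop". */
void tail_begin(SilenceTail *t, unsigned callbacks)
{
    assert(t != NULL);
    t->quiet_pos = 0;
    t->callbacks_left = callbacks;
}

/* Called on every rendered buffer. While a tail is pending, replaces the
   buffer contents with the quiet recording (looping it if necessary).
   Returns 1 if the buffer was overwritten, 0 if it was left untouched. */
int tail_apply(SilenceTail *t, int16_t *buf, size_t frames)
{
    assert(t != NULL && buf != NULL && t->quiet != NULL && t->quiet_len > 0);
    if (t->callbacks_left == 0) {
        return 0;
    }
    for (size_t i = 0; i < frames; i++) {
        buf[i] = t->quiet[t->quiet_pos];
        t->quiet_pos = (t->quiet_pos + 1) % t->quiet_len;
    }
    t->callbacks_left--;
    return 1;
}
```

Under those assumed values, `callbacks_for_tail(16000, 2048, 700)` gives 6, so the stop handler would call `tail_begin(&tail, 6)` and the callback would call `tail_apply` on each buffer before it reaches the VAD.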
