AVFoundation and STT (Speech-To-Text) Recognition

    #1015710
    marco
    Participant

    Hi Halle,

    I was thinking of making a camera control that starts video recording when I say “GO” and stops recording when I say “STOP”. I’ve read this post, and its author was using UIImagePickerController. I’m using AVCaptureSession. The thing is that when PocketsphinxController starts listening, the preview layer of the AVCaptureSession freezes. PocketsphinxController itself works, BTW. It’s able to listen to my voice. Is there something I’ve missed or done wrong?

    #1015743
    marco
    Participant

    Hi Halle,

    I’ve been looking around the classes of the framework. Regarding the AudioSessionManager class, is there a way to make it use a shared audio session? I think this would solve the problem I’m facing right now.

    #1015746
    Halle Winkler
    Politepix

    Hi Marco,

    In fact, all audio sessions are shared because the audio session is a singleton. OpenEars’ AudioSessionManager is also a singleton. So wherever you address it from, the results will be shared across the app and through the shared audio session because they always go to the same AudioSessionManager and the same audio session. Does that make sense? Or was the question about sharing in a different sense?
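
    To make the singleton point concrete, here is a minimal sketch using AVAudioSession. It assumes the app talks to the audio session through AVFoundation; the function name and the two variable names are made up for illustration and are not part of OpenEars.

```objc
#import <AVFoundation/AVFoundation.h>

// Hypothetical helper: checks that a category set through one reference to the
// audio session is visible through another reference obtained elsewhere.
static BOOL SharedSessionSeesCategoryChange(void) {
    // Two lookups, as if made from unrelated parts of the app.
    AVAudioSession *fromCameraCode = [AVAudioSession sharedInstance];
    AVAudioSession *fromSpeechCode = [AVAudioSession sharedInstance];

    NSError *error = nil;
    if (![fromSpeechCode setCategory:AVAudioSessionCategoryPlayAndRecord error:&error]) {
        NSLog(@"Could not set the audio session category: %@", error);
        return NO;
    }

    // Both references point at the same object, so the change is shared.
    return fromCameraCode == fromSpeechCode &&
           [fromCameraCode.category isEqualToString:AVAudioSessionCategoryPlayAndRecord];
}
```

    Because there is only one session object, whichever component configures it last determines what every other component sees.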

    #1015750
    marco
    Participant

    I think I’ve found a solution of sorts. I removed the audio input from the AVCaptureSession and that did it. I think the audio input of the AVCaptureSession gets “overwritten” by OpenEars. Still looking for another solution, though. Thanks Halle. Brilliant work on OpenEars. Been using it since before version 1 came out.
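
    For anyone who wants to try the same workaround, a rough sketch of a capture session that adds only a camera input might look like the following. The function name is hypothetical, and this is just one way to set it up, not necessarily the exact code used above.

```objc
#import <AVFoundation/AVFoundation.h>

// Hypothetical helper: builds a capture session with a camera input only, so
// the microphone is left to the speech recognizer.
// Returns nil if no camera is available or the input cannot be added.
static AVCaptureSession *MakeVideoOnlyCaptureSession(NSError **outError) {
    AVCaptureSession *session = [[AVCaptureSession alloc] init];

    AVCaptureDevice *camera = [AVCaptureDevice defaultDeviceWithMediaType:AVMediaTypeVideo];
    if (camera == nil) {
        return nil;
    }

    AVCaptureDeviceInput *videoInput =
        [AVCaptureDeviceInput deviceInputWithDevice:camera error:outError];
    if (videoInput == nil || ![session canAddInput:videoInput]) {
        return nil;
    }
    [session addInput:videoInput];

    // Deliberately no AVMediaTypeAudio input is added here.
    return session;
}
```

    The trade-off is that a session with no audio input captures no sound, so any movie recorded from it will be silent.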

    #1015754
    Halle Winkler
    Politepix

    Thanks!

    #1018257
    salvarez
    Participant

    I just started working with OpenEars and am very impressed! It was simple to train and to get up and running.

    I want to do something similar to what Marco described and have been digging into the AVFoundation interfaces a little. I was wondering if it would be possible to subclass the AVCaptureDeviceInput class, intercept the audio from the AVCaptureInput class (see the input property of AVCaptureInputPort), process the STT, and then pass the stream on.

    We could then replace the existing addInput call on the session with [session addInput:<newSubclass>]
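
    As a point of comparison, AVFoundation already exposes captured audio through AVCaptureAudioDataOutput, without subclassing the input. The sketch below swaps in that technique instead of the subclass idea; the AudioTap class and the queue label are hypothetical. It still requires an audio input on the session, so whether it avoids the freeze described earlier in this thread is untested, and handing the buffers to OpenEars is left as a placeholder because nothing in this thread establishes that OpenEars accepts externally supplied audio.

```objc
#import <AVFoundation/AVFoundation.h>

// Hypothetical class: receives audio sample buffers from a capture session.
@interface AudioTap : NSObject <AVCaptureAudioDataOutputSampleBufferDelegate>
- (BOOL)attachToSession:(AVCaptureSession *)session;
@end

@implementation AudioTap

- (BOOL)attachToSession:(AVCaptureSession *)session {
    AVCaptureAudioDataOutput *audioOutput = [[AVCaptureAudioDataOutput alloc] init];

    // Buffers are delivered on this serial queue, off the main thread.
    dispatch_queue_t queue = dispatch_queue_create("audio.tap.queue", DISPATCH_QUEUE_SERIAL);
    [audioOutput setSampleBufferDelegate:self queue:queue];

    if (![session canAddOutput:audioOutput]) {
        return NO;
    }
    [session addOutput:audioOutput];
    return YES;
}

- (void)captureOutput:(AVCaptureOutput *)output
didOutputSampleBuffer:(CMSampleBufferRef)sampleBuffer
       fromConnection:(AVCaptureConnection *)connection {
    // Placeholder: this is where the audio could be inspected or forwarded.
    CMItemCount sampleCount = CMSampleBufferGetNumSamples(sampleBuffer);
    NSLog(@"Received an audio buffer with %ld samples", (long)sampleCount);
}

@end
```

    The caller needs to keep the AudioTap instance alive (for example in a property) for as long as the session runs, since the output does not retain its delegate.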
