AVFoundation and STT (Speech-To-Text) Recognition

Tagged: AVFoundation, speech to text, STT, Video Recording

This topic has 5 replies, 3 voices, and was last updated 10 years, 7 months ago by salvarez.

Author

Posts
February 22, 2013 at 10:37 am #1015710

marco
Participant

Hi Halle,

I was thinking of making a voice-controlled camera: video recording would start when I say “GO” and stop when I say “STOP”. I’ve read this post, where the author was using UIImagePickerController, but I’m using AVCaptureSession. The problem is that when PocketsphinxController starts listening, the preview layer of the AVCaptureSession freezes. PocketsphinxController itself works, by the way; it’s able to hear my voice. Is there something I’ve missed or done wrong?

February 26, 2013 at 5:01 pm #1015743

marco
Participant

Hi Halle,

I’ve been looking around the classes of the framework. Regarding the AudioSessionManager class, is there a way to make it use a shared audio session? I think that would solve the problem I’m facing right now.

February 26, 2013 at 6:41 pm #1015746

Halle Winkler
Politepix

Hi Marco,

In fact, all audio sessions are shared because the audio session is a singleton. OpenEars’ AudioSessionManager is also a singleton. So wherever you address it from, the results will be shared across the app and through the shared audio session because they always go to the same AudioSessionManager and the same audio session. Does that make sense? Or was the question about sharing in a different sense?

February 27, 2013 at 4:31 am #1015750

marco
Participant

I think I’ve found a solution of sorts. I removed the audio input from the AVCaptureSession and that did it. I think the audio input of the AVCaptureSession gets “overwritten” by OpenEars. I’m still looking for another solution, though. Thanks, Halle. Brilliant work on OpenEars. I’ve been using it since before version 1 came out.
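
For readers following along, here is a minimal sketch of the workaround described above: removing any audio device input from an AVCaptureSession so that only video is captured. The helper name RemoveAudioInputs is hypothetical and is not part of OpenEars or AVFoundation, and this is not Marco's actual code, only one way it could look.

```objective-c
#import <AVFoundation/AVFoundation.h>

// Hypothetical helper (name is an assumption, not from the thread):
// removes every audio device input from the given capture session.
static void RemoveAudioInputs(AVCaptureSession *session) {
    [session beginConfiguration];
    // Iterate over a copy, because removeInput: changes session.inputs.
    for (AVCaptureInput *input in [session.inputs copy]) {
        if ([input isKindOfClass:[AVCaptureDeviceInput class]]) {
            AVCaptureDevice *device = [(AVCaptureDeviceInput *)input device];
            if ([device hasMediaType:AVMediaTypeAudio]) {
                [session removeInput:input];
            }
        }
    }
    [session commitConfiguration];
}
```

With no audio input on the session, a movie recorded from it would have no audio track, which is probably why this only counts as a solution "of sorts".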

February 27, 2013 at 9:45 am #1015754

Halle Winkler
Politepix

Thanks!

September 7, 2013 at 9:34 pm #1018257

salvarez
Participant

I just started working with OpenEars and am very impressed! It was simple to train and to get up and running.

I want to do something similar to Marco and have been digging into the AVFoundation interfaces a little. I was wondering if it would be possible to subclass AVCaptureDeviceInput, intercept the audio from the AVCaptureInput (see the AVCaptureInputPort input property), process the STT, and then pass the stream on.

We could then pass the subclass instance to the session in place of the stock input: [session addInput:<newSubclass>]
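
As a point of comparison, AVFoundation already has a documented place to observe captured audio: AVCaptureAudioDataOutput, which hands sample buffers to a delegate. Note that this is a different technique from the input subclass proposed above. The sketch below assumes the session still has an audio input attached, and the AudioTap class and its queue label are hypothetical names. Whether OpenEars can accept buffers supplied this way, or whether this avoids the audio session conflict discussed earlier, is not established anywhere in this thread.

```objective-c
#import <AVFoundation/AVFoundation.h>

// Hypothetical class (name is an assumption): receives audio sample
// buffers from a capture session through AVCaptureAudioDataOutput.
@interface AudioTap : NSObject <AVCaptureAudioDataOutputSampleBufferDelegate>
- (BOOL)attachToSession:(AVCaptureSession *)session;
@end

@implementation AudioTap

- (BOOL)attachToSession:(AVCaptureSession *)session {
    AVCaptureAudioDataOutput *output = [[AVCaptureAudioDataOutput alloc] init];
    // Buffers are delivered on this serial queue, off the main thread.
    dispatch_queue_t queue =
        dispatch_queue_create("audio.tap.queue", DISPATCH_QUEUE_SERIAL);
    [output setSampleBufferDelegate:self queue:queue];
    if (![session canAddOutput:output]) {
        return NO;
    }
    [session addOutput:output];
    return YES;
}

// Called once per captured audio buffer.
- (void)captureOutput:(AVCaptureOutput *)output
didOutputSampleBuffer:(CMSampleBufferRef)sampleBuffer
       fromConnection:(AVCaptureConnection *)connection {
    CMItemCount frames = CMSampleBufferGetNumSamples(sampleBuffer);
    // A recognizer would consume the samples here; this sketch only logs.
    NSLog(@"Received %ld audio frames", (long)frames);
}

@end
```

Usage would be something like creating an AudioTap, keeping a strong reference to it, and calling [tap attachToSession:session] before starting the session.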