Tagged: sound pcm samples ios
September 17, 2015 at 10:53 pm #1026809
Is there a way to get the raw sound samples being played so I can visualise them (maybe even just metering)? This is for the iOS platform in particular. Is there a way I can hack something in on my local build?
Thanks!
September 18, 2015 at 8:05 am #1026811
Yes, there is a metering method intended for visual feedback which handles getting a relative decibel reading with very low latency – check out OEPocketsphinxController’s pocketsphinxInputLevel property. The sample app which ships with the distribution includes an example of its implementation.
September 18, 2015 at 11:00 am #1026813
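(Editorial aside: for readers wondering what a “relative decibel reading” over raw PCM samples typically involves, here is a minimal, self-contained sketch in plain Python. It is not OpenEars code and makes no claim about how pocketsphinxInputLevel is implemented internally; it just computes the RMS of a sample window and converts it to dBFS.)

```python
import math

def level_in_db(samples, full_scale=32768.0):
    """Relative level of a window of 16-bit PCM samples, in dBFS.

    0.0 dB means full scale; quieter windows give negative values.
    An empty or silent window is floored at -120 dB instead of -inf.
    """
    if not samples:
        return -120.0
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    if rms <= 0:
        return -120.0
    return 20.0 * math.log10(rms / full_scale)

# A half-scale signal sits about 6 dB below full scale:
print(round(level_in_db([16384, -16384] * 256), 1))  # -6.0
```

A UI meter would poll a value like this at roughly animation-frame rate, which is essentially what the timer-driven reading in the sample app does with pocketsphinxInputLevel.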
Hey, cool, thanks Halle! I can see it now in the example app. I wonder why it needs to be on a thread? I guess you’re reading the variable directly from the AudioUnit callback struct, so it needs to lock/unlock to write it from the voice recognition side? Anyway – thanks!
September 18, 2015 at 11:20 am #1026814
You’re welcome – yup, it’s on a thread so it doesn’t block the UI or the audio driver. To be honest, though, the implementation decision on the level metering method was made over five years ago, before I launched the first beta of OpenEars, and unlike most of the API it didn’t get any special scrutiny in the 2.0 update (just because it’s never raised any bugs or issues and its implementation is quite brief), so it could stand for me to reexamine and simplify it.
September 18, 2015 at 11:23 am #1026815
Cool – I mean, I can mess around with it now (I actually want to grab the samples and perform an FFT over a small window so I can get the fundamental frequency at time T). The only thing to “fix” is to whack the variable somewhere I can read it without blocking, since that will be friendlier to use as an API call. But it’s super low priority (I don’t know if you have some kind of ticketing system I can add this to). So – put it to the back of your mind :)
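(Editorial aside on the FFT-at-time-T idea above: here is a hedged, self-contained sketch in plain Python, independent of OpenEars and sphinxbase, of picking the dominant frequency in a small window of samples with a naive DFT. In production you would use a real FFT library – sphinxbase’s own, or Accelerate’s vDSP on iOS – since the naive DFT is O(n²); this is only to show the shape of the approach. Note the frequency resolution is sample_rate / window_length.)

```python
import math

def dominant_frequency(samples, sample_rate):
    """Estimate the dominant frequency of a small window of PCM samples
    using a naive DFT: scan the bins and return the one with most power."""
    n = len(samples)
    best_bin, best_power = 0, 0.0
    for k in range(1, n // 2):  # skip DC, stop below Nyquist
        re = sum(s * math.cos(2 * math.pi * k * i / n) for i, s in enumerate(samples))
        im = sum(s * math.sin(2 * math.pi * k * i / n) for i, s in enumerate(samples))
        power = re * re + im * im
        if power > best_power:
            best_bin, best_power = k, power
    return best_bin * sample_rate / n  # bin index -> Hz

# A 440 Hz sine sampled at 8 kHz, over a 400-sample window (20 Hz resolution):
window = [math.sin(2 * math.pi * 440 * i / 8000) for i in range(400)]
print(dominant_frequency(window, 8000))  # 440.0
```

For tracking fundamental frequency over time, you would run this (or a proper FFT plus peak-picking) over successive windowed chunks of the incoming samples.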
Thanks!
September 18, 2015 at 11:41 am #1026816
Yup, I agree with you that if threading is still necessary, it should be encapsulated by the API just like the rest of the threaded features. The next update is a big one with a lot of coolness and it’s behind schedule so I won’t add any additional features to that one, but probably in a later one. I’ve entered it as a ticket.
For your FFT interests, although that is getting to be more of a generalized audio toolbox than I want to expose via the API, I can give you the pointer that sphinxbase performs its own FFTs, so maybe you can save time and effort by piggybacking on their functions, with their road-testing as a fringe benefit – you can find the source where they do it in the sphinxbase feature extraction folder:
OpenEarsDistribution/OpenEars/Dependencies/sphinxbase/src/libsphinxbase/fe
September 18, 2015 at 11:47 am #1026817
Keep in mind that if/when you recompile OpenEars after making changes in its dependencies, you must build using Archive in order to get all architecture slices.
September 18, 2015 at 11:48 am #1026818
Thanks!
November 5, 2015 at 2:42 pm #1027200
November 5, 2015 at 3:04 pm #1027201
To the best of my knowledge the thread is about something which works fine, namely level reading using pocketsphinxInputLevel. We were just discussing that it would be nicer if the backgrounding were done by the API rather than by the implementation, as is shown in the sample app.
November 5, 2015 at 3:32 pm #1027205
I meant the piggybacking on sphinxbase to get data for FFTs. Was just curious, but I’ll give it a try in any case.
November 5, 2015 at 3:40 pm #1027208
OK, just to clarify, that is going to be a completely unsupported thing since it involves (wildly :)) altering the implementation. That isn’t a reason not to do it, but it wouldn’t be a feature that has applicability to OpenEars’ design goal as a speech interface kit, so it is going to remain a fun hack if described here rather than a future feature.
November 9, 2015 at 11:53 am #1027241
Of course, I’m still just playing around with this stuff. It would be nice if the API handled the threading for you but it’s not a big deal at all to do it myself.
I think that hacking pocketsphinx to piggyback on their sample data *might* make it possible to detect changes in tone / voice – and this might be a cool thing to have.
Cheers
December 29, 2015 at 9:01 pm #1027642
Having given it a _very_ superficial look for consideration for the upcoming version, it doesn’t actually look to me at first blush like the property is get/set on the main thread – the reading via a timer in the sample app looks mostly to be regulating the animation (I wrote it long enough ago that I’m forced to speculate a bit :) ).
Are you experiencing blocking when you animate with it that leads to the impression it’s not multithreaded, or is it more due to the outdated comments in the sample app referring to it being a blocking operation? If you have a different kind of animation implementation that gets unexpectedly blocked, I’ll be happy to take a closer look at it for a later update. But first, let me know whether it is actually behaving like it’s blocking the main thread with standard animation approaches, so I know if it merits more investigation – thanks.
December 31, 2015 at 10:26 am #1027652
It actually seems OK in practice; my assumptions about blocking were due to the comments in the sample app. As long as I perform the volume check at a suitable point in the render loop it’s mostly fine. I’ve only tested with SceneKit / SpriteKit (Metal-backed), but I can keep the frame rate pretty consistent.
December 31, 2015 at 11:37 am #1027653