Tagged: sound pcm samples ios
September 17, 2015 at 10:53 pm #1026809
Is there a way to get the raw sound samples being played so I can visualise them (maybe even just metering)? This is for the iOS platform in particular. Is there a way I can hack something in on my local build?
Thanks!
September 18, 2015 at 8:05 am #1026811
Yes, there is a metering method intended for visual feedback which handles getting a relative decibel reading with very low latency – check out OEPocketsphinxController’s pocketsphinxInputLevel property. The sample app which ships with the distribution includes an example of its implementation.
September 18, 2015 at 11:00 am #1026813
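(Editorial aside: for readers wondering what a “relative decibel reading” over raw PCM samples typically involves, here is a minimal, self-contained sketch in plain Python. It is not OpenEars code and makes no claim about how pocketsphinxInputLevel is implemented internally; it just computes the RMS of a sample window and converts it to dBFS.)

```python
import math

def level_in_db(samples, full_scale=32768.0):
    """Relative level of a window of 16-bit PCM samples, in dBFS.

    0.0 dB means full scale; quieter windows give negative values.
    An empty or silent window is floored at -120 dB instead of -inf.
    """
    if not samples:
        return -120.0
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    if rms <= 0:
        return -120.0
    return 20.0 * math.log10(rms / full_scale)

# A half-scale signal sits about 6 dB below full scale:
print(round(level_in_db([16384, -16384] * 256), 1))  # -6.0
```

A UI meter would poll a value like this at roughly animation-frame rate, which is essentially what the timer-driven reading in the sample app does with pocketsphinxInputLevel.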
Hey, cool, thanks Halle! I can see it now in the example app. I wonder why it needs to be on a thread? I guess you’re reading the variable directly from the AudioUnit callback struct, so it needs to lock/unlock to write it from the voice recognition side? Anyway – thanks!
September 18, 2015 at 11:20 am #1026814
You’re welcome – yup, it’s on a thread so it doesn’t block the UI or the audio driver. To be honest, though, the implementation decision on the level metering method was made over five years ago, before I launched the first beta of OpenEars, and unlike most of the API it didn’t get any special scrutiny in the 2.0 update (just because it’s never raised any bugs or issues and its implementation is quite brief), so it could stand for me to reexamine and simplify it.
September 18, 2015 at 11:23 am #1026815
Cool – I mean, I can mess around with it now (I actually want to grab the samples and perform an FFT over a small window so I can get the fundamental frequency at time T). The only thing to “fix” is to whack the variable somewhere I can read it without blocking, since that will be friendlier to use as an API call. But it’s super low priority (I don’t know if you have some kind of ticketing system I can add this to). So – put it to the back of your mind :)
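(Editorial aside on the FFT-at-time-T idea above: here is a hedged, self-contained sketch in plain Python, independent of OpenEars and sphinxbase, of picking the dominant frequency in a small window of samples with a naive DFT. In production you would use a real FFT library – sphinxbase’s own, or Accelerate’s vDSP on iOS – since the naive DFT is O(n²); this is only to show the shape of the approach. Note the frequency resolution is sample_rate / window_length.)

```python
import math

def dominant_frequency(samples, sample_rate):
    """Estimate the dominant frequency of a small window of PCM samples
    using a naive DFT: scan the bins and return the one with most power."""
    n = len(samples)
    best_bin, best_power = 0, 0.0
    for k in range(1, n // 2):  # skip DC, stop below Nyquist
        re = sum(s * math.cos(2 * math.pi * k * i / n) for i, s in enumerate(samples))
        im = sum(s * math.sin(2 * math.pi * k * i / n) for i, s in enumerate(samples))
        power = re * re + im * im
        if power > best_power:
            best_bin, best_power = k, power
    return best_bin * sample_rate / n  # bin index -> Hz

# A 440 Hz sine sampled at 8 kHz, over a 400-sample window (20 Hz resolution):
window = [math.sin(2 * math.pi * 440 * i / 8000) for i in range(400)]
print(dominant_frequency(window, 8000))  # 440.0
```

For tracking fundamental frequency over time, you would run this (or a proper FFT plus peak-picking) over successive windowed chunks of the incoming samples.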
Thanks!
September 18, 2015 at 11:41 am #1026816
Yup, I agree with you that if threading is still necessary, it should be encapsulated by the API just like the rest of the threaded features. The next update is a big one with a lot of coolness and it’s behind schedule so I won’t add any additional features to that one, but probably in a later one. I’ve entered it as a ticket.
For your FFT interests, although that is getting to be more of a generalized audio toolbox than I want to expose via the API, I can give you the pointer that sphinxbase performs its own FFTs, so maybe you can save time and effort by piggybacking on their functions, with their road-testing as a fringe benefit – you can find the source where they do it in the sphinxbase feature extraction folder:
OpenEarsDistribution/OpenEars/Dependencies/sphinxbase/src/libsphinxbase/fe
September 18, 2015 at 11:47 am #1026817
Keep in mind that if/when you recompile OpenEars after making changes in its dependencies, you must build using Archive in order to get all architecture slices.
September 18, 2015 at 11:48 am #1026818
Thanks!
November 5, 2015 at 2:42 pm #1027200
November 5, 2015 at 3:04 pm #1027201
To the best of my knowledge the thread is about something which works fine, namely level reading using pocketsphinxInputLevel. We were just discussing that it would be nicer if the backgrounding were done by the API rather than by the implementation, as is shown in the sample app.
November 5, 2015 at 3:32 pm #1027205
I meant the piggybacking on sphinxbase to get data for FFTs. Was just curious, but I’ll give it a try in any case.
November 5, 2015 at 3:40 pm #1027208
OK, just to clarify, that is going to be a completely unsupported thing since it involves (wildly :)) altering the implementation. That isn’t a reason not to do it, but it wouldn’t be a feature that has applicability to OpenEars’ design goal as a speech interface kit, so it is going to remain a fun hack if described here rather than a future feature.
November 9, 2015 at 11:53 am #1027241
Of course, I’m still just playing around with this stuff. It would be nice if the API handled the threading for you but it’s not a big deal at all to do it myself.
I think that hacking pocketsphinx to piggyback on their sample data *might* make it possible to detect changes in tone / voice – and this might be a cool thing to have.
Cheers
December 29, 2015 at 9:01 pm #1027642
Having given it a _very_ superficial look for consideration for the upcoming version, it doesn’t actually look to me at first blush like the property is get/set on the main thread – the reading via a timer in the sample app looks mostly to be regulating the animation (I wrote it long enough ago that I’m forced to speculate a bit :) ).
Are you experiencing blocking when you animate with it that leads to the impression it’s not multithreaded, or is it more due to the outdated comments in the sample app referring to it being a blocking operation? If you have a different kind of animation implementation that gets unexpectedly blocked, I’ll be happy to take a closer look at it for a later update. But first, let me know whether it is actually behaving like it’s blocking the main thread with standard animation approaches, so I know if it merits more investigation – thanks.
December 31, 2015 at 10:26 am #1027652
It actually seems OK in practice; my assumptions about blocking were due to the comments in the sample app. As long as I perform the volume check at a suitable point in the render loop it’s mostly fine. I’ve only tested with SceneKit / SpriteKit (Metal-backed), but I can keep the frame rate pretty consistent.
December 31, 2015 at 11:37 am #1027653