SaveThatWave: how to avoid truncation




    #1017100
    gman
    Participant

    Hello,

    I am testing the demo version of SaveThatWave. It works well, but the files are truncated: the last 0.5 second or so of the utterance is missing from the saved WAV. Changing pocketsphinxController.secondsOfSilenceToDetect does not help (not surprisingly).
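
    For reference, this is roughly what I tried (just a sketch; the value is arbitrary, and this is the only relevant property I found in the docs):

        // Sketch of my attempt: raise the silence threshold before starting to listen.
        // Lives in the view controller from the tutorial, inside its @implementation;
        // pocketsphinxController is the PocketsphinxController instance from the tutorial.
        #import <OpenEars/PocketsphinxController.h>

        - (void)configureListening {
            // Wait longer before deciding the utterance has ended, hoping the tail
            // end of the utterance stops being dropped from the saved WAV.
            self.pocketsphinxController.secondsOfSilenceToDetect = 1.0;
            // ...then start listening exactly as in the tutorial.
        }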

    I did not find any other property to adjust in the docs. How can I avoid this issue?

    Thank you.

    #1017101
    Halle Winkler
    Politepix

    Welcome,

    Is this on the Simulator?

    #1017110
    gman
    Participant

    Yes, this is in the Simulator. I have not had the opportunity to try on a device yet. Should I expect different behavior?

     

    #1017112
    Halle Winkler
    Politepix

    It’s possible. The issue with the Simulator is that it just hosts your own desktop or laptop audio devices. Since it would be a huge, unbounded job for Apple to try to address every possible audio hardware combination in a computer (including external hardware) and really simulate iPhone audio sessions with it, they don’t try: the Simulator doesn’t actually emulate audio session code. That means (for instance) that your audio equipment might even be recording at a different sample rate than an iPhone would, among other possible differences like default buffer size or latency. For those same reasons, I also don’t put a lot of time and energy into trying to make OpenEars and its plugins behave identically on the Simulator and on the device, because there will always be a new scenario to fix and it’s work that won’t improve the end-user experience of your OpenEars-enabled app. So OpenEars has a much simpler audio driver on the Simulator, and it won’t always work as well as the device driver, which is much more strenuously tested.
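
    If it’s useful while you’re testing, a compile-time check along these lines (just a sketch, not something OpenEars needs) can remind you when a given run is happening on the Simulator and therefore on the simpler audio driver:

        // Sketch: log a reminder when the app is built for the Simulator, where audio
        // behavior (sample rate, buffer sizes, trimming) can differ from a device.
        #import <Foundation/Foundation.h>
        #import <TargetConditionals.h>

        static inline void LogSimulatorAudioCaveat(void) {
        #if TARGET_IPHONE_SIMULATOR
            NSLog(@"Simulator run: audio results may not match an actual iPhone/iPad.");
        #endif
        }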

    If you’re still seeing this behavior on the device, definitely let me know!

    #1017115
    gman
    Participant

    Thanks for the explanation – makes sense.

    I did a quick test on a device (iPhone 5) and I seem to observe the same behavior on some of the recorded files, but not all. This is using a headset with a mic to record. I’ll do a longer test within a day or two and report back.

     

    #1017116
    Halle Winkler
    Politepix

    OK, if you find truncation on the device, let me know about it along with the Xcode version, iOS version, and device so I can replicate (also what kind of headphones we’re talking about if they aren’t the ones which ship with the device). It’s also very useful to get some idea of what the input speech was like.

    #1017117
    Halle Winkler
    Politepix

    BTW, is this in a new app made using the tutorial, or is it added to an existing app? It’s possible that other aspects of an existing app can have an effect on the audio session, so let me know if there is anything special about the app.

    #1017119
    gman
    Participant

    I did more testing, and I can verify the behavior is identical on devices.

    Here is the setup. I tested on an iPhone 5 and an iPad 3, both running iOS 6.1.2. I removed all headsets and used the built-in mic.

    Truncation occurs on both devices. The symptom is that the last 200 to 500 ms are cut off. The input is typically two or three words, and it sounds like the last syllable or so is missing. I tried with longer sentences, and the problem is identical no matter how long the input is. The portion of the recording between the start and the truncation sounds fine.

    The truncation occurs on each recording and is perfectly reproducible.

    This is in an app which opens an audio session for playback only. For these tests, I turned off playback, although the audio session is still opened. I have been testing PocketSphinx for some time in this app with audio playback on, without problems.

    Aside from being embedded in the app, the code is taken directly from the tutorial. It is largely unchanged, except that the model is usually changed between utterances. Whether the model is changed or not does not affect the truncation behavior.
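
    For completeness, the model switch between utterances is roughly this (a sketch from memory; the method name is the one I recall from the OpenEars docs, and the paths come from LanguageModelGenerator as in the tutorial):

        // Sketch of the per-utterance model switch, in the tutorial view controller's
        // @implementation. The wrapper method is my own helper, not an OpenEars API.
        - (void)switchToLanguageModelAtPath:(NSString *)lmPath dictionaryAtPath:(NSString *)dicPath {
            // Both paths were generated earlier by LanguageModelGenerator, as in the tutorial.
            [self.pocketsphinxController changeLanguageModelToFile:lmPath withDictionary:dicPath];
        }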

    wavWasSavedAtLocation:(NSString *)location doesn’t do anything other than log the location.

    pocketsphinxDidReceiveHypothesis calls a new delegate method which triggers an animation if the recognition is successful. Presumably that happens after the recording is saved and does not interfere with the saving of the input.
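
    Concretely, those two delegate methods look about like this (sketched; the signatures are as I remember them from the OpenEars and SaveThatWave docs, and the animation hand-off is my own code):

        // Sketch of my delegate methods, inside the tutorial view controller's @implementation.

        // SaveThatWave callback: all I do is log the path.
        - (void)wavWasSavedAtLocation:(NSString *)location {
            NSLog(@"WAV saved at %@", location);
        }

        // OpenEarsEventsObserver callback: pass the hypothesis to my own delegate,
        // which triggers an animation if the recognition is successful.
        - (void)pocketsphinxDidReceiveHypothesis:(NSString *)hypothesis
                                recognitionScore:(NSString *)recognitionScore
                                     utteranceID:(NSString *)utteranceID {
            [self.recognitionDelegate handleHypothesis:hypothesis]; // my own delegate, not part of OpenEars
        }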

    I tried setting pocketsphinxController.processSpeechLocally = FALSE and, interestingly, that does not work: wavWasSavedAtLocation no longer gets invoked, and the log shows a continuous loop of “Pocketsphinx is now listening.” instead of the single occurrence I get when it’s set to TRUE.

    I will purchase if we get it to work; let me know if I can provide more info.

    Thanks!

     

    #1017123
    Halle Winkler
    Politepix

    Can you elaborate on this:

    This is in an app which opens an audio session for playback only

    How does the app set the audio session?

    #1017125
    Halle Winkler
    Politepix

    (To clarify, I’m not disagreeing with or doubting your observation of the behavior, I just have a bit of a backlog so I’m trying to get all my ducks in a row before getting into this issue — it’s already entered as a bug and will get looked at).

    #1017130
    gman
    Participant

    Ok, thank you.

    Apparently I just instantiate an AVAudioPlayer. There was some code to set the category on [AVAudioSession sharedInstance], but that’s now commented out.
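
    The commented-out audio session code was along these lines (a sketch; I don’t remember the exact category, and Playback here is just an example):

        // Roughly what is now commented out; the category shown is only an example.
        #import <AVFoundation/AVFoundation.h>

        - (void)setUpAudioSessionForPlayback {
            NSError *error = nil;
            [[AVAudioSession sharedInstance] setCategory:AVAudioSessionCategoryPlayback error:&error];
            if (error != nil) {
                NSLog(@"Could not set the audio session category: %@", error);
            }
        }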

    Also note that I’m listening to the files after downloading them back to the Mac via iTunes.

     

    #1017133
    Halle Winkler
    Politepix

    Thanks for the info (and for commenting out the audio session code: there’s no need to set an audio session for AVAudioPlayer when using it with an OpenEars-enabled app, and it can only do harm). Do you also hear the truncation when playing the files in QuickTime?

    #1017134
    Halle Winkler
    Politepix

    Hi gman,

    I’ve had a chance to test this today and with a basic test I couldn’t replicate the truncation issue on the iPhone 6 Simulator or an iPhone 5 running 6.1.3, iPhone tested with built-in mic and with the stock headphones. The audio files which were saved sounded like precise trims to the start and end of speech which I verified by opening them in a spectrum analyzer and making sure there were a few milliseconds of silence-level sound before the cut. For the headphones there was usually a little bit of padding. That doesn’t mean I don’t take the report seriously, just that it doesn’t seem to easily manifest with a 100% stock setup. I’d like to help get to the bottom of the issue you’re seeing.

    Could I invite you to send me your app with which you’re seeing the issue so that I can run it on the same device here and hear what you’re hearing? You can send me a note via the contact form and I’ll respond with the email address you can send it to. Let me know if there is anything interesting about the environment you’re testing in; noise levels, anything special about the device, anything that comes to mind.

    Just to get one possibility out of the way, something I remembered today while testing is that when I was first working on OpenEars I used to play back all of my test recordings in VLC, and I spent days trying to track down a truncation bug of about 200-500ms that was always present in playback even though the speech recognition appeared to be working on the words at the end of the utterance. Eventually I stumbled across this unhappy piece of info:

    https://forum.videolan.org/viewtopic.php?f=12&t=71519
    https://forum.videolan.org/viewtopic.php?f=14&t=71021&p=235453&hilit=cut#p235453
    https://ca.answers.yahoo.com/question/index?qid=20110615061742AAKwbRD

    It is purportedly fixed but seems to regress now and then, so I consider Audacity or Fission to be reliable test platforms, but not VLC. I wouldn’t trust iTunes, because it is an app which has a few different ways in which it might attempt to interact with audio playback, but I haven’t had any issues personally with playing files directly in QuickTime X for checking playback. I’d also recommend only using the Xcode Organizer to download your app file so you can get items out of the Documents folder in your own filesystem:

    * In Xcode, press Shift-Command-2
    * Select “Devices” at the top
    * Select your device on the left
    * Under your device you’ll see “Applications”; select it
    * You’ll see a window that allows you to select your test app
    * Hit “Download”
    * Go to your download location in your own filesystem
    * Right-click on the downloaded package
    * Select “Show Package Contents”
    * Open “AppData” and then “Documents”

    For browsing a Simulator app’s Documents folder, check out my tool for opening a Simulator app folder here:

    https://www.politepix.com/2011/05/13/open-the-simulator-sandbox-folder-of-the-app-you-just-built-and-ran/

    #1017168
    gman
    Participant

    Hi Halle,

    Thank you for taking the time to look into this.

    I am using VLC for playback, so that may be the culprit. Let me test again and report back; if I’m still stuck, I’m open to sending you the app.

    Thanks!

    #1017169
    gman
    Participant

    Halle,

    It’s VLC. Thanks for pointing this out. I guess I should have tested another player in the first place and saved your time. Sorry about that…

    So it’s working great; I’m going to use the feature and buy the plugin.

    Thanks again.

    #1017172
    Halle Winkler
    Politepix

    Hi gman,

    No problem, same thing happened to me. I think it’s the surprise factor that an OSS audio player of such long standing mangles basic playback — my instinct is usually to expect more veracity and precision from something like VLC than from Quicktime. Thanks for your order!
