Staccato sound sampling/SaveThatWave shows problem

    #1020619
    hughescr
    Participant

    I’m having some trouble in my OpenEars/SaveThatWave app. I think the same problem happens even when I don’t use SaveThatWave, but SaveThatWave makes it clearer what’s going on.

    My app is trying to listen to a sequence of digits [0..9] being spoken. I initialize like this:

    
        LanguageModelGenerator *lmGenerator = [[LanguageModelGenerator alloc] init];
    
        NSString *name = @"NumberNaming";
        NSString *acousticModel = [AcousticModel pathToModel:@"AcousticModelEnglish"];
        NSError *err = [lmGenerator generateLanguageModelFromArray:@[@"ONE",@"TWO",@"THREE",@"FOUR",@"FIVE",@"SIX",@"SEVEN",@"EIGHT",@"NINE",@"ZERO"]
                                                    withFilesNamed:name
                                            forAcousticModelAtPath:acousticModel];
    
        // On success, OpenEars 1.x returns the generated file paths in the NSError's userInfo.
        NSDictionary *languageGeneratorResults = nil;
        if ([err code] == noErr) {
            languageGeneratorResults = [err userInfo];
        } else {
            NSLog(@"Language model generation error: %@", [err localizedDescription]);
            return;
        }
    
        self.openEarsEventsObserver.delegate = self;
    
        self.pocketsphinxController.audioMode = @"VoiceChat";
        self.pocketsphinxController.calibrationTime = 3;
        self.pocketsphinxController.secondsOfSilenceToDetect = 1.5;
        [self.pocketsphinxController startListeningWithLanguageModelAtPath:[languageGeneratorResults objectForKey:@"LMPath"]
                                                          dictionaryAtPath:[languageGeneratorResults objectForKey:@"DictionaryPath"]
                                                       acousticModelAtPath:acousticModel
                                                       languageModelIsJSGF:NO];
        [self.saveThatWaveController start]; // For saving WAVs from OpenEars
    

    and

    
    - (PocketsphinxController *)pocketsphinxController
    {
        if(pocketsphinxController == nil)
        {
            pocketsphinxController = [[PocketsphinxController alloc] init];
            pocketsphinxController.outputAudio = YES;
        }
    
        return pocketsphinxController;
    }
    
    - (OpenEarsEventsObserver *)openEarsEventsObserver
    {
        if(openEarsEventsObserver == nil)
        {
            openEarsEventsObserver = [[OpenEarsEventsObserver alloc] init];
        }
    
        return openEarsEventsObserver;
    }
    
    - (SaveThatWaveController *)saveThatWaveController
    {
        if(saveThatWaveController == nil)
        {
            saveThatWaveController = [[SaveThatWaveController alloc] init];
        }
    
        return saveThatWaveController;
    }
    
    

    I use [self.pocketsphinxController suspendRecognition] in pocketsphinxDidStartListening to suspend listening until I’m ready, then [self.pocketsphinxController resumeRecognition] when I want to start listening, as sketched below.
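    A minimal sketch of that flow, using the standard OpenEarsEventsObserverDelegate callback (my readyToListen trigger is a hypothetical app-specific method):

        // Suspend as soon as listening starts, so nothing is recognized
        // until the app is ready for input.
        - (void) pocketsphinxDidStartListening {
            [self.pocketsphinxController suspendRecognition];
        }

        // Hypothetical app-specific trigger for when input should begin.
        - (void) readyToListen {
            [self.pocketsphinxController resumeRecognition];
        }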

    Now, in the Simulator everything works great. When I run on actual iPad hardware, the first time I call resumeRecognition, OpenEars quickly calls me back on pocketsphinxDidResumeRecognition, but if I start talking it doesn’t seem to hear anything for a second or two: pocketsphinxDidDetectSpeech isn’t fired, and nothing I say gets hypothesized. If I wait silently for a while and then start talking, it hears me, but about half the time the recorded audio is very staccato and recognition is hopeless. Here is a sample recording from SaveThatWave. It sounds like a buffering problem, as if chunks of other parts of the recording were stitched into the middle.

    Any thoughts on what might be going on here?

    #1020620
    Halle Winkler
    Politepix

    Welcome,

    I would expect that this is due to the VoiceChat audio mode. The audio modes other than the default are offered without any guarantee of performance: their behaviors are undocumented, they change from OS version to OS version, and they aren’t part of the testbed. They were added due to several requests, but unfortunately they can only be used on an as-is basis and shouldn’t be used if you are encountering issues that result from them.

    #1020621
    hughescr
    Participant

    Disabling the audio mode does make it record and recognize accurately. But now the audio that I play during suspendRecognition using an AVPlayer is heavily muted.
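    Roughly what I’m doing during suspension (the prompt file name and the player property are placeholders):

        [self.pocketsphinxController suspendRecognition];
        NSURL *promptURL = [[NSBundle mainBundle] URLForResource:@"prompt" withExtension:@"m4a"];
        self.player = [AVPlayer playerWithURL:promptURL]; // AVPlayer comes from AVFoundation
        [self.player play]; // with the default audio mode, this output is heavily muted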

    #1020622
    hughescr
    Participant

    Oh, looking at the code I see that you’re using the old, deprecated AudioSessionGetProperty() and AudioSessionSetProperty() calls instead of the AVAudioSession API that iOS 7 wants; I guess it’s probably not just deprecated in iOS 7 but actually broken too. I’ll see if I can hack something together with AVAudioSession that works.

    #1020623
    Halle Winkler
    Politepix

    This might be a basic limitation of the PlayAndRecord audio session when it isn’t used in combination with the VoiceChat mode. (Incidentally, the VoiceChat mode didn’t have this side effect of changing the interaction with media playback in previous versions of iOS, and it could easily stop having it in the future, which is part of the reason I don’t build the framework around special audio mode behaviors.) But you can also try turning on PocketsphinxController’s audioSessionMixing property, to see whether this is a session mixing issue rather than the somewhat buggy playback interaction with VoiceChat.
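    Something like this, set before starting listening (just a sketch of the test I’m suggesting):

        self.pocketsphinxController.audioSessionMixing = YES; // check whether mixing restores normal AVPlayer volume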

    #1020624
    Halle Winkler
    Politepix

    Oh, looking at the code I see that you’re using the old, deprecated AudioSessionGetProperty() and AudioSessionSetProperty() calls instead of the AVAudioSession API that iOS 7 wants; I guess it’s probably not just deprecated in iOS 7 but actually broken too. I’ll see if I can hack something together with AVAudioSession that works.

    AVAudioSession is a higher-level wrapper around the older C-based AudioSession code, but the changes from OS to OS are not due to the API, to deprecation, or to breakage: there have been behavioral changes in this kind of audio handling in every OS release since iOS 3, IIRC. The PlayAndRecord audio session has always had undesirable effects on playback, ever since it was first introduced.

    The VoiceChat-related skipping issue isn’t due to the audio API, BTW. It originates in the Pocketsphinx VAD (voice activity detection), which isn’t designed to work with an audio mode that applies noise suppression.

    #1020625
    hughescr
    Participant

    I went and rewrote AudioSessionManager.m using the AVFoundation APIs (and at the same time converted the whole project to ARC, if you’re interested in that). Same behavior, so you’re right :) I tried the newer GameChat and VideoChat modes too, also with no luck.
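    The core of the replacement is roughly this (a sketch, assuming the PlayAndRecord category the framework uses, with error handling trimmed):

        NSError *sessionError = nil;
        AVAudioSession *session = [AVAudioSession sharedInstance];
        [session setCategory:AVAudioSessionCategoryPlayAndRecord error:&sessionError];
        [session setMode:AVAudioSessionModeVoiceChat error:&sessionError]; // also tried AVAudioSessionModeGameChat and AVAudioSessionModeVideoChat
        [session setActive:YES error:&sessionError];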

    #1020627
    Halle Winkler
    Politepix

    Yeah, the bigger issue unfortunately is that even if those modes helped with your issue, there is no contract for their behavior, so they could stop helping on a different device or architecture, or after a minor or major OS update. That’s why OpenEars only uses the RemoteIO audio unit and defaults to the standard audio mode, even though other units and modes occasionally perform better in various combinations of OS version and device.

    The mode property was added after a lot of agitation for it, but I’m pretty likely to remove it in a future update: it has led to a lot of issues and support requests, and I don’t have a great explanation for what it’s doing there if it causes problems without bringing positive results beyond “people wanted it”. In other words, I made a design mistake.
