Trouble with Spanish Recognition with Test File

  • #1026550
    piggins
    Participant

    I’m using a test file for recognition in Spanish, and the recognition, even in complete silence, has been unpredictable. I was initially trying with the Rejecto plugin as well, but I tried disabling that and the recognition was still inaccurate. More often than not, it wouldn’t recognize any words, and when it did, it would only recognize short words such as “la” or “una” instead of all the individual words in a longer phrase. I played with the VAD threshold, trying recognition with values from 0.5 to 5, but got results similar to the above. When I tried with the continuous recognition loop, however, it worked great and recognized everything well; the problem only occurred with recognition on a test file. Could you possibly help with this? Thanks!

    #1026554
    Halle Winkler
    Politepix

    Hello,

    Probably the audio file isn’t 16k/16-bit/mono/WAV. Which API are you using to run recognition on your test file?

    #1026562
    piggins
    Participant

    Hi Halle,

    I don’t believe the file format is incorrect because recognition in English on the same type of test file is working properly; only Spanish is giving problems. I’m currently using standard OpenEars recognition with PocketSphinx and no plugins, although I received similar results when trying with the Rejecto plugin as well.

    #1026563
    Halle Winkler
    Politepix

    Which OpenEars method or property are you using to run recognition on your test file? There are two. Keep in mind that the VAD setting for Spanish needs to be much higher, something like 4.3 if I recall correctly, and please make sure you are using v2.041 of OpenEars and 2.04 of any plugins.

    “I don’t believe the file format is incorrect”

    You can just check what the file format is and see. I think QuickTime on OS X has an info pane to check the format.
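
    A rough way to confirm the format without QuickTime is to read the WAV header directly. The sketch below is plain C (it should also compile inside an Objective-C source file) and is not part of OpenEars; the names WavFormat, wav_parse_format and wav_is_16k_16bit_mono_pcm are hypothetical helpers made up for this illustration. It walks the RIFF chunks rather than assuming the "fmt " chunk sits at a fixed offset, since a WAV file may carry other chunks before it.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Fields of the WAV "fmt " chunk that matter for the 16k/16-bit/mono check. */
typedef struct {
    uint16_t audio_format;    /* 1 means uncompressed integer PCM */
    uint16_t channels;
    uint32_t sample_rate;     /* in Hz */
    uint16_t bits_per_sample;
} WavFormat;

/* WAV stores its numeric fields little-endian. */
static uint16_t read_le16(const uint8_t *p) {
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t read_le32(const uint8_t *p) {
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Hypothetical helper: scans the start of a file held in buf for the "fmt "
   chunk. Returns false if the data is not RIFF/WAVE or no complete "fmt "
   chunk is found within len bytes. */
bool wav_parse_format(const uint8_t *buf, size_t len, WavFormat *out) {
    if (len < 12 || memcmp(buf, "RIFF", 4) != 0 || memcmp(buf + 8, "WAVE", 4) != 0) {
        return false;
    }
    size_t pos = 12;                      /* first chunk follows the RIFF header */
    while (len - pos >= 8) {              /* room for a chunk id and size */
        uint32_t size = read_le32(buf + pos + 4);
        const uint8_t *body = buf + pos + 8;
        size_t remaining = len - pos - 8;
        if (memcmp(buf + pos, "fmt ", 4) == 0) {
            if (size < 16 || remaining < 16) {
                return false;
            }
            out->audio_format = read_le16(body);
            out->channels = read_le16(body + 2);
            out->sample_rate = read_le32(body + 4);
            out->bits_per_sample = read_le16(body + 14);
            return true;
        }
        /* Skip this chunk; chunk bodies are padded to an even length. */
        size_t advance = (size_t)size + (size & 1u);
        if (advance > remaining) {
            return false;
        }
        pos += 8 + advance;
    }
    return false;
}

/* True only for the format mentioned in this thread: 16 kHz, 16-bit, mono PCM. */
bool wav_is_16k_16bit_mono_pcm(const WavFormat *f) {
    return f->audio_format == 1 && f->channels == 1 &&
           f->sample_rate == 16000 && f->bits_per_sample == 16;
}
```

    To use it, read the first few kilobytes of the recording into a buffer and pass them to wav_parse_format; if wav_is_16k_16bit_mono_pcm then returns false, the file is not in the expected format.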

    #1026564
    piggins
    Participant

    I’m using setPathToTestFile to run recognition on the test file, not runRecognitionOnWavFileAtPath. I tested with the VAD setting at 4.3 but didn’t get any improvements, and I am using the newest versions of OpenEars and the plugins. I also confirmed that the file format is correct. Thanks again for the help and prompt reply.

    #1026571
    Halle Winkler
    Politepix

    OK, sorry, I don’t know what the issue is here – Spanish with setPathToTestFile is part of my testbed and I haven’t seen any issues like this in recent versions of OpenEars. Are you using the most recent version? Are there any errors or warnings in the verbosePocketsphinx/OELogging logs?

    https://www.politepix.com/forums/topic/install-issues-and-their-solutions/

    #1026640
    piggins
    Participant

    Hi Halle,

    I’m not sure if this would affect anything, but I needed functionality such that the user could end speech recognition at will, instead of the normal OpenEars functionality where it will detect when the user has finished speaking. To do this, I used an instance of AVAudioRecorder to record the file and save it to a local directory. When initializing the instance of AVAudioRecorder, I used the following settings:

    // Recorder settings: 16 kHz, 16-bit, mono linear PCM.
    NSError *error = nil; // declared here so the snippet is self-contained
    NSMutableDictionary *settings = [NSMutableDictionary dictionary];
    [settings setValue:[NSNumber numberWithInt:kAudioFormatLinearPCM] forKey:AVFormatIDKey]; // uncompressed PCM
    [settings setValue:[NSNumber numberWithFloat:16000.0] forKey:AVSampleRateKey]; // 16 kHz sample rate
    [settings setValue:[NSNumber numberWithInt:1] forKey:AVNumberOfChannelsKey]; // mono
    [settings setValue:[NSNumber numberWithInt:16] forKey:AVLinearPCMBitDepthKey]; // 16-bit samples
    [settings setValue:[NSNumber numberWithInt:AVAudioQualityMax] forKey:AVEncoderAudioQualityKey];
    // filePath is the local path the recording is saved to.
    self.recorder = [[AVAudioRecorder alloc] initWithURL:[NSURL fileURLWithPath:filePath] settings:settings error:&error];

    After recording, I would use setPathToTestFile to test the file I’d just recorded, and then call startListening with the Spanish acoustic model. The majority of the time, no words were recognized, but sometimes very short words would be recognized. Could you tell me if the settings I used are correct to work with setPathToTestFile or if there are any other red flags? I looked through the verbosePocketSphinx/OELogging logs and there weren’t any errors or warnings with either English or Spanish recognition, even though the recognition in English worked perfectly but Spanish was less accurate. Would there be any ways, other than recording the file and then testing it, to achieve the desired functionality? Thanks so much for your help Halle.

    #1026641
    Halle Winkler
    Politepix

    Hi,

    Sorry, this usage approach is a little outside what I can support since OpenEars is actively designed not to be a push-to-talk tool and pathToTestFile is for automated testing of the entire audio and recognition harness rather than a WAV recognition method.

    For your own further troubleshooting, I would check whether you get better results with recordings made by the SaveThatWave demo. Comparing the results for a SaveThatWave audio file with those for the files you are creating should give you some hints about what to look into. I would also check whether you get better recognition results when submitting the WAV to runRecognitionOnWavFileAtPath: etc, which is the method designed for file-based recognition, unlike pathToTestFile, which is intended as an automated testing tool.

    #1026644
    piggins
    Participant

    Hi,

    I’ve tested with runRecognitionOnWavFileAtPath and it works much better, but I’m not sure if the Rejecto plugin is working with that method. Do the OpenEars plugins work when using runRecognitionOnWavFileAtPath? In addition, can the VAD threshold be manually set when using the aforementioned method? Thanks.

    #1026646
    Halle Winkler
    Politepix

    Rejecto and RuleORama should work fine with that method. Setting the VAD threshold should not be necessary with it, since VAD is for continuous listening, which you are not performing. Give it a try and see what the results are.
