Error when using AcousticModelGerman

Viewing 14 posts - 1 through 14 (of 14 total)

  • #1030275
    konrad
    Participant

    Dear Halle,

    I’m evaluating your OpenEars 2.5 for a diagnosis app for vision-impaired people that we are developing.

    With the English acoustic model the sample app works fine: I can change the words, the language model, etc.

    When I replace the English model with the German one from https://www.politepix.com/languages, I get the following errors during language model generation:

    ERROR: "dict.c", line 195: Line 1: Phone 'l' is mising in the acoustic model; word 'LINKS' ignored
    ERROR: "dict.c", line 195: Line 2: Phone 'o:' is mising in the acoustic model; word 'OBEN' ignored
    ERROR: "dict.c", line 195: Line 3: Phone 'r' is mising in the acoustic model; word 'RECHTS' ignored
    ERROR: "dict.c", line 195: Line 4: Phone 'uu' is mising in the acoustic model; word 'UNTEN' ignored
    INFO: dict.c(213): Allocated 0 KiB for strings, 0 KiB for phones

    Testing was done on an iPhone 6S.

    We’d love to use your framework, if our users could do tests using voice commands it would be a great help.

    #1030276
    Halle Winkler
    Politepix

    Welcome Konrad,

    Thanks for your report, I will check this out next week.

    #1030281
    Halle Winkler
    Politepix

    Hi Konrad,

    I added a test of these words to the German acoustic model testbed and they work, so we should troubleshoot your implementation a little more and find out why it isn’t working for you. If you’re interested, you can also verify that these phonemes are in the acoustic model by opening the acoustic model definition file AcousticModelGerman.bundle/mdef in a text editor; the supported phonemes are listed near the top of the definition.

    Normally a complaint from OEPocketsphinxController about missing phonemes would suggest that either the language model is being generated using the German acoustic model but actual listening is instead being started using the English acoustic model, or that the files within the model have been changed. You can let me know which is more likely and we can start looking into it from there.

    By the way, in case it is helpful to you, it is not necessary to capitalize the words – OpenEars can now handle lowercase or mixed-case as well as uppercase.
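
    The mismatch described above is easiest to avoid by resolving the acoustic model path once and passing the same value to both the generation call and the listening call. Below is a minimal sketch, assuming the OpenEars 2.x calls used in the sample app; the helper name startGermanListening and the file name GermanDirections are placeholders invented for illustration, not part of OpenEars.

    ```objc
    #import <Foundation/Foundation.h>
    #import <OpenEars/OEAcousticModel.h>
    #import <OpenEars/OELanguageModelGenerator.h>
    #import <OpenEars/OEPocketsphinxController.h>

    // Hypothetical helper (not part of OpenEars): generates a small German
    // vocabulary and starts listening with the very same acoustic model.
    static void startGermanListening(void) {
        // Resolve the acoustic model once so the two calls below cannot drift apart.
        NSString *acousticModelPath = [OEAcousticModel pathToModel:@"AcousticModelGerman"];
        NSString *fileName = @"GermanDirections"; // placeholder name for the generated files
        NSArray *words = @[@"LINKS", @"OBEN", @"RECHTS", @"UNTEN"];

        OELanguageModelGenerator *generator = [[OELanguageModelGenerator alloc] init];
        NSError *error = [generator generateLanguageModelFromArray:words
                                                    withFilesNamed:fileName
                                            forAcousticModelAtPath:acousticModelPath];
        if (error != nil) {
            NSLog(@"Language model generation failed: %@", [error localizedDescription]);
            return;
        }

        NSString *lmPath = [generator pathToSuccessfullyGeneratedLanguageModelWithRequestedName:fileName];
        NSString *dicPath = [generator pathToSuccessfullyGeneratedDictionaryWithRequestedName:fileName];

        // Listening uses the same acousticModelPath that generation used.
        [[OEPocketsphinxController sharedInstance] setActive:TRUE error:nil];
        [[OEPocketsphinxController sharedInstance] startListeningWithLanguageModelAtPath:lmPath
                                                                        dictionaryAtPath:dicPath
                                                                     acousticModelAtPath:acousticModelPath
                                                                     languageModelIsJSGF:FALSE];
    }
    ```

    With a single variable, switching between AcousticModelEnglish and AcousticModelGerman is a one-line change, and the "Phone ... is mising in the acoustic model" errors caused by mixing the two models should not occur.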

    #1030292
    konrad
    Participant

    Thank you for your help :)
    I did indeed miss that the model is used again lower in the file.
    Maybe you could set it as a variable in the example code, so it’s clearer that it is used multiple times?

    Now that German and English recognition is working, it seems to recognize sound all the time; even very low-volume sounds like breathing or a car driving by move it into recognition mode.
    I assume my use case will be rather quiet places with users speaking in a normal voice. Is there some way I can increase the volume threshold that starts the recognition?

    Even with Rejecto (which I’d be happy to buy) it often doesn’t detect when I say something, since it’s still in the middle of recognizing the ‘mouse-click’ that happened 0.5s before.

    In the languages section you also mention a ‘vadThreshold’ value, but when I search for this in the project nothing comes up.

    Thank you for providing such interesting software. In an absolutely quiet space (me sitting still and not breathing loudly) the English recognition already works 9 times out of 10.
    Using German and only the words ‘oben’/‘unten’/‘links’/‘rechts’, it fails to detect 50% of the ‘links’ and often confuses ‘oben’ and ‘unten’. Maybe the language model is not as good?

    #1030294
    Halle Winkler
    Politepix

    Hi Konrad,

    Sure, check out the header or the documentation to learn about setting vadThreshold. It’s the most important adjustment to make when using a non-English model. After you find the ideal level for vadThreshold the next step is potentially to add Rejecto in order to exclude out of vocabulary speech.

    #1030295
    konrad
    Participant

    So, I did find the vadThreshold now, setting it like:
    [[OEPocketsphinxController sharedInstance] setVadThreshold:2.9];

    I guess ‘read the documentation’ would have helped ;)
    This does fix the clicking/breathing problem, but now ‘LEFT’ isn’t often recognized anymore, ‘top’/‘bottom’/‘right’ still seem ok.

    #1030296
    konrad
    Participant

    PS: I am also using the rejecto demo already.

    #1030297
    Halle Winkler
    Politepix

    I would turn Rejecto off while you work with the vadThreshold.

    “This does fix the clicking/breathing problem, but now ‘LEFT’ isn’t often recognized anymore, ‘top’/‘bottom’/‘right’ still seem ok.”

    Are you adjusting the vadThreshold for English or German?

    #1030298
    konrad
    Participant

    I tried to get English working first, so I started by playing around with the vadThreshold value for English.

    Either it picks up any noise, like a mouse click or my breathing (with vadThreshold below 2.5), or, once I go above 2.5, it starts to miss the majority of the ‘left’ commands.

    #1030299
    konrad
    Participant

    Oh yeah, turning off Rejecto did the trick. Now, with English and vadThreshold = 2.5, it works pretty well.

    #1030300
    konrad
    Participant

    Without Rejecto, German works ok too with vadThreshold = 1.8, as long as one keeps absolutely quiet otherwise.

    #1030301
    Halle Winkler
    Politepix

    Super. Usually the ideal process is:

    1) Check out recognition with your vocabulary, trying to avoid high-confusion word sets that rhyme or are otherwise very similar (ideally they aren’t all one-syllable words).
    2) Raise the vadThreshold as high as possible, to the point that when you speak a vocabulary word under normal environmental conditions it is recognized, but as little incidental noise and non-speech as possible is heard.
    3) Add Rejecto if needed, starting with a low weight and increasing the weight until you have the best out-of-vocabulary word rejection while still having the words in your vocabulary detectable when they are spoken.
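
    The process above could be sketched roughly as follows. This is only an illustration under assumptions: startTunedListening and the file name Directions are hypothetical, the Rejecto header and method are written as I understand them from the Rejecto demo for OpenEars 2.x and should be checked against your installed version, and the threshold and weight values are placeholders to be tuned by testing.

    ```objc
    #import <Foundation/Foundation.h>
    #import <OpenEars/OEAcousticModel.h>
    #import <OpenEars/OELanguageModelGenerator.h>
    #import <OpenEars/OEPocketsphinxController.h>
    // Assumption: header name of the Rejecto demo; the licensed plugin is named differently.
    #import <RejectoDemo/OELanguageModelGenerator+RejectoDemo.h>

    // Hypothetical helper: pass rejectoWeight = nil while tuning vadThreshold
    // (steps 1 and 2), then pass a weight to try Rejecto (step 3).
    static void startTunedListening(NSString *modelName, NSArray *words,
                                    float vadThreshold, NSNumber *rejectoWeight) {
        NSString *acousticModelPath = [OEAcousticModel pathToModel:modelName];
        NSString *fileName = @"Directions"; // placeholder name for the generated files
        OELanguageModelGenerator *generator = [[OELanguageModelGenerator alloc] init];
        NSError *error = nil;

        if (rejectoWeight == nil) {
            // Steps 1 and 2: plain language model, no out-of-vocabulary rejection.
            error = [generator generateLanguageModelFromArray:words
                                               withFilesNamed:fileName
                                       forAcousticModelAtPath:acousticModelPath];
        } else {
            // Step 3: Rejecto model; see the Rejecto documentation for the permitted weight range.
            error = [generator generateRejectingLanguageModelFromArray:words
                                                        withFilesNamed:fileName
                                                withOptionalExclusions:nil
                                                       usingVowelsOnly:FALSE
                                                            withWeight:rejectoWeight
                                                forAcousticModelAtPath:acousticModelPath];
        }
        if (error != nil) {
            NSLog(@"Language model generation failed: %@", [error localizedDescription]);
            return;
        }

        OEPocketsphinxController *controller = [OEPocketsphinxController sharedInstance];
        [controller setActive:TRUE error:nil];
        // Set the threshold before listening starts.
        controller.vadThreshold = vadThreshold;
        [controller startListeningWithLanguageModelAtPath:[generator pathToSuccessfullyGeneratedLanguageModelWithRequestedName:fileName]
                                         dictionaryAtPath:[generator pathToSuccessfullyGeneratedDictionaryWithRequestedName:fileName]
                                      acousticModelAtPath:acousticModelPath
                                      languageModelIsJSGF:FALSE];
    }
    ```

    For example, startTunedListening(@"AcousticModelEnglish", @[@"TOP", @"LEFT", @"BOTTOM", @"RIGHT"], 2.5f, nil) corresponds to the English setting reported earlier in this thread; a weight would only be passed once the threshold is settled.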

    #1030302
    Halle Winkler
    Politepix

    You probably need a higher vadThreshold for German, and then you can add Rejecto if needed once you find the ideal vadThreshold level for it.

    #1030303
    konrad
    Participant

    Thanks for those tips.

    I noticed that ‘bottom’ is recognized much better than ‘top’ in English, which makes sense if one-syllable words are hard.
    Otherwise the cardinal directions would really suffice, both in German and English. “top”/“left”/“bottom”/“right” are easy to explain; maybe I can try something like ‘above’.

    Will try that later this week and let you know what happens :)
