Tagged: german acousticmodel error
May 7, 2016 at 1:18 pm #1030275
I’m evaluating OpenEars 2.5 for a diagnosis app for vision-impaired people that we are developing.
With the English acoustic model the sample app works fine; I can change the words, the language model, etc.
When I replace the English model with the German one from https://www.politepix.com/languages I get the following errors during language model generation:
ERROR: "dict.c", line 195: Line 1: Phone 'l' is mising in the acoustic model; word 'LINKS' ignored
ERROR: "dict.c", line 195: Line 2: Phone 'o:' is mising in the acoustic model; word 'OBEN' ignored
ERROR: "dict.c", line 195: Line 3: Phone 'r' is mising in the acoustic model; word 'RECHTS' ignored
ERROR: "dict.c", line 195: Line 4: Phone 'uu' is mising in the acoustic model; word 'UNTEN' ignored
INFO: dict.c(213): Allocated 0 KiB for strings, 0 KiB for phones
Testing was done on an iPhone 6S.
We’d love to use your framework. If our users could do tests using voice commands, it would be a great help.
May 7, 2016 at 5:02 pm #1030276
Thanks for your report, I will check this out next week.
May 9, 2016 at 4:48 pm #1030281
I added a test of these words to the German acoustic model testbed and they work, so we should troubleshoot your implementation a little more and find out why it isn’t working for you. (If you’re very interested, you can also verify that these phonemes are in the acoustic model by opening the acoustic model definition file AcousticModelGerman.bundle/mdef in a text editor; the supported phonemes are listed right near the top of the definition.)
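The mdef check described above can also be scripted. The following is a minimal Python sketch, not part of OpenEars. It assumes the mdef file is in the plain-text Sphinx format, where each context-independent phone has a row with "-" in its left-context, right-context and position columns, and that the pronunciation dictionary has one "WORD phone phone ..." entry per line. The function names and the sample data are made up for illustration.

```python
def base_phones(mdef_text):
    """Return the set of context-independent phones found in a text mdef."""
    phones = set()
    for line in mdef_text.splitlines():
        fields = line.split()
        # Assumed layout: base-phone rows carry "-" in columns 2 to 4
        # (left context, right context, word position).
        if (len(fields) >= 4
                and not fields[0].startswith("#")
                and fields[1:4] == ["-", "-", "-"]):
            phones.add(fields[0])
    return phones


def missing_phones(dictionary_text, known):
    """Map each dictionary word to the phones that are not in `known`."""
    problems = {}
    for line in dictionary_text.splitlines():
        parts = line.split()
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        word, phones = parts[0], parts[1:]
        absent = [p for p in phones if p not in known]
        if absent:
            problems[word] = absent
    return problems


# Hypothetical excerpts, NOT taken from the real German model.
SAMPLE_MDEF = """\
0.3
3 n_base
1 n_tri
#base lft rt p attrib tmat state ids
SIL - - - filler 0 0 1 2 N
l - - - n/a 1 3 4 5 N
o: - - - n/a 2 6 7 8 N
l o: SIL e n/a 1 9 10 11 N
"""
SAMPLE_DICT = """\
WORDA l o:
WORDB l x
"""

known = base_phones(SAMPLE_MDEF)
print(sorted(known))
print(missing_phones(SAMPLE_DICT, known))
```

To run it against real files, read AcousticModelGerman.bundle/mdef and the generated dictionary file from disk and pass their contents to the two functions. If the dictionary uses phones that the model passed at listening time does not list, that points to the mismatch described in the next paragraph.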
Normally a complaint from OEPocketsphinxController about missing phonemes would suggest that either the language model is being generated using the German acoustic model but actual listening is instead being started using the English acoustic model, or that the files within the model have been changed. You can let me know which is more likely and we can start looking into it from there.
By the way, in case it is helpful to you, it is not necessary to capitalize the words – OpenEars can now handle lowercase or mixed-case as well as uppercase.
May 10, 2016 at 10:07 pm #1030292
Thank you for your help :)
I did indeed miss that the model is used again lower in the file.
Maybe you could set it as a variable in the example code, so it’s clearer we use it multiple times?
Now that German and English recognition is working, it seems to recognize sound all the time; even very low-volume sounds like breathing or a car driving by move it into recognition mode.
I assume my use case will be rather quiet places with users speaking in a normal voice. Is there some way I can increase the volume threshold that starts the recognition?
Even with Rejecto (which I’d be happy to buy) it often doesn’t detect when I say something, since it’s still in the middle of recognizing the ‘mouse-click’ that happened 0.5s before.
In the languages section you also mention a ‘vadThreshold’ value, but when I search for this in the project nothing comes up.
Thank you for providing such interesting software. In an absolutely quiet space (me sitting still and not breathing loudly) the English recognition already works 9 out of 10 times.
Using German and only the words ‘oben’/‘unten’/‘links’/‘rechts’, it fails to detect 50% of the ‘links’ and often confuses ‘oben’ and ‘unten’. Maybe the language model is not as good?
May 10, 2016 at 10:28 pm #1030294
Sure, check out the header or the documentation to learn about setting vadThreshold. It’s the most important adjustment to make when using a non-English model. After you find the ideal level for vadThreshold the next step is potentially to add Rejecto in order to exclude out of vocabulary speech.
May 10, 2016 at 10:29 pm #1030295
So, I did find the vadThreshold now, setting it like:
[[OEPocketsphinxController sharedInstance] setVadThreshold:2.9];
I guess ‘read the documentation’ would have helped ;)
This does fix the clicking/breathing problem, but now ‘LEFT’ isn’t often recognized anymore, ‘top’/’bottom’/’right’ still seem ok.
May 10, 2016 at 10:30 pm #1030296
PS: I am also using the rejecto demo already.
May 10, 2016 at 10:33 pm #1030297
I would turn rejecto off while you work with the vadThreshold.
This does fix the clicking/breathing problem, but now ‘LEFT’ isn’t often recognized anymore, ‘top’/’bottom’/’right’ still seem ok.
Are you adjusting the vadThreshold for English or German?
May 10, 2016 at 10:35 pm #1030298
I tried getting English working first, so I started by playing around with the English vadThreshold value.
Either it picks up on any noise like a mouse click or my breathing (with vadThreshold below 2.5), or, when I am above 2.5, it starts to miss the majority of the ‘left’ commands.
May 10, 2016 at 10:37 pm #1030299
Oh yeah, turning off rejecto did the trick, now with english and vadThreshold = 2.5 it works pretty well.
May 10, 2016 at 10:41 pm #1030300
Without rejecto the german works ok too for vadThreshold = 1.8, as long as one keeps absolutely quiet otherwise.
May 10, 2016 at 10:44 pm #1030301
Super. Usually the ideal process is 1) check out recognition with your vocabulary (trying to avoid high-confusion word sets that rhyme or are otherwise very similar – ideally they aren’t all one-syllable words), 2) raise the vadThreshold as high as possible to the point that when you speak a vocabulary word under normal environmental conditions, it is recognized, but as little incidental noise and non-speech as possible is heard, and then 3) add Rejecto if needed, starting with a low weight and increasing weight until you have the best out-of-vocabulary word rejection while still having the words in your vocabulary detectable when they are spoken.
May 10, 2016 at 10:45 pm #1030302
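Step 2 of the process described above can be tracked with some simple bookkeeping. The sketch below is a hypothetical Python helper, not an OpenEars API: for each vadThreshold value tried by hand, note how many spoken vocabulary words were recognized and how often noise triggered recognition, then pick the highest threshold that still recognizes an acceptable share of the words. All numbers in the example are invented placeholders, not measurements.

```python
def pick_vad_threshold(trials, min_hit_rate=0.8):
    """Return the highest threshold whose hit rate is at least min_hit_rate.

    trials maps a vadThreshold value to a tuple
    (words_recognized, words_spoken, noise_triggers) noted during manual tests.
    Returns None if no threshold is acceptable.
    """
    acceptable = [
        threshold
        for threshold, (recognized, spoken, _noise) in trials.items()
        if spoken > 0 and recognized / spoken >= min_hit_rate
    ]
    return max(acceptable) if acceptable else None


# Invented placeholder results, for illustration only.
example_trials = {
    1.5: (10, 10, 7),
    2.0: (10, 10, 3),
    2.5: (9, 10, 0),
    3.0: (5, 10, 0),
}
print(pick_vad_threshold(example_trials))
```

The noise-trigger count is recorded but not used in the selection, since raising the threshold is expected to reduce it anyway; it is there to confirm that trend. Rejecto (step 3) would then be added on top of the chosen value, starting with a low weight.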
You probably need a higher vadThreshold for German, and then you can add Rejecto if needed once you find the ideal vadThreshold level for it.
May 10, 2016 at 10:50 pm #1030303
Thanks for those tips.
I noticed that ‘Bottom’ is much better recognized than ‘Top’ in English, which makes sense if one-syllable words are hard.
Otherwise the cardinal directions would really suffice, both in German and English. “top”/”left”/”bottom”/”right” are easy to explain; maybe I can try something like ‘above’.
Will try that later this week and let you know what happens :)