Error when using AcousticModelGerman

This topic has 13 replies, 2 voices, and was last updated 7 years, 11 months ago by konrad.

Viewing 14 posts - 1 through 14 (of 14 total)

Author

Posts
May 7, 2016 at 1:18 pm #1030275
konrad
Participant
Dear Halle,

I’m evaluating OpenEars 2.5 for a diagnosis app for vision-impaired people that we are developing.

With the English acoustic model the sample app works fine: I can change the words, the language model, etc.

When I replace the English model with the German one from https://www.politepix.com/languages, I get the following errors during language model generation:
```
ERROR: "dict.c", line 195: Line 1: Phone 'l' is mising in the acoustic model; word 'LINKS' ignored
ERROR: "dict.c", line 195: Line 2: Phone 'o:' is mising in the acoustic model; word 'OBEN' ignored
ERROR: "dict.c", line 195: Line 3: Phone 'r' is mising in the acoustic model; word 'RECHTS' ignored
ERROR: "dict.c", line 195: Line 4: Phone 'uu' is mising in the acoustic model; word 'UNTEN' ignored
INFO: dict.c(213): Allocated 0 KiB for strings, 0 KiB for phones
```
Testing was done on an iPhone 6S.

We’d love to use your framework; it would be a great help if our users could do tests using voice commands.
May 7, 2016 at 5:02 pm #1030276

Halle Winkler
Politepix

Welcome Konrad,

Thanks for your report, I will check this out next week.

May 9, 2016 at 4:48 pm #1030281

Halle Winkler
Politepix

Hi Konrad,

I added a test of these words to the German acoustic model testbed and they work, so we should troubleshoot your implementation a little more and find out why it isn’t working for you. If you’re very interested, you can also verify that these phonemes are in the acoustic model by opening the acoustic model definition file AcousticModelGerman.bundle/mdef in a text editor; the supported phonemes are listed right near the top of the definition.

Normally a complaint from OEPocketsphinxController about missing phonemes would suggest that either the language model is being generated using the German acoustic model but actual listening is instead being started using the English acoustic model, or that the files within the model have been changed. You can let me know which is more likely and we can start looking into it from there.
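
For anyone who wants to script the mdef check rather than eyeball it, here is a minimal Python sketch. It assumes the mdef is the plain-text Sphinx layout in which each base phone has '-' in its left, right, and position columns. `base_phones`, `reported_missing`, and the sample contents are hypothetical names and made-up data for illustration; they are not part of OpenEars or the real German model.

```python
import re


def base_phones(mdef_text):
    """Return the set of base phones listed in a text-format mdef.

    Assumption: comment lines start with '#', and a base phone line has
    '-' in columns 2-4 (left context, right context, word position).
    """
    phones = set()
    for line in mdef_text.splitlines():
        fields = line.split()
        if (len(fields) >= 4
                and not fields[0].startswith("#")
                and fields[1:4] == ["-", "-", "-"]):
            phones.add(fields[0])
    return phones


# Matches the dict.c error lines quoted in the first post of this thread.
MISSING = re.compile(
    r"Phone '([^']+)' is mis+ing in the acoustic model; word '([^']+)' ignored"
)


def reported_missing(log_text):
    """Return (phone, word) pairs from 'Phone ... missing' log lines."""
    return MISSING.findall(log_text)


# Made-up sample data, NOT the contents of the real German mdef.
sample_mdef = """\
0.3
2 n_base
# base lft rt p attrib tmat state ids
l - - - n/a 0 0 1 2 N
o: - - - n/a 1 3 4 5 N
"""
sample_log = (
    "ERROR: \"dict.c\", line 195: Line 1: Phone 'l' is mising in the "
    "acoustic model; word 'LINKS' ignored"
)

available = base_phones(sample_mdef)
for phone, word in reported_missing(sample_log):
    status = "present in" if phone in available else "absent from"
    print(f"{word}: phone {phone!r} is {status} this mdef")
```

If a phone reported as missing turns out to be present in the German mdef, that points toward the first explanation above: a different acoustic model is in use at the point where the error is raised.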

By the way, in case it is helpful to you, it is not necessary to capitalize the words – OpenEars can now handle lowercase or mixed-case as well as uppercase.

May 10, 2016 at 10:07 pm #1030292

konrad
Participant

Thank you for your help :)
I did indeed miss that the model is used again lower in the file.
Maybe you could set it as a variable in the example code, so it’s clearer we use it multiple times?

Now that German and English recognition is working, it seems to pick up sound all the time: even very low-volume sounds like breathing or a car driving by move it into recognition mode.
I assume my use case will be rather quiet places with users speaking in a normal voice. Is there some way I can increase the volume threshold that starts the recognition?

Even with Rejecto (which I’d be happy to buy) it often doesn’t detect when I say something, since it’s still in the middle of recognizing the ‘mouse-click’ that happened 0.5s before.

In the languages section you also mention a ‘vadThreshold’ value, but when I search for this in the project nothing comes up.

Thank you for providing such interesting software. In an absolutely quiet space (me sitting still and not breathing loudly) the English recognition already works 9 times out of 10.
Using German and only the words ‘oben’/‘unten’/‘links’/‘rechts’, it fails to detect 50% of the ‘links’ and often confuses ‘oben’ and ‘unten’. Maybe the language model is not as good?

May 10, 2016 at 10:28 pm #1030294

Halle Winkler
Politepix

Hi Konrad,

Sure, check out the header or the documentation to learn about setting vadThreshold. It’s the most important adjustment to make when using a non-English model. After you find the ideal level for vadThreshold the next step is potentially to add Rejecto in order to exclude out of vocabulary speech.

May 10, 2016 at 10:29 pm #1030295

konrad
Participant

So, I did find the vadThreshold now, setting it like:
[[OEPocketsphinxController sharedInstance] setVadThreshold:2.9];

I guess ‘read the documentation’ would have helped ;)
This does fix the clicking/breathing problem, but now ‘LEFT’ isn’t often recognized anymore; ‘top’/‘bottom’/‘right’ still seem ok.

May 10, 2016 at 10:30 pm #1030296

konrad
Participant

PS: I am also using the Rejecto demo already.

May 10, 2016 at 10:33 pm #1030297

Halle Winkler
Politepix

I would turn Rejecto off while you work with the vadThreshold.

“This does fix the clicking/breathing problem, but now ‘LEFT’ isn’t often recognized anymore, ‘top’/‘bottom’/‘right’ still seem ok.”

Are you adjusting the vadThreshold for English or German?

May 10, 2016 at 10:35 pm #1030298

konrad
Participant

I tried getting English working first, so I started by playing around with the vadThreshold value for English.

Either it picks up any noise like a mouse click or my breathing (with vadThreshold below 2.5), or, when I go above 2.5, it starts to miss the majority of the ‘left’ commands.

May 10, 2016 at 10:37 pm #1030299

konrad
Participant

Oh yeah, turning off Rejecto did the trick. Now with English and vadThreshold = 2.5 it works pretty well.

May 10, 2016 at 10:41 pm #1030300

konrad
Participant

Without Rejecto, German works ok too with vadThreshold = 1.8, as long as one keeps absolutely quiet otherwise.

May 10, 2016 at 10:44 pm #1030301

Halle Winkler
Politepix

Super. Usually the ideal process is:
1) Check out recognition with your vocabulary, trying to avoid high-confusion word sets that rhyme or are otherwise very similar (ideally they aren’t all one-syllable words).
2) Raise the vadThreshold as high as possible, to the point where a vocabulary word spoken under normal environmental conditions is still recognized but as little incidental noise and non-speech as possible is heard.
3) Add Rejecto if needed, starting with a low weight and increasing the weight until you have the best out-of-vocabulary word rejection while the words in your vocabulary are still detected when they are spoken.
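
The threshold step in that process can be sketched as a small selection over manual test runs. This is a hypothetical Python helper, not part of OpenEars: the `Trial` record, `pick_trial`, the 0.9 hit-rate cutoff, and all the numbers are assumptions made purely for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Trial:
    """One manual test run at a fixed vadThreshold (hypothetical record)."""
    vad_threshold: float
    words_spoken: int      # vocabulary words said during the run
    words_recognized: int  # how many of those were recognized
    false_triggers: int    # noises (clicks, breathing) that started recognition


def pick_trial(trials: List[Trial], min_hit_rate: float = 0.9) -> Optional[Trial]:
    """Return the run with the highest threshold that still meets min_hit_rate.

    Returns None when no run recognizes enough of the spoken words.
    """
    acceptable = [
        t for t in trials
        if t.words_spoken > 0
        and t.words_recognized / t.words_spoken >= min_hit_rate
    ]
    if not acceptable:
        return None
    return max(acceptable, key=lambda t: t.vad_threshold)


# Made-up numbers, for illustration only.
runs = [
    Trial(1.5, 10, 10, 7),
    Trial(2.0, 10, 9, 2),
    Trial(2.5, 10, 9, 0),
    Trial(3.0, 10, 4, 0),
]
best = pick_trial(runs)
if best is not None:
    print(f"vadThreshold {best.vad_threshold}: {best.false_triggers} false triggers")
```

The same kind of selection could presumably be repeated for the Rejecto weight once the threshold is fixed, starting low and keeping the highest weight at which the vocabulary words are still detected.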

May 10, 2016 at 10:45 pm #1030302

Halle Winkler
Politepix

You probably need a higher vadThreshold for German, and then you can add Rejecto if needed once you find the ideal vadThreshold level for it.

May 10, 2016 at 10:50 pm #1030303

konrad
Participant

Thanks for those tips.

I noticed that ‘bottom’ is recognized much better than ‘top’ in English, which makes sense if one-syllable words are hard.
Otherwise the cardinal directions would really suffice, both in German and English. ‘top’/‘left’/‘bottom’/‘right’ are easy to explain; maybe I can try something like ‘above’.

Will try that later this week and let you know what happens :)