Reply To: Custom dictionary speech recognition only kind of working.

May 31, 2014 at 10:16 am #1021456

Politepix

This sounds like a bit of a complex issue because of this line:

Usually that means it changes the available vocabulary every few seconds, but occasionally it might fire off two vocabulary changes within a second.

It sounds like some event external to the progress of the speech recognition is causing a sudden vocabulary change, or repeated changes, and I can imagine two arbitrary changes in very close proximity having unexpected results. Model changing is very fast, but it does take a bit of real time and it involves two interdependent files, so I can think of a couple of ways this could go wrong.
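To check whether the weird results line up with closely spaced changes, it could help to log a timestamp every time the app requests a vocabulary change and then look for pairs that land within a second of each other. The following is only a minimal sketch of that check in Python, run on the logged values outside the app. The function name, the one-second threshold and the sample timestamps are all made up for illustration and are not part of OpenEars.

```python
def find_rapid_changes(timestamps, min_gap_seconds=1.0):
    """Return (earlier, later) pairs of consecutive change times
    that are closer together than min_gap_seconds."""
    ordered = sorted(timestamps)
    return [
        (earlier, later)
        for earlier, later in zip(ordered, ordered[1:])
        if later - earlier < min_gap_seconds
    ]


# Hypothetical log: seconds since launch at which a vocabulary change was requested.
change_times = [0.0, 3.2, 6.1, 6.4, 9.8]
rapid = find_rapid_changes(change_times)
print(rapid)
```

If the misrecognitions cluster around the pairs this reports, that would point toward timing rather than the content of the models.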

My general advice is to reexamine this design, since it’s a little bit at odds with my design assumption that most vocabulary changes will occur as a result of a recognition event or a user-driven interaction event, rather than an external event that could lead to multiple changes inside of a second. That doesn’t mean my assumption is correct and your design is incorrect, just that this is the assumption, right or wrong, and that the design may work against it and raise more issues than just this one, even if it’s an interesting design.

But we can also troubleshoot it to see whether the problem is unrelated to the model switching. My first step would be to take one of the DMP/.dic pairs that is giving weird results and test it alone in the sample app with no switching, so you can verify whether it works well in the absence of other issues possibly related to timing. Recognition of individual words can be impaired for a couple of reasons even if they appear in the respective files: there may be an issue with the phonetic transcription, or there may be a very similar-sounding other word in the vocabulary.
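As a rough way to spot similar-sounding words, you could compare the phonetic transcriptions in the .dic file against each other. The sketch below assumes the usual CMU-style layout of one entry per line, with the word first and its phonemes after it separated by whitespace, and alternate pronunciations marked as WORD(2). It flags pairs whose phoneme sequences differ by at most one edit. The helper names, the threshold and the sample entries are invented for illustration, and a small edit distance is only a crude stand-in for how confusable two words really are.

```python
import re
from itertools import combinations


def parse_dic(text):
    """Map each dictionary entry (e.g. 'BACK' or 'BACK(2)') to its phoneme list."""
    entries = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        entries[parts[0]] = parts[1:]
    return entries


def phoneme_distance(a, b):
    """Levenshtein edit distance computed over phonemes rather than characters."""
    previous = list(range(len(b) + 1))
    for i, pa in enumerate(a, start=1):
        current = [i]
        for j, pb in enumerate(b, start=1):
            cost = 0 if pa == pb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def base_word(entry):
    """Strip an alternate-pronunciation suffix such as '(2)'."""
    return re.sub(r"\(\d+\)$", "", entry)


def similar_pairs(entries, max_distance=1):
    """List (word, word, distance) for different words with near-identical phonemes."""
    pairs = []
    for (w1, p1), (w2, p2) in combinations(sorted(entries.items()), 2):
        if base_word(w1) == base_word(w2):
            continue  # alternate pronunciations of the same word
        distance = phoneme_distance(p1, p2)
        if distance <= max_distance:
            pairs.append((w1, w2, distance))
    return pairs


# Invented sample content; substitute the text of the real .dic file.
sample_dic = """\
BACK\tB AE K
BLACK\tB L AE K
FORWARD\tF AO R W ER D
PACK\tP AE K
"""

entries = parse_dic(sample_dic)
for pair in similar_pairs(entries):
    print(pair)
```

Any pair this reports is worth trying in isolation to see whether one word is being swallowed by the other.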

To look into this further, can you isolate a DMP/.dic pair in which some of the words are understood well and others aren’t, and which manifests the issue when used as the starting language model for the sample app, so we can examine it? If I’m not mistaken, you should also have an .arpa file generated at the same time, which will show you the probability model, so let’s take a look at that too.
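If you want a quick look at the .arpa file before posting it, the unigram section is the easiest part to read. The sketch below assumes the standard ARPA text layout, in which the lines under the `\1-grams:` header start with a base-10 log probability followed by the word and an optional backoff weight. The function name and the sample values are made up for illustration. The real file will have its own numbers and probably 2-gram and 3-gram sections as well.

```python
def read_unigrams(arpa_text):
    """Return {word: probability} taken from the \\1-grams: section of ARPA text."""
    unigrams = {}
    in_unigrams = False
    for line in arpa_text.splitlines():
        line = line.strip()
        if line.startswith("\\"):
            # Section markers such as \data\, \1-grams:, \2-grams:, \end\
            in_unigrams = (line == "\\1-grams:")
            continue
        if not in_unigrams or not line:
            continue
        fields = line.split()
        log10_prob, word = float(fields[0]), fields[1]
        unigrams[word] = 10 ** log10_prob  # ARPA stores log10 probabilities
    return unigrams


# Invented sample content; substitute the text of the real .arpa file.
sample_arpa = r"""\data\
ngram 1=4

\1-grams:
-1.0000 </s> -0.3010
-99.0000 <s> -0.3010
-0.5000 BACK -0.3010
-2.0000 FORWARD -0.3010

\end\
"""

probs = read_unigrams(sample_arpa)
for word, p in sorted(probs.items(), key=lambda item: item[1], reverse=True):
    print(f"{word}\t{p:.4f}")
```

A word that is recognized poorly and also shows a much lower probability than its neighbors would be a useful thing to mention when you post the files.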