Reply To: Maximum number of words that can be added to the language dictionary


#1019800
Halle Winkler
Politepix

Hello Hari,

It isn’t the dictionary that defines which words Pocketsphinx is able to recognize. The language model or grammar defines the vocabulary; the dictionary is only consulted to find the pronunciation of a word which is already part of that language model or grammar. So the idea of adding words to a dictionary in order to expand what Pocketsphinx, or OpenEars’ implementation of Pocketsphinx, can recognize is not the right one, since the dictionary is not the vocabulary. This is going to be important to understand regardless of the size of the vocabulary you choose to go with. The OpenEars docs and the Pocketsphinx docs both cover it, so they are worth a look.
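To make the distinction concrete, here is a minimal Python sketch of the relationship. It is not OpenEars or Pocketsphinx code: the function name, the tiny dictionary, and the example vocabulary are all made up for illustration. It only shows that the language model's word list decides what is recognizable, while the dictionary just supplies pronunciations for those words.

```python
# Toy pronunciation dictionary: word -> phonemes.
# The entries are illustrative, written in the style of a CMU dictionary.
PRONUNCIATIONS = {
    "HELLO": "HH AH L OW",
    "WORLD": "W ER L D",
    "GOODBYE": "G UH D B AY",
    "OPEN": "OW P AH N",
}


def recognizable_words(model_vocabulary, pronunciations):
    """Return pronunciations only for words the language model contains.

    Hypothetical helper: words that are in the dictionary but not in the
    language model are never candidates for recognition.
    """
    active = {}
    for word in model_vocabulary:
        key = word.upper()
        if key in pronunciations:
            active[key] = pronunciations[key]
    return active


# Assumed vocabulary of a small language model (illustrative only).
model_vocabulary = ["hello", "world"]
active = recognizable_words(model_vocabulary, PRONUNCIATIONS)
print(active)
```

In this sketch GOODBYE and OPEN have dictionary entries but are not recognizable, because the language model does not contain them. Making the dictionary bigger changes nothing until the language model or grammar changes.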

Regarding your follow-up question, I did answer it above without any ambiguity. If you want to do large vocabulary recognition with Pocketsphinx, it will have to be done as or via a network service. Even if you do take that approach, there are no pre-rolled large vocabulary sets for Pocketsphinx which cover likely speech for an iPhone user in 2014. To get good accuracy rates, you would have to create your own language model consisting of real phrases your users would say, rather than dropping in something which has already been created, since an inappropriate language model will lead to low accuracy.
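As a rough illustration of what "a language model consisting of real phrases" starts from, the Python sketch below derives a vocabulary and word-pair counts from a few example phrases. The phrases and the function name are invented for the example, and this is not how OpenEars or Pocketsphinx builds a model: a real language model tool would add smoothing and write out a proper model file. It only shows that the vocabulary comes from the phrases you collect.

```python
from collections import Counter


def build_counts(phrases):
    """Count words and adjacent word pairs in a list of phrases.

    Hypothetical helper: "<s>" and "</s>" mark the start and end of each
    phrase, a common convention in n-gram language modeling.
    """
    unigrams = Counter()
    bigrams = Counter()
    for phrase in phrases:
        words = ["<s>"] + phrase.upper().split() + ["</s>"]
        unigrams.update(words)
        bigrams.update(zip(words, words[1:]))
    return unigrams, bigrams


# Made-up examples of things users of an app might actually say.
phrases = ["call my mother", "call my office", "play some music"]
unigrams, bigrams = build_counts(phrases)

# The vocabulary is whatever appears in the phrases, minus the markers.
vocabulary = sorted(w for w in unigrams if w not in ("<s>", "</s>"))
print(vocabulary)
print(bigrams[("CALL", "MY")])
```

The counts reflect only what was in the collected phrases, which is why a model built from speech that doesn't match your users' speech gives poor accuracy.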

BTW, you don’t need to add cmu07a.dic to an OpenEars app — it is already in the English acoustic model bundle under the name LanguageModelGeneratorLookupList.text.