Reply To: Dictionary for OpenEars.

July 2, 2013 at 11:20 am #1017593

Politepix

Hi Priya,

OK, not a problem. The vocabulary is not defined by using a dictionary file or by changing the path to a dictionary file. The only information inside a dictionary file is how the words are pronounced, which is only half of the information that PocketsphinxController needs in order to use a specific vocabulary. The other half is the language model.

Instead, a vocabulary is created dynamically by giving an NSArray of words or phrases to the generateLanguageModelFromArray:withFilesNamed: method of LanguageModelGenerator. This outputs a matched pair of files, a language model and a dictionary, which must be used together and in the specific format output by LanguageModelGenerator. You can see exactly how if you go to the tutorial, select “Offline Speech Recognition”, and follow the instructions under “Using LanguageModelGenerator” exactly.
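
As a rough sketch of what that step looks like in code (written from memory of the tutorial, so please treat the tutorial as the authoritative version): the helper name GenerateVocabulary, the example words and the file name "MyVocabulary" are placeholders I made up for illustration, and the userInfo keys "LMPath" and "DictionaryPath" and the noErr check are how I recall the tutorial reading the result, so verify them against the version of OpenEars you have installed.

```objc
#import <Foundation/Foundation.h>
#import <OpenEars/LanguageModelGenerator.h>

// Hypothetical helper, not part of OpenEars: generates a language model and
// dictionary pair from a small word list and returns the two file paths.
static BOOL GenerateVocabulary(NSString **lmPath, NSString **dicPath) {
    LanguageModelGenerator *generator = [[LanguageModelGenerator alloc] init];

    // The vocabulary itself: words or short phrases (uppercase, as in the
    // tutorial's examples). Placeholder entries only.
    NSArray *words = [NSArray arrayWithObjects:@"HELLO", @"GOODBYE", @"OPEN THE DOOR", nil];

    // "MyVocabulary" is a placeholder base name for the generated files.
    NSError *result = [generator generateLanguageModelFromArray:words
                                                 withFilesNamed:@"MyVocabulary"];

    // Assumption: success is reported as an NSError whose code is noErr and
    // whose userInfo carries the paths of the generated files.
    if ([result code] != noErr) {
        NSLog(@"Language model generation failed: %@", [result localizedDescription]);
        return NO;
    }

    NSDictionary *paths = [result userInfo];
    *lmPath = [paths objectForKey:@"LMPath"];           // language model file
    *dicPath = [paths objectForKey:@"DictionaryPath"];  // matching dictionary file
    return YES;
}
```

The point of returning both paths is that they belong together: the dictionary produced here is the one you hand to PocketsphinxController along with the language model from the same call, rather than a separate dictionary file from somewhere else.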

The second issue is that in offline recognition, your vocabulary will only be recognized accurately if it contains roughly 1 to 500 words. That range is approximate and actual results will vary: I have seen accurate models of around 1000 words, and less-accurate models that were very small but consisted of very similar-sounding one-syllable words or words that are very uncommon in English. So it is not possible to use cmu07a.dic as a vocabulary with OpenEars, because it has about 80,000 words.