Reply To: How to get the best results with JSGF


Halle Winkler


1. Does the dictionary size matter?

I’m told by the Sphinx project (whose JSGF implementation this is) that it doesn’t. When you switch between grammars the dictionary grows regardless, because the current version of Sphinx has no mechanism for swapping in an entirely new dictionary, so the new words are added to the existing one. I believe this shouldn’t matter, since the search is constrained to the items in the grammar, and dictionary words outside the grammar shouldn’t be considered in the search even though they appear in the dictionary.

2. If I know that at a given point in time, only 10 words, for example, should be recognized, but there are 250 words total, should I have 25 different gram files and switch between them? Or create one large gram file? It seems, in my case, that smaller gram files produce more false positives.

My expectation would be that it’s better to switch between smaller grammars, but your own testing is the last word. If you are getting less accuracy with smaller grammars, use whichever arrangement gives you more accuracy.
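
If you want to compare the two arrangements in your own testing, a minimal sketch along these lines could generate both sets of grammar text. This is plain Python that only builds JSGF strings; it is not OpenEars or Sphinx API. The names build_jsgf and split_into_grammars, the placeholder vocabulary, and the consecutive grouping are all assumptions made for illustration. In a real app you would group the words by what can actually be said at each point.

```python
def build_jsgf(name, words):
    """Return JSGF text for a grammar that accepts exactly one of the given words."""
    alternatives = " | ".join(words)
    return (
        "#JSGF V1.0;\n"
        f"grammar {name};\n"
        f"public <command> = {alternatives};\n"
    )


def split_into_grammars(vocabulary, size):
    """Split the vocabulary into consecutive chunks and build one grammar per chunk."""
    grammars = {}
    for start in range(0, len(vocabulary), size):
        chunk = vocabulary[start:start + size]
        name = f"group{start // size}"  # hypothetical naming scheme
        grammars[name] = build_jsgf(name, chunk)
    return grammars


# Placeholder vocabulary standing in for the 250 real words.
vocabulary = [f"WORD{i}" for i in range(250)]

small = split_into_grammars(vocabulary, 10)   # 25 grammars of 10 words each
large = build_jsgf("everything", vocabulary)  # one grammar with all 250 words
```

Each string can be written to its own .gram file and run against the same set of test recordings, so the accuracy comparison is made on identical input.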

3. Does it help to add similar or dissimilar words to either the dictionary or the gram file to improve accuracy?

Nothing should be in the dictionary that isn’t in the grammar (in the case above with the growing dictionary, it’s unavoidable but it doesn’t provide any particular benefit). In my experience the grammar should just contain the items that are intended to be recognized.
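
For a hand-written grammar and dictionary, one rough way to check this is to compare the two word sets. The sketch below is an illustration only, not part of OpenEars or Sphinx. It assumes a simple JSGF file with each rule on a single line and no weights or tags, and a dictionary with one "WORD PHONEMES" entry per line, where alternate pronunciations are marked as WORD(2). The function names and the sample data are made up.

```python
import re


def grammar_words(jsgf_text):
    """Collect the literal words used in the rule bodies of a simple JSGF grammar."""
    words = set()
    for line in jsgf_text.splitlines():
        if "=" not in line:
            continue  # skip the header and the grammar declaration
        body = line.split("=", 1)[1]
        body = re.sub(r"<[^>]*>", " ", body)  # drop references to other rules
        words.update(re.findall(r"[A-Za-z']+", body))
    return words


def dictionary_words(dic_text):
    """Collect the words in a pronunciation dictionary, ignoring (2), (3) variants."""
    words = set()
    for line in dic_text.splitlines():
        if not line.strip():
            continue
        entry = line.split()[0]
        words.add(re.sub(r"\(\d+\)$", "", entry))
    return words


# Made-up sample data for illustration.
grammar = "#JSGF V1.0;\ngrammar demo;\npublic <command> = GO | STOP;\n"
dictionary = "GO\tG OW\nSTOP\tS T AA P\nSTOP(2)\tS T AO P\nLEFT\tL EH F T\n"

extra = dictionary_words(dictionary) - grammar_words(grammar)    # in dictionary only
missing = grammar_words(grammar) - dictionary_words(dictionary)  # in grammar only
```

Here extra contains LEFT, which is in the dictionary but not in the grammar, and missing is empty. A non-empty missing set is the more serious case, since a grammar word with no pronunciation can’t be recognized.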

OpenEars supports self-written JSGF, but it isn’t a topic I give much in-depth support for, because the usual way to create grammars in OpenEars is its own grammar specification language (which OpenEars can output to multiple lower-level formats such as JSGF or the RuleORama model type). Its advantage is that it supports all of the features that Sphinx JSGF supports, while also being dynamically generated from Cocoa types at runtime and easily human-readable. Take a look if you have a moment: