I started off using an ARPA model, but it seemed to require too much precision to get a match. I tried to tweak the model by adding similar and dissimilar words (chosen using both Metaphone 2 and Levenshtein distance) in various quantities, but my attempts never produced consistently better results: some scenarios improved while others had either too many false positives or too many false negatives. JSGF, because of its very limited vocabulary, *seems* to work better in general. Of course, it could simply be that I wasn’t optimizing the ARPA model in the right way.
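For reference, the Levenshtein part of that word-selection step can be sketched like this (a minimal dynamic-programming version; the candidate words and the idea of ranking them against a keyword are my own illustration, not anything OpenEars or lmtool provides):

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance between two words (insertions, deletions, substitutions)."""
    prev = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion from a
                            curr[j - 1] + 1,      # insertion into a
                            prev[j - 1] + cost))  # substitution (or match)
        prev = curr
    return prev[-1]

# Rank hypothetical candidate words by distance from a keyword:
candidates = ["ALER", "ANT", "ALWAYS", "AMAZES", "ZEBRA"]
for word in sorted(candidates, key=lambda w: levenshtein("ALERT", w)):
    print(word, levenshtein("ALERT", word))
```

Low-distance words are "similar" additions and high-distance words are "dissimilar" ones; Metaphone 2 would be a second, phonetic, similarity signal alongside this.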
The small grammar that performed worst was “(AN | ALERT | ANT | ALWAYS | AMAZES)”. “ALERT” would be falsely detected from background noise alone within 5-10 seconds of listening. However, adding another 200+ words starting with different letters of the alphabet gave significantly better results.
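For context, that small grammar written out as a full JSGF file would look roughly like this (the grammar and rule names are illustrative; OpenEars generates its own):

```
#JSGF V1.0;
grammar words;
public <word> = AN | ALERT | ANT | ALWAYS | AMAZES;
```

With only five alternatives, almost any utterance (or noise) gets forced onto the closest of them, which is presumably why “ALERT” fired so often until the alternative set was much larger.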
Yes, good clarification on which part of the software generates the grammars and dictionaries. As for performance, the biggest improvement came from pre-generating a 300-500 word model with lmtool rather than letting OpenEars build it at runtime. I don’t recall the exact numbers, but it *felt* like 2-3 seconds faster on a mobile device (load time and corpus size being positively correlated).