- This topic has 3 replies, 2 voices, and was last updated 8 years, 9 months ago by Halle Winkler.
June 21, 2014 at 7:15 am #1021723 cheapdevotion Participant
In my project, I need the user to be able to give some pretty complex commands based on a number of things. For this reason, I chose to go with a JSGF file. Accuracy with this method is fantastic, but most phrases take around 8-10 seconds for the hypothesis to come back. Are there any tips that could help me improve the performance when using this method? Is there a different approach that I should be using?
Here is a link to my JSGF file so you can see the kind of phrases that I am trying to create.
June 21, 2014 at 11:17 am #1021732 Halle Winkler Politepix
Your JSGF looks good, but the decrease in speed is part and parcel of JSGF/FSG on the device. I also think grammars are a good idea for many projects so in order to help with that, I’ve added a couple of major features and a plugin in recent versions, which might get you to the performance you’d like to see.
To start out, I developed a new standard grammar language for stock OpenEars that allows you to create dynamically generated grammars at runtime. You can read about it here:
And here is more about it from CMU speech:
Using this format instead of a single pre-written JSGF file can help you get that speed back by letting you work with multiple smaller grammars that are situationally appropriate depending on the UI branch the user is in. The grammars can be quickly switched between while the recognition loop is running, and generated very quickly from arbitrary input, so you can be very efficient creating the smallest possible grammars (and this will also help recognition). It will also, naturally, make it much faster and simpler to create and edit grammars.
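To illustrate the idea of several small, situation-specific grammars in place of one large one, here is a minimal sketch in Python. It does not use the OpenEars grammar format or any OpenEars API; it only emits plain JSGF text, one small grammar per UI branch. The branch names, the vocabulary, and the `build_jsgf` helper are all hypothetical.

```python
def build_jsgf(name, commands, targets):
    """Return JSGF text for a grammar that accepts '<command> <target>'."""
    def alternatives(phrases):
        # JSGF separates alternatives with '|' and groups them with parentheses.
        return "( " + " | ".join(phrases) + " )"

    return "\n".join([
        "#JSGF V1.0;",
        f"grammar {name};",
        f"public <{name}> = {alternatives(commands)} {alternatives(targets)};",
    ]) + "\n"


# Hypothetical UI branches, each with its own small vocabulary (assumption).
BRANCHES = {
    "navigation": (["SET COURSE FOR", "SCAN"], ["EARTH", "MARS"]),
    "engineering": (["DIVERT POWER TO", "SHUT DOWN"], ["SHIELDS", "ENGINES"]),
}

# One small grammar per branch instead of a single grammar covering everything.
grammars = {
    branch: build_jsgf(branch, commands, targets)
    for branch, (commands, targets) in BRANCHES.items()
}

print(grammars["navigation"])
```

Each generated grammar only covers the phrases that make sense in its branch, so the recognizer has far fewer paths to search at any one time. The app would load whichever grammar matches the branch the user is currently in.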
If that approach doesn’t help, the next option is to check out RuleORama:
RuleORama is a paid plugin that uses the same grammar language as stock OpenEars described above, but it creates a sort of pseudogrammar which is much faster to return results, as fast as just using a language model. Even with RuleORama you would probably want to reduce your one large grammar into a few smaller ones that are separated by the phase of the decision tree that the user is in, just because RuleORama currently has some hard limits on complexity.
June 21, 2014 at 9:50 pm #1021734 cheapdevotion Participant
Halle – thanks for getting back to me. Because my app is cordova based, I have written plugins for iOS (and pocketsphinx for Android), so I need a way to define my rules in a separate file that both systems can use.
Splitting up the rules a bit has helped, and my idea is to use the Sphinx knowledge base generator to create language model and dictionary files for each type of crew member, and swap them on the fly. I tested this method this morning, and I am getting the following error when trying to swap the files:
Error: couldn’t get attributes of language model file.
If I use the file when I initially call startListeningWithLanguageModelAtPath, it works fine; it only fails with the changeLanguageModelToFile method.
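For the per-crew-member setup described above, one way to keep a single rules file that both platforms share is to expand it into one plain-text corpus per crew type, with one sentence per line, and feed each corpus to the knowledge base generator. A rough sketch in Python follows; the JSON layout, crew types, and phrases are made up for illustration and are not taken from the actual project.

```python
import itertools
import json

# Hypothetical shared rules file contents (assumption: verbs x objects per crew type).
RULES_JSON = """
{
  "pilot":    {"verbs": ["SET COURSE FOR", "ORBIT"], "objects": ["EARTH", "MARS"]},
  "engineer": {"verbs": ["REPAIR", "SHUT DOWN"],     "objects": ["SHIELDS", "ENGINES"]}
}
"""


def corpus_for(rules):
    """Expand the verb and object lists into one sentence per entry."""
    return [
        f"{verb} {obj}"
        for verb, obj in itertools.product(rules["verbs"], rules["objects"])
    ]


def build_corpora(rules_json):
    """Map each crew type to its list of corpus sentences."""
    return {crew: corpus_for(rules) for crew, rules in json.loads(rules_json).items()}


corpora = build_corpora(RULES_JSON)
for crew, sentences in corpora.items():
    # Each list would be written to its own text file, one sentence per line.
    print(f"{crew}.txt:")
    print("\n".join(sentences))
```

This only covers preparing the input text. Generating the language model and dictionary from each corpus, and swapping them at runtime, is left to the existing tools.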
Thanks again!
June 21, 2014 at 9:59 pm #1021735 Halle Winkler Politepix
Understood – for the most part I can really only help troubleshoot the grammar and language model output from OpenEars’ generation methods, though. But, if you check out this post it will tell you about turning on OpenEarsLogging and verbosePocketsphinx, which is always a great first step to trying to locate the cause of an error:
If you share the complete logging output here I can take a look and see if there is a straightforward reason for the error.