Tagged: ARAP, grammar, JGSF, speech recognition
- This topic has 10 replies, 2 voices, and was last updated 8 years, 10 months ago by pbtura.
May 9, 2014 at 4:16 pm #1021172
I’m working on a project where I need to use different grammars depending on the context. I wanted to use JSGF rules for commands like navigating around the app and ARPA dictionaries for actually inputting a value. So, for example, I would use JSGF to listen for the command ‘ENTER NEW COLOR’ and ARPA to select ‘RED’, ‘GREEN’, ‘BLUE’, etc. The problem is, I can’t figure out how to switch between grammars on the fly. If I start the listening loop with JSGF set to true, I get errors any time I try to switch to an ARPA grammar or vice versa. Is there some way to handle this that I’m not seeing, or is that capability just not supported?
May 9, 2014 at 4:23 pm #1021173
Sorry, it isn’t possible to switch between JSGF and ARPA in the same listening loop. They use two different kinds of structures that are established at the beginning of the loop and can’t be changed on the fly without restarting all of the significant features of the loop, so this would require sending stopListening and then starting again with the new format. What is the reason for the requirement to take color input using ARPA instead of switching to a JSGF grammar with RED, GREEN and BLUE?
May 9, 2014 at 4:55 pm #1021174
For the input we want to have multiple matches shown to the user. So if it was a list of parts of a room, and the user says ‘DOOR’ and the matcher returns ‘floor’ and ‘door’ we would display both and let them choose. For the command context that behaviour isn’t desirable. If I say ‘NEW PART’, I want either an exact match or nothing; having it return ‘BLUE DART’ makes it much harder to react to commands.
May 9, 2014 at 5:02 pm #1021175
So if it was a list of parts of a room, and the user says ‘DOOR’ and the matcher returns ‘floor’ and ‘door’ we would display both and let them choose.
To clarify, are you referring to n-best here? It isn’t yet clear to me in what way the engine is returning two words resulting from a single utterance.
May 9, 2014 at 5:09 pm #1021176
I’m still new to OpenEars, but if this is what you are referring to: https://www.politepix.com/2012/11/08/openears-tips-2-n-best-hypotheses-with-openears/
then yes, that is what I want to do.
May 9, 2014 at 5:14 pm #1021177
OK, got it – I just wanted to talk it through so that I wasn’t trying to answer a different question due to not quite getting the goal.
If you absolutely require switching between a grammar and an ARPA model, the only recommendation I can make is to use RuleORama, since it outputs a DMP, which means you can switch between it and the normal ARPA model that you want to use n-best with. Unfortunately, it is neither possible to support switching between JSGF and ARPA inline nor to support n-best with JSGF, so I don’t see a way to do this precise spec with stock OpenEars.
May 9, 2014 at 5:19 pm #1021178
Thanks for the help. Are there any plans to support any of these features in the future? I was really hoping to be able to take advantage of the new JSGF features.
May 9, 2014 at 5:37 pm #1021179
No, those are both technically inadvisable features – they aren’t impossible, but one would require a rewrite of part of Pocketsphinx that isn’t on the Sphinx project’s to-do list and the other would require a major rewrite of the listening loop.
I was really hoping to be able to take advantage of the new JSGF features.
JSGF has actually been supported since version 1.0 – the new feature is the mini-language and method for dynamically generating grammars. RuleORama uses the same language and method type, so it doesn’t represent a loss of the ability to use those new features, but I understand if you don’t want to use a plugin to accomplish your task.
May 9, 2014 at 6:44 pm #1021180
One last question. I was reading this blog post: https://www.politepix.com/2012/12/04/openears-tips-and-tricks-5-customizing-the-master-phonetic-dictionary-or-using-a-new-one/
I can’t seem to locate the cmu07.dict file that it mentions. Where exactly is this file located?
May 9, 2014 at 6:53 pm #1021181
Correct, that’s a bit out-of-date since the acoustic models are now each wrapped in their own acoustic model bundle. For any acoustic model, the master phonetic lookup .dic file is now called LanguageModelGeneratorLookupList.text and is found in the bundle.
May 9, 2014 at 6:58 pm #1021182
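For anyone landing on this thread later, here is a rough sketch of the two ideas discussed above: stopping and restarting the listening loop whenever the model format changes, and treating command results as exact-match-or-nothing while keeping several n-best candidates for value input. It is written in plain Python purely to show the control flow. FakeRecognizer, switch_model, match_command, candidate_values and the file names are all hypothetical stand-ins, not the OpenEars API.

```python
from dataclasses import dataclass, field
from typing import Iterable, List, Optional, Set


@dataclass
class FakeRecognizer:
    """Hypothetical stand-in for a speech engine; NOT the OpenEars API."""
    listening: bool = False
    is_jsgf: Optional[bool] = None
    log: List[str] = field(default_factory=list)

    def start_listening(self, model_path: str, is_jsgf: bool) -> None:
        # The model format is fixed when the loop starts.
        if self.listening:
            raise RuntimeError("already listening; stop first")
        self.listening = True
        self.is_jsgf = is_jsgf
        self.log.append(f"start {model_path} jsgf={is_jsgf}")

    def stop_listening(self) -> None:
        self.listening = False
        self.log.append("stop")


def switch_model(rec: FakeRecognizer, model_path: str, is_jsgf: bool) -> None:
    """Stop the running loop (if any), then start again with the new model."""
    if rec.listening:
        rec.stop_listening()
    rec.start_listening(model_path, is_jsgf)


def match_command(hypothesis: str, commands: Set[str]) -> Optional[str]:
    """Command context: return the command on an exact match, else None."""
    text = hypothesis.strip().upper()
    return text if text in commands else None


def candidate_values(nbest: Iterable[str], vocabulary: Set[str]) -> List[str]:
    """Value context: keep each distinct known value, in n-best rank order."""
    seen: List[str] = []
    for hyp in nbest:
        text = hyp.strip().upper()
        if text in vocabulary and text not in seen:
            seen.append(text)
    return seen


if __name__ == "__main__":
    rec = FakeRecognizer()
    switch_model(rec, "commands.gram", True)          # hypothetical JSGF file
    print(match_command("new part", {"NEW PART"}))
    switch_model(rec, "parts.languagemodel", False)   # hypothetical ARPA file
    print(candidate_values(["floor", "door"], {"DOOR", "FLOOR"}))
```

In a real app the restart is not free, so it probably makes sense to switch only at clear context boundaries, such as right after a command like ‘ENTER NEW COLOR’ has been accepted.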