Dynamic Grammar Generation

This topic has 5 replies, 2 voices, and was last updated 9 years, 12 months ago by Halle Winkler.



Author

Posts
January 19, 2013 at 7:21 am #1015404

spietari
Participant

Hello,

I first generated a JSGF file for my grammar in my own code, with the following result:

#JSGF V1.0; grammar testgrammar; color = WHITE | BLACK | BLUE; public = PANTS COLOR color | SHIRT COLOR color;

Edit: The forum software seems to be removing my color tags. Anyway, wherever a lowercase “color” appears, it really means the word color inside angle brackets.

The nice thing about this format is that I can factor out the repeated word options.

Now, however, I started evaluating the RapidEars plugin and realised it doesn’t seem to work with JSGF grammars. So I switched to LanguageModelGenerator. Now my question is whether it’s better to supply the generateLanguageModelFromArray method with a list of the single words that comprise the sentences and let OpenEars/RapidEars figure out the sentences or should I give it a list of all the possible sentences? Is there a way of obtaining something similar to what I’m doing with JSGF using the LanguageModelGenerator?

Kind regards,
Seppo

January 19, 2013 at 7:27 am #1015407

spietari
Participant

#JSGF V1.0;
grammar testgrammar;
color = WHITE | BLACK | BLUE;
public testgrammar1 = PANTS COLOR color | SHIRT COLOR color;

Also the testgrammar1 tag was removed.

January 19, 2013 at 8:59 am #1015412

Halle Winkler
Politepix

Sorry about the tag removal, I haven’t yet figured out a way to let people paste their whole JSGF grammar without also allowing arbitrary HTML (which is a security issue).

> Now my question is whether it’s better to supply the generateLanguageModelFromArray method with a list of the single words that comprise the sentences and let OpenEars/RapidEars figure out the sentences or should I give it a list of all the possible sentences?

I would give it the entire sentences; this will cause LanguageModelGenerator to assign a higher probability to the word sequences that occur in those sentences.
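As a minimal sketch of what "all the possible sentences" looks like for the grammar in the first post, the rules can be expanded into a flat list of strings before handing them to the generator. This assumes the stripped rule names were `<color>` and `<testgrammar1>`; the `RULES` table and the `expand` helper are hypothetical illustrations, not part of OpenEars, and the approach only works for grammars without recursion.

```python
from itertools import product

# Assumed reconstruction of the grammar from the first post, with the
# stripped rule names restored as "<color>" and "<testgrammar1>".
RULES = {
    "<color>": [["WHITE"], ["BLACK"], ["BLUE"]],
    "<testgrammar1>": [
        ["PANTS", "COLOR", "<color>"],
        ["SHIRT", "COLOR", "<color>"],
    ],
}


def expand(symbol, rules):
    """Return every word sequence `symbol` can produce, as a list of strings.

    Hypothetical helper: handles only non-recursive rules.
    """
    if symbol not in rules:
        return [symbol]  # a plain word expands to itself
    sentences = []
    for alternative in rules[symbol]:
        # Expand each token, then join every combination of the expansions.
        parts = [expand(token, rules) for token in alternative]
        sentences.extend(" ".join(combo) for combo in product(*parts))
    return sentences


SENTENCES = expand("<testgrammar1>", RULES)

if __name__ == "__main__":
    for sentence in SENTENCES:
        print(sentence)
```

The resulting list is what would be passed to generateLanguageModelFromArray in place of the individual words.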

> Is there a way of obtaining something similar to what I’m doing with JSGF using the LanguageModelGenerator?

Not exactly, since JSGF grammars and ARPA language models address two very different design issues. Unfortunately, RapidEars doesn’t currently support JSGF. Lately, though, there has been a lot more usage of JSGF in OpenEars (I think because performance on the device has improved to the point that the performance hit for using JSGF isn’t as severe as it used to be), so I will give adding it to RapidEars some thought, even though JSGF will definitely be less rapid.

One thing you could try, and I guess this is how I would attempt to solve it, is to change the probabilities in your language model by hand: raise them for the bigrams and trigrams that represent the target sentences and lower them for the unigrams that represent “loose” words. You’ll need to use the LanguageModelGenerator-produced .arpa file as your language model rather than the .DMP, and open it in a text editor. I would start by changing only the probabilities of the trigrams (the probability is the value at the start of each line). The value is a base-10 logarithm of the probability, so it ranges from a negative number up to zero: values closer to zero mean a higher probability, and more negative values mean a lower one.
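The hand edit described above could also be scripted. The following is a rough sketch that assumes the standard ARPA text layout (section headers such as `\3-grams:`, with the log10 probability as the first field on each n-gram line). The `boost_ngrams` function and the sample model are hypothetical; a real LanguageModelGenerator file will contain different values. Note that shifting probabilities this way leaves the model un-normalized, which is equally true of editing it in a text editor.

```python
def boost_ngrams(arpa_text: str, order: int, delta: float) -> str:
    """Add `delta` to the log10 probability of every n-gram of one order.

    Hypothetical helper: a positive `delta` raises probabilities, a negative
    one lowers them. Results are clamped at 0.0 (i.e. probability 1).
    """
    section_header = f"\\{order}-grams:"
    out = []
    in_section = False
    for line in arpa_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("\\"):
            # Any backslash line (\data\, \2-grams:, \end\) opens or closes a section.
            in_section = stripped == section_header
            out.append(line)
            continue
        fields = stripped.split()
        if in_section and fields:
            # First field is the log10 probability; a trailing backoff
            # weight, if present, is left untouched.
            new_logprob = min(0.0, float(fields[0]) + delta)
            fields[0] = f"{new_logprob:.4f}"
            out.append(" ".join(fields))
        else:
            out.append(line)
    return "\n".join(out) + "\n"


# Tiny made-up model used only to exercise the function.
SAMPLE_ARPA = r"""\data\
ngram 1=3
ngram 2=2
ngram 3=1

\1-grams:
-0.7782 PANTS -0.3010
-0.7782 COLOR -0.3010
-0.7782 WHITE -0.3010

\2-grams:
-0.3010 PANTS COLOR -0.3010
-0.4771 COLOR WHITE -0.3010

\3-grams:
-0.4771 PANTS COLOR WHITE

\end\
"""

if __name__ == "__main__":
    print(boost_ngrams(SAMPLE_ARPA, order=3, delta=0.2))
```

Calling it once with `order=3` and a positive `delta`, then again with `order=1` and a negative `delta`, mirrors the suggestion to favor sentence trigrams over loose words.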

January 19, 2013 at 7:26 pm #1015422

spietari
Participant

Awesome! Thanks a lot. I’ll give it a try. Actually, it’s working very well as it is with only single words given to the generator, but with such a nice library one feels like tweaking it even more!

April 21, 2014 at 3:24 pm #1020912

Halle Winkler
Politepix

Please check out the new dynamic grammar generation language for OpenEars, added in version 1.7: https://www.politepix.com/2014/04/10/openears-1-7-introducing-dynamic-grammar-generation/

April 24, 2014 at 6:06 pm #1021023

Halle Winkler
Politepix

In addition to the dynamic grammar generation that has been added to stock OpenEars in version 1.7, there is also a new plugin called RuleORama which can use the same API in order to generate grammars which are a bit faster and compatible with RapidEars: https://www.politepix.com/ruleorama