How can I increase accuracy?


Viewing 6 posts - 1 through 6 (of 6 total)

  • Author
    Posts
  • #9637
    oganix
    Participant

    I’m new to OpenEars and trying to get the numbers 1-100 recognized very well. To be clear, these numbers make up my whole vocabulary. I have created a language model with the web tool and am getting OK results. How can I go about improving the accuracy? Can any of the following help me with that:

    1) Trying to create a better language model by using a different toolkit such as SRILM, MITLM, or IRSTLM
    2) Building an acoustic model with just the numbers 1-10
    3) Using LanguageModelGenerator
    4) Using JSGF instead of ARPA

    or is there any other thing I can try?

    Thanks in advance

    #9638
    Halle Winkler
    Politepix

    Hi,

    Yes, recognizing numbers in isolation seems to be a difficult task for speech recognition engines.

    1) Trying to create a better language model by using a different toolkit such as SRILM, MITLM, or IRSTLM

    3) Using LanguageModelGenerator

    Most language modeling software uses the same few underlying algorithms (or some subset of them), so I don’t think you need to do a lot of experimentation there. LanguageModelGenerator uses another good package, so you could probably just try out whether its output is preferable and then call it a day.
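    As an aside, the corpus for this particular vocabulary is small enough to generate programmatically rather than by hand. A minimal sketch in Python (this assumes the web tool accepts a plain-text corpus with one phrase per line; the exact spellings, e.g. “twenty one” vs. “twenty-one”, should match whatever your pronunciation dictionary uses):

    ```python
    # Spellings for 0-19 and the tens; index 0 entries are unused placeholders.
    ones = ["", "one", "two", "three", "four", "five", "six", "seven", "eight",
            "nine", "ten", "eleven", "twelve", "thirteen", "fourteen", "fifteen",
            "sixteen", "seventeen", "eighteen", "nineteen"]
    tens = ["", "", "twenty", "thirty", "forty", "fifty", "sixty", "seventy",
            "eighty", "ninety"]

    def number_to_words(n):
        """Spell out 1-100 as words, e.g. 21 -> 'twenty one'."""
        if n == 100:
            return "one hundred"
        if n < 20:
            return ones[n]
        word = tens[n // 10]
        if n % 10:
            word += " " + ones[n % 10]
        return word

    # One phrase per line, uppercased to match the usual dictionary conventions.
    corpus = [number_to_words(n).upper() for n in range(1, 101)]
    with open("numbers_corpus.txt", "w") as f:
        f.write("\n".join(corpus) + "\n")
    ```

    The resulting numbers_corpus.txt can then be submitted to the web tool, or used as the input for whichever language modeling toolkit you end up comparing.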

    2) Building an acoustic model with just the numbers 1-10

    Don’t you need 1-100? But you might want to investigate this approach and/or adapting the existing model with your new data: http://cmusphinx.sourceforge.net/wiki/tutorialadapt

    It seems like the task of creating an acoustic model that just recognizes 1-100 with a number of different voice contributors and accents is constrained enough to be feasible for an app project.

    4) Using JSGF instead of ARPA

    In my opinion, after some recent experimentation, JSGF is too slow for a good UX. Other developers do use it, so as I say, this is a matter of opinion. You can use the garbage loop approach for out-of-vocabulary rejection with ARPA just as with JSGF: http://sourceforge.net/p/cmusphinx/discussion/help/thread/cefe4df3 This could improve your results if the issue is too many false positives rather than too many false negatives or transposed recognitions.
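    For anyone weighing the two formats: where ARPA is a statistical model file, a JSGF grammar for this vocabulary would be a rule grammar along roughly these lines (a sketch only; the grammar and rule names are made up for illustration):

    ```jsgf
    #JSGF V1.0;

    grammar numbers;

    public <number> = <under_twenty> | <tens> [ <digit> ] | one hundred;

    <digit>        = one | two | three | four | five | six | seven | eight | nine;
    <under_twenty> = <digit> | ten | eleven | twelve | thirteen | fourteen
                   | fifteen | sixteen | seventeen | eighteen | nineteen;
    <tens>         = twenty | thirty | forty | fifty | sixty | seventy | eighty | ninety;
    ```

    The optional `[ <digit> ]` after `<tens>` is what lets the same rule cover both “forty” and “forty one”.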

    #9643
    oganix
    Participant

    Thanks a lot for the quick response. Yes, above I meant 1-100, not 1-10.

    #9644
    Halle Winkler
    Politepix

    No problem. There is another potential complication that isn’t immediately obvious, but that I’ve been trying to make a point of mentioning more frequently here: a lot of developers specify apps with the idea that the device can be pretty far away from the user. This actually gives the device speech recognition task an additional disadvantage that a desktop speech recognition application would be unlikely to have, namely a big mismatch between the design of the available microphone and the use being made of it. You can even see this with Siri if you open Notes and do dictation from a distance; return time from the server will get slower and accuracy will decrease, because the iPhone mic is designed to be spoken into directly and to reject “background noise” — which might be your user, if they are far enough away and there are competing sounds.

    This isn’t as big a deal with command-and-control language models/grammars, but once you’re past 20 words or so you can start to see an impact. So another approach is to see if you can educate your users not to put too much distance between themselves and the device during app use.

    #9645
    oganix
    Participant

    Good point. Thanks again.

    #1021813
    Halle Winkler
    Politepix

    Just wanted to follow up here that there is now a great method for doing dynamic JSGF grammars built into OpenEars: https://www.politepix.com/2014/04/10/openears-1-7-introducing-dynamic-grammar-generation/
