| Author | Posts |
|---|---|
| August 20, 2011 at 3:10 pm #7490 | |
|
vthinkingstudio |
Dear Halle, I am satisfied with the recognition accuracy of my hand-made dictionary and language model. But the accuracy dropped to 20% when I used hub4.5000.dic and hub4.5000.DMP as the manual instructed. I wonder if the low recognition accuracy is caused by my misusing them. 1. I added hub4.5000.dic and hub4.5000.DMP to the project directly, then updated the code. Am I doing this right? 2. The entries in hub4.5000.dic are all lower case, while the samples are all upper case. Is that OK? Also, the dic has more than 5000 words (it contains 6400+), which stopped the tool at http://www.speech.cs.cmu.edu/tools/lmtool.html from working. Is there anywhere else that provides a dic file with fewer than 5000 words? Thanks for your time. vThinking Studio |
| August 20, 2011 at 4:04 pm #7491 | |
|
vthinkingstudio |
Forget issue 2, I figured that out with further use. hub4.5000.dic is a generated file; the 1400+ extra entries are there because some words have multiple pronunciations. The lower-case dictionary is converted to upper case after conversion. I still wonder if I can get the original 5000-word dictionary, so that I can screen out some unwanted words easily. By the way, http://www.speech.cs.cmu.edu/tools/lmtool.html is not working; it returns a blank page after I click compile. I have not figured out issue 1 by myself, please help me.
|
| August 22, 2011 at 8:49 am #7493 | |
|
Halle |
It is probably due to the language model being a mismatch for your application requirements. |
| August 23, 2011 at 2:02 am #7501 | |
|
vthinkingstudio |
Thanks, I will look into it. |
| August 24, 2011 at 9:44 am #7504 | |
|
Halle |
OK, I also wanted to clarify something about the big language model — I have included it in the instructions because it is very frequently requested and if I don’t offer a pre-configured large LM I am only going to end up answering many questions about where there is a pre-configured large LM and how to add it :) . Many developers want to do local speech recognition of “every word the user says”. However, large vocabulary recognition for dictation on a mobile platform is not really a plug-and-play kind of problem to solve (if it were, there would be no Apple + Nuance story). So, I pretty much always advise OpenEars developers to find a way to use smaller specific models for their requirements, and that’s the main reason why I’ve put a lot of development focus into ARPA model generation capabilities and LM switching. |
| August 24, 2011 at 11:32 am #7506 | |
|
vthinkingstudio |
Thanks for the further explanation. I wonder what the threshold for a big dictionary is? I managed to create a dic and LM of about 300 words, and the result was unsatisfactory. BTW, are you suggesting that this library works better on the desktop? |
| August 24, 2011 at 11:47 am #7508 | |
|
Halle |
I honestly think it is very specific to your application. Test, test, and then do some testing :) .
Effectively, yes. If you were running this on the desktop, you could either use Sphinx4 instead, or use Pocketsphinx with search settings that would absolutely kill performance on the phone, and you could use an acoustic model with much more data in it because there’s no problem having it in memory. On the other hand, with the identical library, acoustic model and arguments, the desktop will give the same results, just much faster. |
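
For anyone who wants to recover the plain word list from a generated dictionary like hub4.5000.dic, here is a minimal Python sketch. It assumes the file follows the usual CMU dictionary convention, where an alternate pronunciation repeats the word with a numbered suffix such as `read(2)`. The function name `unique_words` and the sample entries are made up for illustration and are not taken from the actual file.

```python
import re

# Matches an alternate-pronunciation suffix such as "(2)" at the end of a headword.
_ALT_SUFFIX = re.compile(r"\(\d+\)$")


def unique_words(dic_lines):
    """Return the distinct headwords of a CMU-style .dic, upper-cased, in first-seen order."""
    seen = set()
    words = []
    for line in dic_lines:
        parts = line.split()
        if not parts:
            continue  # skip blank lines
        # The first field is the word; the remaining fields are phonemes.
        word = _ALT_SUFFIX.sub("", parts[0]).upper()
        if word not in seen:
            seen.add(word)
            words.append(word)
    return words


# Illustrative entries only (hypothetical, not copied from hub4.5000.dic).
sample = [
    "a AH",
    "a(2) EY",
    "read R EH D",
    "read(2) R IY D",
    "the DH AH",
    "the(2) DH IY",
]

print(unique_words(sample))  # ['A', 'READ', 'THE']
```

To run it on a real file, pass the open file object, e.g. `with open("hub4.5000.dic") as f: words = unique_words(f)`. The resulting list should be shorter than the raw entry count and can then be trimmed by hand before building a smaller language model.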