Internal error (No such file or directory) on corpus. Alternatives?

This topic has 3 voices, contains 8 replies, and was last updated by  sefiroths 259 days ago.

August 31, 2011 at 3:25 pm #7556

sefiroths

I have read the tutorial “Using OpenEars In Your App”.
I have created a corpus.txt that contains only one word: “KAIOKEN”. I tried to upload it to
http://www.speech.cs.cmu.edu/cgi-bin/tools/lmtool.2.pl
but it gives me this error:

Internal error (No such file or directory) on corpus.
Please try again or get in touch with maintainer (air+tools.cmu.edu).
[ /usr/wwwsrv//www-htdocs/tools/test/product/1314799880_14204/1941.corpus ]
Terminating process.

I tried uploading it to version 3 (link on the same page)… and got this result:
Not Found
The requested URL /tools/product/1314800261_14244/ was not found on this server.

I have read:
http://cmusphinx.sourceforge.net/wiki/tutoriallm
http://cmusphinx.sourceforge.net/wiki/tutorialdict

But when I download version 7 of cmuclmtk, it gives me 2 executables, which are not the ones used in the examples:
text2wfreq weather.tmp.vocab
sphinx_lm_convert -i weather.arpa -o weather.lm.DMP
………………….
So… for my test project I’d like to recognize only one word: “KAIOKEN”.
Is it sufficient to use:
- (NSError *) generateLanguageModelFromArray:(NSArray *)languageModelArray withFilesNamed:(NSString *)fileName; ?
Or should I do something more?
Thanks

August 31, 2011 at 7:48 pm #7557

Halle

Hello,

These are all CMU Sphinx issues other than your question at the very end, so it would be good for you to let them know that the language tool is offline or broken and that the cmuclmtk instructions aren’t working for you.

To answer your question, yes you can use LanguageModelGenerator to create a language model that uses your one word, but it doesn’t look like a word in English or a common name in the English-speaking sphere, which means that neither the CMU tool nor LanguageModelGenerator is likely to give you a correct phonetic dictionary entry for it. You will probably have to tweak it by hand according to the pronunciation you know for it. But LanguageModelGenerator will give you the same quality of language model that the CMU language tool would if it were online.

September 1, 2011 at 8:16 am #7558

sefiroths

Is there a tutorial on how to tweak it?
I have seen in the sample project that a word like “QUIDNUC” can be recognized.

September 1, 2011 at 8:23 am #7559

Halle

There is a fallback method of generating pronunciations in LanguageModelGenerator for words that aren’t in the dictionary, but it is basically guessing and can result in wrong entries for words that don’t follow conventional English pronunciation patterns. I don’t know of a tutorial on how to create phonetic pronunciations, but when I need to create a custom pronunciation, I just open up cmu07a.dic, look for other words which have the same sounds in them as the word I need to create an entry for, and use their phonemes. For instance, yesterday I needed to create a dictionary that had “Xcode” in it, so I looked up “x” and “code” and created a new entry in my custom dictionary which used their phonemes together as a single word. Not too hard.
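
As a rough illustration of the lookup-and-concatenate approach described above, here is a minimal Python sketch. It is not part of OpenEars: the helper name `compose_pronunciation` and the tiny in-memory dictionary are hypothetical stand-ins for a real lookup in cmu07a.dic, and the tab separator in the output line is an assumption that should be checked against the dictionary file actually in use. The two entries shown are the usual CMU dictionary pronunciations for “X” and “CODE”.

```python
# Hypothetical excerpt standing in for cmu07a.dic (word -> list of phonemes).
DICTIONARY_EXCERPT = {
    "X": ["EH", "K", "S"],
    "CODE": ["K", "OW", "D"],
}


def compose_pronunciation(new_word, parts, dictionary):
    """Build a dictionary line for new_word by joining the phonemes of known words."""
    phonemes = []
    for part in parts:
        key = part.upper()
        if key not in dictionary:
            raise KeyError(f"no dictionary entry for {part!r}")
        phonemes.extend(dictionary[key])
    # Assumed line format: WORD, a tab, then space-separated phonemes.
    return f"{new_word.upper()}\t{' '.join(phonemes)}"


entry = compose_pronunciation("Xcode", ["x", "code"], DICTIONARY_EXCERPT)
print(entry)
```

The resulting line still needs a sanity check by ear, since sounds at the boundary between the borrowed words may merge or change in the real pronunciation.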

September 1, 2011 at 3:55 pm #7562

sefiroths

So it could be something like: K AY OW HH K EH N…
I’d also like to write K AY OH K EH N, but OH doesn’t seem to be an existing phoneme…
As I’m not a native English speaker, I don’t know if my pronunciation is good. Is there a tool that reads a given phonetic pronunciation aloud (phonetic pronunciation to speech)?
Thanks

September 1, 2011 at 4:37 pm #7564

sefiroths

After that, do I have to run something like:
sphinx_lm_convert -i wordlist.arpa -o wordlist.lm.DMP?
Or should I use JSGF grammars?
Thanks

September 1, 2011 at 5:31 pm #7566

Halle

“Is there a tool that reads a given phonetic pronunciation aloud (phonetic pronunciation to speech)?”

I don’t know of one offhand, sorry.

“After that, do I have to run something like:
sphinx_lm_convert -i wordlist.arpa -o wordlist.lm.DMP?
Or should I use JSGF grammars?”

How to use a language model with OpenEars is explained in the OpenEars docs, good luck!

September 2, 2011 at 12:09 am #7572

Joseph S. Wisniewski

Are you trying to build a system that only recognizes one word, i.e. a “word spotter” that reacts to that one word while running a continuous recognition loop? If so, I’m pretty sure that will break the language modeling tools, which are going to end up dividing by zero when figuring out the probabilities for that word.

In that sample app, “quidnuc” was being recognized along with 7 other words, so the LM can chew on that.

You should be able to do it quite easily through a JSGF grammar. Personally, if it were only one word, I’d post either here or on the sphinx forum to get someone to do a phonetic transcription for you. Here, because I’m in a good mood…

K AY OW K EH N
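
To make the remaining steps concrete, the following Python sketch checks a candidate pronunciation against the 39-phoneme set used by the CMU dictionary (which shows why “OH” is rejected while “OW” is accepted), then builds the text of a dictionary line and of a one-word JSGF grammar. The function names, the grammar name `kaioken`, and the rule name `<command>` are hypothetical, and the tab separator in the dictionary line is an assumption. How the resulting files are loaded into OpenEars is not shown here and should be taken from the OpenEars docs.

```python
# The 39 phonemes used by the CMU Pronouncing Dictionary (stress markers omitted).
CMU_PHONEMES = set(
    """AA AE AH AO AW AY B CH D DH EH ER EY F G HH IH IY JH K L M N NG
    OW OY P R S SH T TH UH UW V W Y Z ZH""".split()
)


def invalid_phonemes(pronunciation):
    """Return the symbols in a space-separated pronunciation that are not CMU phonemes."""
    return [p for p in pronunciation.split() if p not in CMU_PHONEMES]


def dictionary_line(word, pronunciation):
    """Return 'WORD<TAB>PHONEMES', refusing pronunciations with unknown symbols."""
    bad = invalid_phonemes(pronunciation)
    if bad:
        raise ValueError(f"unknown phonemes: {bad}")
    return f"{word.upper()}\t{' '.join(pronunciation.split())}"


def single_word_jsgf(grammar_name, word):
    """Return the text of a JSGF grammar whose only public rule matches one word."""
    return (
        "#JSGF V1.0;\n"
        f"grammar {grammar_name};\n"
        f"public <command> = {word.upper()};\n"
    )


print(invalid_phonemes("K AY OH K EH N"))   # symbols that are not in the set
print(dictionary_line("kaioken", "K AY OW K EH N"))
print(single_word_jsgf("kaioken", "kaioken"))
```

A grammar of this kind sidesteps the n-gram probability estimation entirely, which is why it suits the single-word case better than a statistical language model.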

September 2, 2011 at 11:18 am #7574

sefiroths

Thanks for all the suggestions.
I’ll try!
