Way to see phonemes OpenEars heard

Viewing 4 posts - 1 through 4 (of 4 total)

  • #7598
    matth
    Participant

    I am using OpenEars in an iOS app to recognize individual numbers (ONE, TWO, THREE, etc.) spoken by children, but I’m getting pretty poor recognition (using OpenEars 0.912, mostly on an iPad). About my only hope of getting this to work better is to put in additional entries in my .dic file for alternate pronunciations of the words (which I’ve done a little) in hopes of “teaching” it how a kid would say those words. However, I really don’t know what to put for the alternate pronunciations.

    Is there any way to see what set of phonemes OpenEars thought it heard? Then I could just show that somewhere as the app is being used and I could see what pronunciations I should add for each of the numbers.

    Thanks in advance for any insights.

    #7599
    Halle Winkler
    Politepix

    Raw phoneme output is something that I think only Sphinx 3 provides, and IIRC with several caveats. I believe the task you are attempting is known to be a very difficult one to get good results for. There is an acoustic model called tidigits in [OPENEARS]/CMULibraries/pocketsphinx-0.6.1/model/hmm/en/tidigits, with an accompanying language model in [OPENEARS]/CMULibraries/pocketsphinx-0.6.1/model/lm/en, that I think is specifically oriented towards recognizing numbers. You might want to try it instead of hub4wsj_sc_8k and your custom LM, although I’ve never used it myself so I can’t make any promises.

    #7604

    I do this as a diagnostic technique.

    * Build a dictionary with 40 words, each word being just one of the CMU phonemes
    * Build a language model or FSG where each of the words can follow any other word
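
    A minimal sketch of the two steps above, in Python. The phone list is an assumption (the 39 stress-free CMUdict phones; add or remove units so that it matches the phone set of your acoustic model), and the helper names and the grammar name `phoneloop` are made up for illustration. Converting the JSGF text into whatever grammar format your OpenEars version accepts is not shown.

```python
# Assumed phone set: the 39 stress-free CMUdict (ARPAbet) phones.
# Replace it with the phone list of the acoustic model actually in use.
PHONES = """AA AE AH AO AW AY B CH D DH EH ER EY F G HH IH IY JH K L M N NG
OW OY P R S SH T TH UH UW V W Y Z ZH""".split()


def build_dictionary(phones):
    """Return .dic text: one 'word' per phone, pronounced as that phone."""
    return "".join(f"{p}\t{p}\n" for p in phones)


def build_jsgf(phones, name="phoneloop"):
    """Return JSGF text in which any phone-word can follow any other."""
    alternatives = " | ".join(phones)
    return (
        "#JSGF V1.0;\n"
        f"grammar {name};\n"
        f"public <{name}> = ( {alternatives} )+;\n"
    )


if __name__ == "__main__":
    # Print both artifacts; redirect or write them to files as needed.
    print(build_dictionary(PHONES))
    print(build_jsgf(PHONES))
```

    Each hypothesis then comes back as a string of phone names rather than number words, which is what makes it usable for guessing alternate pronunciations.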

    Fair warning, the results will be very strange.

    tidigits is a very “clean” grammar: it works well with fluent speech. And, like most 8k models, it’s most effective on adult males. Kids work best with 16k models (so do women). I’d suggest switching over to VoxForge 0.4, from the CMU site. To make this work well, though, you really need models built from kid speech.

    You could try model adaptation.

    #7606
    Halle Winkler
    Politepix

    The issue with the VoxForge models is that they aren’t license-compatible with App Store distribution.
