Proper use of OpenEars dictionary


Viewing 2 posts - 1 through 2 (of 2 total)

  • #1019189
    daniel
    Participant

    Hey guys,
    We have an online e-commerce app in which we want to embed speech recognition using OpenEars.

    Our problem is that we have about 14,000 words that we need to recognize.

    We saw that you recommend using a dictionary of about 300 words.

    Can we extend it?

    One more thing. Your examples show that the dictionary is defined by hard-coding the words into an array:
    NSArray *words = [NSArray arrayWithObjects:@"WORD", @"STATEMENT", @"OTHER WORD", @"A PHRASE", nil];

    Is that the only option?

    Thank you very much

    Daniel

    #1019192
    Halle Winkler
    Politepix

    Welcome Daniel,

    You can also import a vocabulary list from a text file using the method:

    - (NSError *) generateLanguageModelFromTextFile:(NSString *)pathToTextFile withFilesNamed:(NSString *)fileName forAcousticModelAtPath:(NSString *)acousticModelPath;

    When you have a very large vocabulary that you want to recognize accurately using offline recognition, what you want to think about is how to split it up into multiple smaller language models. If you were showing users a website that listed 14,000 items, you wouldn’t show all 14,000 items on a single page and ask them to click. Instead you would have them navigate a short product hierarchy, e.g. “Technology”, then “Smartphones”, then “iPhone 5S”. So in the first step, if there are 14 departments, you are already excluding 13,000 items on average. In the second step, if there are 10 subsections, you are excluding another 900 items on average, and you are left with 100 smartphones that the user can view and click on, and they choose the iPhone 5S.

    For accuracy, you use the same trick to reduce the search space with speech. You can have users first state a major category, and then use language model switching to substitute a language model that contains only the items from that category. If needed, you could drill down once more, but depending on the items, a vocabulary of 1,000 might just work, so I’d do a bit of experimentation and see.
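
    As a rough illustration of the splitting idea, here is a minimal Python sketch of a build-time helper that could prepare one word-list file per category, plus the arithmetic from the example above. It is not part of OpenEars. The function names, the (category, phrase) input shape, the file names, and the one-phrase-per-line, uppercase layout are all assumptions made for illustration; check the OpenEars documentation for the exact text format the text-file method expects.

```python
from collections import defaultdict
from pathlib import Path


def split_vocabulary(catalog, out_dir):
    """Write one word-list file per top-level category.

    catalog: iterable of (category, phrase) pairs (hypothetical input shape).
    Returns a dict mapping each category (and "__top__") to the file written.
    """
    by_category = defaultdict(set)
    for category, phrase in catalog:
        # Uppercase to mirror the array example in the thread (assumption).
        by_category[category].add(phrase.strip().upper())

    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    paths = {}
    # The first-step vocabulary only needs the category names themselves.
    top = out_dir / "categories.txt"
    top.write_text("\n".join(sorted(c.upper() for c in by_category)) + "\n")
    paths["__top__"] = top

    # One small vocabulary per category, one phrase per line.
    for category, phrases in by_category.items():
        path = out_dir / (category.lower().replace(" ", "_") + ".txt")
        path.write_text("\n".join(sorted(phrases)) + "\n")
        paths[category] = path
    return paths


def average_remaining(total_items, *branching):
    """Average number of candidates left after each narrowing step."""
    remaining = total_items
    steps = []
    for n in branching:
        remaining = remaining / n
        steps.append(remaining)
    return steps
```

    With the numbers from the example, average_remaining(14000, 14, 10) returns [1000.0, 100.0]: about 1,000 items after choosing a department and about 100 after choosing a subsection. Each file written by split_vocabulary could then, in principle, be turned into its own small language model, with the app switching between them as the user narrows down.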
