about using a custom dictionary

  • #1029893
    xwang
    Participant

    Hi Halle,
    I read the following thread:
    https://www.politepix.com/forums/topic/question-about-using-a-custom-dictionary/
    I want to use my own custom dictionary, so I created an acoustic model bundle containing only LanguageModelGeneratorLookupList.text.

    In (generateLanguageModelFromArray) I pass the path of my own acoustic model bundle (which is located in Caches), and in (startListeningWithLanguageModelAtPath) I pass the original bundle path.

    This worked perfectly before I upgraded to the newest version of OpenEars (with RapidEars and Rejecto).
    It still works after the upgrade, but it now prints the following error:

    Error: an attempt was made to load the g2p file for the acoustic model at the path /var/mobile/Containers/Data/Application/3FFAA228-0764-48B7-BAF6-42CE5AAB2717/Library/Caches/AcousticModelEnglish1.bundle and it wasn’t possible to complete.

    I don’t know whether I need to pay attention to this error, since the engine still works as before.

    BTW: my "bundle" is just a directory named AcousticModelEnglish1.bundle into which I copied LanguageModelGeneratorLookupList.text.

    #1029895
    Halle Winkler
    Politepix

    Hello,

    What is the goal of using the custom dictionary file? This may affect my advice a little bit.

    #1029914
    xwang
    Participant

    Our thinking was that we don’t need to recognize every word, so the lookup list only needs to contain some of them, and different users end up with different words in their lists.
    The list also contains entries such as
    OPENWINDOW OW P AH N W IH N D OW
    which do not exist in the original list.

    #1029915
    Halle Winkler
    Politepix

    OK, the reason I ask is that a lot of effort has gone into making the use of the lookup list extremely fast. Unless you have tested the timing and found a big difference between your list and the default list, reducing the list usually doesn’t accomplish very much, and removing words can of course lead to lower accuracy. Adding words (as mentioned in the blog post referenced) is the expected/supported usage, since it has the potential to increase accuracy. To add words, just use the same acoustic model that ships with OpenEars 2.5 and add your words to its language model lookup list in the alphabetically correct location.

    Can I ask the amount of speed improvement in model generation time you saw by using a lookup list with words removed?

    You probably already know this, but to clarify for other readers who find this topic: the lookup lists aren’t the dictionary used during recognition and have no effect at all on recognition speed or accuracy once the model has been generated – the dictionary used during recognition is a newly generated dictionary which is already reduced only to the words needed by the vocabulary. The lookup list is only used very briefly during the generation of the language model.
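    For readers who want to script the "add your words in the alphabetically correct location" step described above, here is a minimal sketch in Python. It is not part of OpenEars. The helper name insert_entry is made up for illustration, and it rests on assumptions that should be checked against the real LanguageModelGeneratorLookupList.text: that each line is a word followed by whitespace and its phonemes, that a tab separates them, and that the list is sorted by plain string comparison of the word.

```python
import bisect


def insert_entry(lines, word, phonemes, sep="\t"):
    """Return a new list with 'WORD<sep>PHONEMES' inserted in sorted position.

    Hypothetical helper, not an OpenEars API. Assumes each line begins with
    the word, followed by whitespace and the phonemes, and that the input is
    already sorted by word using plain string comparison.
    """
    # Drop blank lines so every remaining line has a word to compare on.
    entries = [line for line in lines if line.strip()]
    word = word.upper()
    # The sort key is the first whitespace-delimited token of each line.
    keys = [line.split(None, 1)[0] for line in entries]
    idx = bisect.bisect_left(keys, word)
    if idx < len(keys) and keys[idx] == word:
        # Word already present: return the entries unchanged.
        return entries
    return entries[:idx] + [f"{word}{sep}{phonemes}"] + entries[idx:]


existing = ["OPEN\tOW P AH N", "WINDOW\tW IH N D OW"]
updated = insert_entry(existing, "OPENWINDOW", "OW P AH N W IH N D OW")
```

    With the two sample entries above, the new OPENWINDOW line lands between OPEN and WINDOW. If the shipped list turns out to use a different separator or ordering, the sep argument and the comparison would need to be adjusted to match.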
