consistent acousticModelName check for g2p

Tagged: AcousticModelEnglish OELanguageModelGenerator g2p

This topic has 3 replies, 2 voices, and was last updated 6 years, 6 months ago by Halle Winkler.

Viewing 4 posts - 1 through 4 (of 4 total)

Author

Posts
October 18, 2017 at 11:56 am #1032099

Coeur
Participant

I’m experimenting with many English acoustic models for comparisons.
But currently I’m limited to three bundles because of the restriction from OELanguageModelGenerator.m:

if ([acousticModelName isEqualToString:@"AcousticModelEnglish"] || [acousticModelName isEqualToString:@"AcousticModelAlternateEnglish1"] || [acousticModelName isEqualToString:@"AcousticModelAlternateEnglish2"]) return nil; // These shouldn't load g2p

Could we allow an arbitrary number of bundles by instead having the same check in OELanguageModelGenerator.m as done in OEGraphemeGenerator.m?

if (([acousticModelName rangeOfString:@"AcousticModelEnglish"].location != NSNotFound) || ([acousticModelName rangeOfString:@"AcousticModelAlternateEnglish"].location != NSNotFound)) return nil; // These shouldn't load g2p

That way, I would be able to name my bundles “AcousticModelEnglishAdjusted” or “AcousticModelAlternateEnglish3”, “AcousticModelAlternateEnglish4”, “AcousticModelAlternateEnglish5”, …
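To make the difference between the two checks concrete, here is a minimal sketch that wraps each quoted condition in a standalone function. The function names `IsEnglishModelExact` and `IsEnglishModelSubstring` are hypothetical and do not appear in OpenEars; only the conditions themselves are taken from the lines quoted above.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: the exact-match condition quoted from OELanguageModelGenerator.m.
static BOOL IsEnglishModelExact(NSString *acousticModelName) {
    return [acousticModelName isEqualToString:@"AcousticModelEnglish"]
        || [acousticModelName isEqualToString:@"AcousticModelAlternateEnglish1"]
        || [acousticModelName isEqualToString:@"AcousticModelAlternateEnglish2"];
}

// Hypothetical helper: the substring condition quoted from OEGraphemeGenerator.m.
static BOOL IsEnglishModelSubstring(NSString *acousticModelName) {
    return ([acousticModelName rangeOfString:@"AcousticModelEnglish"].location != NSNotFound)
        || ([acousticModelName rangeOfString:@"AcousticModelAlternateEnglish"].location != NSNotFound);
}
```

With these, “AcousticModelAlternateEnglish3” and “AcousticModelEnglishAdjusted” fail the exact check but pass the substring check. Note that the substring check also accepts any name that merely contains one of the two strings, such as a prefixed name.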

Thanks.

October 19, 2017 at 4:57 pm #1032104

Halle Winkler
Politepix

I hear you and I don’t find your request unreasonable. The method doesn’t do this because supporting many more languages than English and Spanish required the fairly big step of designing a g2p approach and packaging system that a phone CPU could use at near-realtime speeds. That meant I couldn’t really continue with the roll-your-own approach to acoustic models that was previously possible, because I’m making those g2p files myself with internal tooling. English is an exception because it uses a different g2p method than all the other languages, but I have only had bad experiences with providing support for homemade acoustic models which use the new packaging system. It’s a tradeoff; the (very big) upside to moving away from using the exact Sphinx acoustic model approach is that other languages can now be used with dynamic models and fallback g2p systems, and they are performant.

So, in short, the names that are accepted in that method are the names of the models that can be found here which already have the right packaging and whose characteristics I know about, and which I am willing to support. I can’t support other models and I would be hesitant to create a “bring your own model” API because I’m the one who is going to have to play 20 questions when some developers have weird results from making their own models but don’t lead with that information. My recommendation/request in order to keep things manageable for both of us is to either rename your models for the period of your experimentation so that they are allowed through the method, or make your method change locally until you’re satisfied with the results of your experiments. I hope this explanation is helpful to understanding why that method is unexpectedly picky.

October 20, 2017 at 6:29 am #1032105

Coeur
Participant

I understand there is a concern about support. I’ll try to work around that by keeping the same bundle name but using different folders:

/1/AcousticModelEnglish.bundle
/2/AcousticModelEnglish.bundle
/3/AcousticModelEnglish.bundle
etc.
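A minimal sketch of how such paths might be built, using only Foundation string helpers. Both function names are hypothetical, and the second one rests on an assumption: that the model name being checked is the bundle's file name without its extension. The thread does not say how OpenEars actually derives `acousticModelName`.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: path to one variant's bundle, e.g. <root>/2/AcousticModelEnglish.bundle.
static NSString *ModelPathForVariant(NSString *rootDirectory, NSString *variantFolder) {
    return [[rootDirectory stringByAppendingPathComponent:variantFolder]
            stringByAppendingPathComponent:@"AcousticModelEnglish.bundle"];
}

// Assumption (not confirmed by the thread): the name is the last path
// component with the ".bundle" extension removed.
static NSString *ModelNameFromPath(NSString *modelPath) {
    return [[modelPath lastPathComponent] stringByDeletingPathExtension];
}
```

Under that assumption, every variant folder yields the same name, “AcousticModelEnglish”, so each bundle would pass the exact-match check while remaining a distinct model on disk.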

October 20, 2017 at 8:14 am #1032106

Halle Winkler
Politepix

Super, that seems like an ideal solution.

The topic ‘consistent acousticModelName check for g2p’ is closed to new replies.