OpenEars-compatible acoustic models

These are the acoustic models that can be used with OpenEars speech recognition and with all OpenEars plugins other than NeatSpeech.

To use one, download it from this page, unzip it, and drag the unzipped bundle into your app, adding it to your app target. The default, supported English-language acoustic model ships with the OpenEars distribution, so it doesn’t need to be downloaded from this page or added manually beyond the standard installation process.

Recognition accuracy and processing speed vary significantly from model to model, so support is provided on a best-effort basis only. Some alternate English models are also included so you can experiment and see whether one works better for your use case.

For good recognition results, please read the note at the end of this page about the vadThreshold setting for non-English acoustic models and about 8k mode. TL;DR: you must find a vadThreshold value that works well with the acoustic model, and a value of around 3.5 or higher is a good starting point. You may also get better results from 8k mode, which you enable by setting OEPocketsphinxController’s use8kMode property to true before starting recognition. Before setting any value of OEPocketsphinxController, remember to call [[OEPocketsphinxController sharedInstance] setActive:TRUE error:nil] or the Swift equivalent as found in the docs.

Please note that it is the app developer’s responsibility to verify that the licenses for these acoustic models, which are found in the bundles, are compatible with her or his project. Politepix offers absolutely no warranty for these models, the nature of their licensing, or their suitability for an app.


Commercial-friendly license models

Each of these models ships with a license that appears to be friendly to commercial use.

AcousticModelChinese.bundle
AcousticModelFrench.bundle
AcousticModelSpanish.bundle
AcousticModelGerman.bundle


There are also some commercial-friendly alternative English models you can try out if the shipping English model isn’t ideal for your use case:

AcousticModelAlternateEnglish1.bundle
AcousticModelAlternateEnglish2.bundle

GPL models

The GPL is generally understood to be incompatible with the App Store, so the GPL models are provided to support academic research only – Politepix will not provide support to projects which appear to be using GPL models against their license terms.

AcousticModelDutch.bundle
AcousticModelItalian.bundle


There is also a GPL alternative Spanish model:

AcousticModelAlternateSpanish.bundle


Instructions

How to install: unzip the download, drag the .bundle file into your project, and make sure to check the boxes for the targets you wish to add it to. Then reference it within your project exactly as you would reference the English model, e.g. change all occurrences of [OEAcousticModel pathToModel:@"AcousticModelEnglish"] to [OEAcousticModel pathToModel:@"AcousticModelChinese"]. For more information on the use of OEAcousticModel pathToModel:, check out the docs, tutorial, and sample app.
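As a minimal sketch of the change described above, assuming the Chinese bundle has already been added to the app target and that OpenEars is installed as a framework so its headers can be imported as shown: the helper name NonEnglishAcousticModelPath is hypothetical and exists only for illustration, while the pathToModel: call and both bundle names come from the instructions above.

```objc
#import <Foundation/Foundation.h>
#import <OpenEars/OEAcousticModel.h> // assumes the standard OpenEars framework install

// Hypothetical helper (not part of OpenEars): returns the path of the
// acoustic model bundle to pass wherever the English model path was used.
static NSString *NonEnglishAcousticModelPath(void) {
    // Previously: [OEAcousticModel pathToModel:@"AcousticModelEnglish"]
    return [OEAcousticModel pathToModel:@"AcousticModelChinese"];
}
```

The same one-line substitution should apply to any of the other bundles listed on this page; only the name passed to pathToModel: changes.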

Please read! To get good recognition results, you need to find the ideal vadThreshold setting for each non-English model through some experimentation, because OpenEars ships with the standard setting for the English model. The vadThreshold setting controls the cutoff level between speech and non-speech when listening, so the wrong setting can result in too much incidental noise being processed as speech, or too much real speech being ignored. For this reason, if you don’t test and change the vadThreshold setting to one appropriate for your app, recognition quality will be impaired. A good starting value for a non-English model is 3.5. When experimenting, it’s recommended to increase or decrease vadThreshold by only 0.1, or perhaps 0.5, at a time. Set it a bit higher to reject more unwanted speech, and set it lower to process sounds more readily and attempt to detect speech within them. Recognition may also be improved by setting OEPocketsphinxController’s use8kMode property to true, after activating the shared object and before starting listening.
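The order of operations described in the note can be sketched roughly as follows. This is an illustration under assumptions, not a definitive setup: the helper name ConfigureControllerForNonEnglishModel is hypothetical, vadThreshold is assumed to be settable as a property in the same way as use8kMode, and 3.5 is only the suggested starting value, not a tuned one. Check the docs for the exact declarations in your OpenEars version.

```objc
#import <Foundation/Foundation.h>
#import <OpenEars/OEPocketsphinxController.h> // assumes the standard OpenEars framework install

// Hypothetical helper (not part of OpenEars): call it before starting listening.
static void ConfigureControllerForNonEnglishModel(void) {
    OEPocketsphinxController *controller = [OEPocketsphinxController sharedInstance];

    // Activate the shared object first, before setting any of its values.
    [controller setActive:TRUE error:nil];

    // Suggested starting point for a non-English model; adjust in small
    // steps (about 0.1 at a time) while testing with real speech.
    // Assumption: vadThreshold is exposed as a settable property.
    controller.vadThreshold = 3.5;

    // Optional: try 8k mode and compare recognition results with and without it.
    controller.use8kMode = TRUE;
}
```

Keeping these settings in one place makes it easier to change the threshold between test runs without touching the rest of the listening code.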