Frequently Asked Questions/Support
If you have trouble with some aspect of using OpenEars™ and you have carefully re-read the documents and examined the example app without it helping, you can ask a question in the OpenEars forum. Please turn on OELogging and, if the issue relates to recognition, OEPocketsphinxController’s verbosePocketsphinx property before posting an issue, so I have some information to troubleshoot from. The forum is a place to ask questions free of charge, but free private email support is not given for OpenEars. However, you can purchase a support incident or contract if you would like to discuss a question via private email.
Table of Contents
- Frequently Asked Questions/Support
- Q: Do OpenEars and its plugins support Swift?
- Q: Where are the changelogs for OpenEars™ and its plugins?
- Q: How do I update a demo plugin? How do I update a registered plugin?
- Q: I’d like to recognize exact phrases or exact words, or define a rules-based grammar for recognition. Can I do this with OpenEars or the plugins?
- Q: OpenEars recognizes noises or random spoken words as words in my vocabulary, and I want to reduce this.
- Q: I’m trying to use a non-English acoustic model and recognition results are mixed out of the box.
- Q: I followed the tutorial and I’m sure that I did every step, but I’m getting an error similar to “‘Slt/Slt.h’ file not found”.
- Q: I just tried the tutorial and OEPocketsphinxController didn’t understand the words that I said.
- Q: But I want to write an app that uses different words from the ones in the sample app.
- Q: I’m trying to use a sound framework like Finch or another OpenAL wrapper and things aren’t working as expected.
- Q: I’m not using an audio framework as in the previous question, but I am getting strange results that aren’t being reported elsewhere.
- Q: I have a bluetooth device that is giving unexpected results.
- Q: When I license an OpenEars plugin (not OpenEars, one of its plugins), the license is for one app. Does that mean I need a license for each app user?
- Q: My app crashes when listening starts
- Q: There is a bug on the Simulator/recognition isn’t good on the Simulator
- Q: If I purchase RapidEars, will OpenEars be able to recognize anything a user says?
- Q: I’m using RapidEars or OpenEars with an acoustic model that I made or downloaded elsewhere and I’m getting the following unexpected results…
- Q: I’m getting a linker error with RapidEars, NeatSpeech, Rejecto, SaveThatWave, or another plugin — what should I do?
- Q: I have tried a fix for a known issue which others have been able to solve definitively, but it doesn’t work for me.
- Q: What license does OpenEars use?
- Q: So I can use this in commercial, closed-source apps?
- Q: Can I or should I reference OpenEars in my support/marketing/etc materials?
- Q: How can I trim down the size of the final binary for distribution?
- Q: But the framework is very large and I don’t want all of that file size added to my app
- Q: But after I added OpenEars my app size got much larger.
- Q: I thought that this version of OpenEars supported the -all_load linker flag, but I’m getting a duplicate symbol error when I use OpenEars with the flag enabled.
- Q: Have any apps ever been rejected for using OpenEars?
- Q: I still have a question, how do I get more support?
- Q: What kinds of questions can I ask in the forums?
- Q: Why was a question, or a reply, or an account removed from the forums?
- Q: Can I hire you to create an OpenEars-enabled app for me or adapt OpenEars, or consult on a speech project using OpenEars?
- Q: Anything else?
Q: Do OpenEars and its plugins support Swift?
A: Yes. According to all available information, OpenEars and its plugins work well in Swift projects via a bridging header, and some pointers on that can be found in the forums by searching for the keyword “Swift”. This project took a conservative position on dedicating documentation and forum support resources to Swift 1 and 2 while their syntax and tooling shook out. That was only for the practical and unavoidable reason that support resources are in limited supply and most of the shipped apps Politepix gives support for are Objective-C apps. Politepix is on schedule to officially offer equal support for Swift 3 at Swift 3’s release time and is greatly looking forward to it.
Q: Where are the changelogs for OpenEars™ and its plugins?
A: There is a unified changelog for OpenEars and all of its plugins here, which offers a couple of different options to subscribe so you can always stay up to date.
Q: How do I update a demo plugin? How do I update a registered plugin?
A: To update a demo plugin, download the new version using the link you were originally sent in your demo request email. To update a registered plugin, visit the licensee site at the link sent to you via email when you first purchased your registered plugin.
Q: I’d like to recognize exact phrases or exact words, or define a rules-based grammar for recognition. Can I do this with OpenEars or the plugins?
A: Yes, you can do this with regular OpenEars using the API for dynamically generating rules-based grammars at runtime, which is the best way to identify fixed phrases with the words in a certain order. If you need grammars which have faster response times than JSGF or which are compatible with RapidEars, you can also try the plugin RuleORama, which uses the same API to output a format that is as fast to recognize as OpenEars’ language models and is compatible with RapidEars.
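As a minimal sketch of what a dynamically generated grammar can look like, assuming the OpenEars 2.x Objective-C API (generateGrammarFromDictionary:withFilesNamed:forAcousticModelAtPath: on OELanguageModelGenerator, plus grammar keys such as ThisWillBeSaidOnce and OneOfTheseWillBeSaidOnce). The helper name, file name and phrases are made up for illustration, so please check every name against the headers of the version you have installed:

```objc
#import <Foundation/Foundation.h>
#import <OpenEars/OELanguageModelGenerator.h>
#import <OpenEars/OEAcousticModel.h>
#import <OpenEars/OEPocketsphinxController.h>

// Hypothetical helper: builds a small rules-based grammar and starts listening for it.
static void StartListeningForFixedPhrases(void) {
    OELanguageModelGenerator *generator = [[OELanguageModelGenerator alloc] init];
    NSString *name = @"MyGrammarFiles"; // Placeholder file name.
    NSString *acousticModelPath = [OEAcousticModel pathToModel:@"AcousticModelEnglish"];

    // One greeting followed by one command, in exactly this order.
    // Assumption: the grammar keys are visible via the OELanguageModelGenerator header.
    NSDictionary *grammar = @{
        ThisWillBeSaidOnce : @[
            @{ OneOfTheseWillBeSaidOnce : @[@"HELLO COMPUTER", @"GREETINGS ROBOT"] },
            @{ OneOfTheseWillBeSaidOnce : @[@"GO", @"STOP"] }
        ]
    };

    // Assumption: a nil return value means success, as in OpenEars 2.x.
    NSError *error = [generator generateGrammarFromDictionary:grammar
                                               withFilesNamed:name
                                       forAcousticModelAtPath:acousticModelPath];
    if (error != nil) {
        NSLog(@"Grammar generation failed: %@", [error localizedDescription]);
        return;
    }

    NSString *grammarPath = [generator pathToSuccessfullyGeneratedGrammarWithRequestedName:name];
    NSString *dictionaryPath = [generator pathToSuccessfullyGeneratedDictionaryWithRequestedName:name];

    [[OEPocketsphinxController sharedInstance] setActive:TRUE error:nil];
    // languageModelIsJSGF is TRUE because a grammar is used rather than a language model.
    [[OEPocketsphinxController sharedInstance] startListeningWithLanguageModelAtPath:grammarPath
                                                                     dictionaryAtPath:dictionaryPath
                                                                  acousticModelAtPath:acousticModelPath
                                                                  languageModelIsJSGF:TRUE];
}
```

With a grammar like this one, “HELLO COMPUTER GO” fits the rules, while “GO HELLO COMPUTER” does not, because the word order is part of the rule.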
Q: OpenEars recognizes noises or random spoken words as words in my vocabulary, and I want to reduce this.
A: Rejecto is designed to deal with this issue (it is called the out-of-vocabulary problem) which affects all speech recognition with optimized smaller language models. Before trying Rejecto out, please make sure you aren’t testing on the Simulator since the issue with noises being recognized as speech is much worse on the Simulator than a real device that users will use. Across-the-board noise reduction can be achieved by increasing the value of vadThreshold – please read the next FAQ entry for more on this, even if you are using the English acoustic model.
Q: I’m trying to use a non-English acoustic model and recognition results are mixed out of the box.
A: OpenEars ships with the standard vadThreshold setting for the English model, so in order to have good recognition results with a non-English acoustic model it is necessary to find that model’s ideal vadThreshold setting via some experimentation. The vadThreshold setting controls the cutoff level between speech and non-speech when listening: with too low a value, too much incidental noise will be processed as if it were speech, and with too high a value, real speech can be ignored. For this reason, if you don’t test and change the vadThreshold setting to one appropriate for your app, recognition quality will be impaired. When experimenting, it’s recommended to increase or decrease vadThreshold by only 0.1 or maybe 0.5 at a time. Set it a bit higher to reject more unwanted speech, and set it lower to process sounds more readily and attempt to detect speech within them. Find the right setting here before adding Rejecto to a project, since Rejecto is intended to refine these results. English-language projects will also benefit from some testing to find an ideal vadThreshold level.
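One round of that experimentation might look roughly like the following sketch. It assumes the OpenEars 2.x Objective-C API, in which vadThreshold is a float property of the shared OEPocketsphinxController instance. The helper name is hypothetical, and no particular starting value is implied here, since the right value depends on the acoustic model and the app:

```objc
#import <Foundation/Foundation.h>
#import <OpenEars/OEPocketsphinxController.h>

// Hypothetical helper for one round of vadThreshold experimentation.
// Pass in the previously tested value plus or minus a small step, then re-test on a real device.
static void ApplyVadThreshold(float candidate) {
    OEPocketsphinxController *controller = [OEPocketsphinxController sharedInstance];
    // Assumption: the controller should be made active before its properties are changed.
    [controller setActive:TRUE error:nil];
    controller.vadThreshold = candidate;
    NSLog(@"Testing vadThreshold %.1f", candidate);
    // Start listening after this point so that the new value applies to the session.
}
```

If a test session with some value lets too much background noise through, the next round could call ApplyVadThreshold with that value plus 0.1f. If real speech is being ignored, try the value minus 0.1f instead.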
Q: I followed the tutorial and I’m sure that I did every step, but I’m getting an error similar to “‘Slt/Slt.h’ file not found”.
A: If you are using Xcode 5 with a build number of 5A1413 or later, it has a bug which results in frameworks linked by reference being changed to link at incorrect URL paths, so when you add the frameworks it is necessary for you to also check the box that says “Copy items into destination group’s folder (if needed)”, or you may receive errors that header files can’t be found in frameworks which were already added. If the issues persist, take a look at what is found in your Framework Search Paths build setting for the app, since it is this entry that is being changed into a non-working URL when frameworks are added. I hope this bug is fixed soon.
Q: I just tried the tutorial and OEPocketsphinxController didn’t understand the words that I said.
A: 95% of the time, this is either because you were saying words which aren’t in the vocabulary that OEPocketsphinxController is listening for, meaning that it doesn’t have a way of recognizing those words, or you are testing recognition on the Simulator. Take a look at which words the app is listening for and test recognition of those words, and make sure to test on a real device. It can also be very damaging to recognition accuracy to have a misspelled word in your vocabulary array, since OELanguageModelGenerator will not be able to successfully look it up in the pronunciation dictionary and will have to make its best guess, which may be different from what you or a user is saying to the device.
Q: But I want to write an app that uses different words from the ones in the sample app.
A: OELanguageModelGenerator is the class in OpenEars which lets you define which words to listen for. OpenEars works by creating a specific vocabulary to listen for. The tutorial explains how to create your own vocabulary and there are also examples of creating custom vocabularies in the sample app.
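As a rough sketch of defining a custom vocabulary, assuming the OpenEars 2.x Objective-C API as it appears in the tutorial. The helper name, the words and the file name below are placeholders, and the tutorial for your installed version remains the authoritative reference:

```objc
#import <Foundation/Foundation.h>
#import <OpenEars/OELanguageModelGenerator.h>
#import <OpenEars/OEAcousticModel.h>
#import <OpenEars/OEPocketsphinxController.h>

// Hypothetical helper: generates a language model for a custom vocabulary and listens for it.
static void StartListeningForCustomVocabulary(void) {
    OELanguageModelGenerator *generator = [[OELanguageModelGenerator alloc] init];

    // Placeholder vocabulary. Check the spelling carefully, since a misspelled word
    // cannot be looked up in the pronunciation dictionary and will be guessed at.
    NSArray *words = @[@"LEFT", @"RIGHT", @"FORWARD", @"TURN AROUND"];
    NSString *name = @"MyLanguageModelFiles"; // Placeholder file name.
    NSString *acousticModelPath = [OEAcousticModel pathToModel:@"AcousticModelEnglish"];

    // Assumption: a nil return value means success, as in OpenEars 2.x.
    NSError *error = [generator generateLanguageModelFromArray:words
                                                withFilesNamed:name
                                        forAcousticModelAtPath:acousticModelPath];
    if (error != nil) {
        NSLog(@"Language model generation failed: %@", [error localizedDescription]);
        return;
    }

    NSString *lmPath = [generator pathToSuccessfullyGeneratedLanguageModelWithRequestedName:name];
    NSString *dicPath = [generator pathToSuccessfullyGeneratedDictionaryWithRequestedName:name];

    [[OEPocketsphinxController sharedInstance] setActive:TRUE error:nil];
    // languageModelIsJSGF is FALSE because this is a language model, not a grammar.
    [[OEPocketsphinxController sharedInstance] startListeningWithLanguageModelAtPath:lmPath
                                                                     dictionaryAtPath:dicPath
                                                                  acousticModelAtPath:acousticModelPath
                                                                  languageModelIsJSGF:FALSE];
}
```

Only the words and phrases in the array can be recognized, so test by saying exactly those words on a real device.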
Q: I’m trying to use a sound framework like Finch or another OpenAL wrapper and things aren’t working as expected.
A: OEPocketsphinxController has very specific audio session and audio unit requirements and it can’t be run simultaneously with another framework which requires control over the audio session and audio input. I’m not aware of any other frameworks with specific audio session requirements which are able to fulfill their audio functionality while another audio framework with different session requirements is running simultaneously.
Q: I’m not using an audio framework as in the previous question, but I am getting strange results that aren’t being reported elsewhere.
A: This is likely to be for the same reasons as described in the previous section – something in the app is making changes to the audio session or audio settings in a way that conflicts with the settings that OpenEars needs to be able to rely on and manage itself. This often happens due to using a framework where it isn’t obvious that it makes audio session changes (I believe Unity does this, as do other speech recognition SDKs) and it can also happen as a result of copy/pasting Cocoa audio SDK code which includes an extraneous AVAudioSession call (for whatever reason, there was a fad for Stack Overflow answers about audio code to start with AVAudioSession calls without a clear need for it so there is a misconception that they should prepend any audio code). If you are getting unusual audio results, the most important troubleshooting step is to search your app for any calls to AVAudioSession or low-level “audiosession” and turn them off, and to remove third-party SDKs that have any relationship to audio processing (i.e. another speech SDK, a game development platform, etc). Even if the code you are turning off is needed by your app, it is important to know when seeking support here that you are seeking help with an audio coexistence conflict, and not to report the behavior as a bug without sharing the important information that it is happening as a result of an audio coexistence conflict that is not expected to work.
Q: I have a bluetooth device that is giving unexpected results.
A: Unfortunately, different bluetooth devices do seem to have different levels of compatibility with Cocoa audio APIs, although it is meant to be a standard. Some devices aren’t compatible with non-Apple apps at all, while others can do playback with 3rd-party apps but not recording; this may come down to buffering behavior in the hardware. This is the entire reason that bluetooth support in OpenEars is marked as experimental.
In order to first test whether there should be any expectation of your device working with OpenEars, verify that it can do low-latency recording with any third party app – the best way to check this is to see whether it works with a non-Apple app that does realtime voice chat or voice calls. If you are sure that low-latency recording works with some apps, it isn’t possible to do troubleshooting of issues via the forums since it involves unknown hardware, but you can give Politepix one of the devices (either your own example of the device sent by post, or a new or used one via Amazon – you can use the contact form for more info), and Politepix will take a look at whether there is something straightforward that can be done in order to expand support (this is unfortunately neither a commitment to support the device, nor a commitment to invest a long period of troubleshooting in the specific device, but it will definitely be looked at and tested with the goal of finding out why it doesn’t work or getting it to work if that is straightforward).
Given the huge variety of devices, their different behavior, their expense, and the fact that previously verified-to-be-working hardware may change behavior with different hardware or software versions or iOS version changes, it is not prudent for Politepix to attempt to maintain its own testbed of bluetooth devices, and it is unfortunately also not possible to commit to in-depth troubleshooting with the goal of supporting every device or any one device. This offer is based on time availability so it may not always be possible to undertake, and is intended to refer to a single hardware example at a time from an independent app developer/producer/company which makes apps.
Q: When I license an OpenEars plugin (not OpenEars, one of its plugins), the license is for one app. Does that mean I need a license for each app user?
A: No, the license is for the app itself, so you need one license for one listing in the App Store. No matter how many users your app gets, it’s just one license needed, and Politepix hopes you get a whole lot.
Q: My app crashes when listening starts
A: The exact reason for this will always be reported by logging once you turn on OELogging and verbosePocketsphinx. Both are described in the documentation. If the logging doesn’t explain it clearly enough to fix, it is fine to show the complete logging in a question in the forums and ask for help.
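Turning on both kinds of logging could look like this minimal sketch, assuming the OpenEars 2.x Objective-C class names (OELogging and OEPocketsphinxController). The helper name is hypothetical, and the documentation for your installed version has the final word on the exact calls:

```objc
#import <Foundation/Foundation.h>
#import <OpenEars/OELogging.h>
#import <OpenEars/OEPocketsphinxController.h>

// Hypothetical helper: call once, before listening starts, so that the complete
// session is captured in the console output.
static void EnableOpenEarsDiagnostics(void) {
    [OELogging startOpenEarsLogging]; // General OpenEars logging.
    // Assumption: the controller should be made active before its properties are changed.
    [[OEPocketsphinxController sharedInstance] setActive:TRUE error:nil];
    [OEPocketsphinxController sharedInstance].verbosePocketsphinx = TRUE; // Verbose recognition logging.
}
```

When asking for help, copy the console output from the very start of the app session to the point of the crash, rather than an excerpt.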
Q: There is a bug on the Simulator/recognition isn’t good on the Simulator
A: OpenEars has a low-latency audio driver written using the Audio Unit API which requires an Audio Session setting in order to work, and it isn’t supported by the Simulator. Because it can be slow to debug app logic without using the Simulator, OpenEars has a fallback audio approach that is compatible with the Simulator. However, it isn’t as good as the device approach and very little time has been spent trying to debug it since it is only provided as a nicety. With that understanding, please don’t evaluate OpenEars’ accuracy or behavior based on the Simulator, since it uses a completely different audio driver, and please don’t report Simulator-only bugs since there’s no way to fairly allocate resources towards fixing Simulator-only audio issues when no users run apps on the Simulator.
Q: If I purchase RapidEars, will OpenEars be able to recognize anything a user says?
A: No, RapidEars does exactly the same small-vocabulary offline recognition that OpenEars does, but it does it in realtime on speech that is still in-progress rather than having to wait for the user to pause for a second before beginning to do recognition. That’s pretty cool, actually! Both OpenEars and RapidEars are recommended for use with vocabularies that are smaller than 1000 words.
Q: I’m using RapidEars or OpenEars with an acoustic model that I made or downloaded elsewhere and I’m getting the following unexpected results…
A: Politepix can only support the acoustic models that it ships, since it can only test against these models.
Q: I’m getting a linker error with RapidEars, NeatSpeech, Rejecto, SaveThatWave, or another plugin. What should I do?
A: It is necessary to set the linker flag -ObjC for your target when using the plugins, and it is equally necessary that the linker flag -all_load is not set or found anywhere in your project (whether using the plugins or not). If this isn’t the issue, it is otherwise always due to the fact that the plugin requires a certain version of OpenEars or later, and you are using an old version of OpenEars, or an earlier version is still somehow linked to your project, so update to the current version of OpenEars. In the case of NeatSpeech, it is also necessary to give extra attention to this step from the instructions: “For the last step, change the name of the implementation source file in which you are going to call NeatSpeech methods from .m to .mm (for instance, if the implementation is named ViewController.m, change its name to ViewController.mm and verify in the Finder that the name of the file has changed) and then make sure that in your target Build Settings, under the section “C++ Standard Library”, the setting “libstdc++ (Gnu C++ standard library)” is selected. If you receive errors like “Undefined symbols for architecture i386: std::basic_ios<char, std::char_traits<char> >::widen(char) const”, that means that this step needs special attention.”
Q: I have tried a fix for a known issue which others have been able to solve definitively, but it doesn’t work for me.
A: This is often solved by cleaning your project before testing again.
Q: What license does OpenEars use?
A: There are actually five libraries used by OpenEars-enabled projects, only one of which is the OpenEars framework, and you can see the license (which is very liberal) for CMU Pocketsphinx, CMU Sphinxbase, CMU Flite and CMUCLMTK here. You need to observe the terms of those licenses in your app as well as the OpenEars one, which shouldn’t be difficult since they are commercial-friendly licenses.
OpenEars is licensed under the Politepix Public License version 1.0. It gives you the right to use OpenEars to make apps for the App Store. You have some obligations (such as crediting the libraries involved, including OpenEars, either in your app or on its web page) so please read the license.
Q: So I can use this in commercial, closed-source apps?
Q: Can I or should I reference OpenEars in my support/marketing/etc materials?
A: I’d love it if you want to talk about OpenEars in your marketing! If you want to discuss it in your support documents, just please do so in a way that it doesn’t cause any confusion for your endusers about where to seek support (i.e. it must be clear that you are responsible for supporting your app) and it doesn’t imply an endorsement of your app by Politepix or any of the maintainers of the libraries that OpenEars links to (unless one of those parties does actively endorse your project!).
Q: How can I trim down the size of the final binary for distribution?
A: There are instructions on doing this here.
Q: But the framework is very large and I don’t want all of that file size added to my app
A: The framework is not added to your app. It is a static framework and only the parts of its code you link to are added to your app in binary form. The size of the framework is not related to the eventual size of your app; it only represents the overall size of all of the source that is inside of the framework. Not all of the source in the framework is even available via OpenEars’ API, so there is no scenario in which it is possible for your app to link to all of the code in the framework and cause the size of the part of your app binary which addresses the code in the framework to become as large as the framework.
Q: But after I added OpenEars my app size got much larger.
A: If you have multiple architectures and you have bitcode on, this will be the result for you when you link to any significantly complex framework; you may only be linking to 12MB of compiled code but it is multiplied several times in the archive created due to all of the slices present. This is an app architecture issue on our platform, but to be honest, it isn’t a very important one in an era in which even a single photo taken by an iPhone is 3-12MB in size. Rather than extensively try to optimize the size of the app performing offline recognition in order to save the size of 1-4 photos on the phone, it is probably a healthier perspective to compare it to the data that will be saved from going over the network, which would exceed the amount added to your app in a relatively brief period of active use.
Q: I thought that this version of OpenEars supported the -all_load linker flag, but I’m getting a duplicate symbol error when I use OpenEars with the flag enabled.
A: Starting with OpenEars and plugins version 1.64, the -all_load linker flag is no longer supported and using it will prevent building. Any use of -all_load that another library requires can be substituted with -force_load and a reference to that library only, and this has been the case since early versions of Xcode 4, so there is no longer any reason at all to use -all_load.
Q: Have any apps ever been rejected for using OpenEars?
A: I have never heard of any apps being rejected for using OpenEars, and I wouldn’t expect them to be since I’ve taken care to make sure OpenEars doesn’t do anything questionable, and where I’ve had any questions I’ve just written Apple and asked them for guidance directly. There is a very long list of apps that were (unsurprisingly) accepted that used OpenEars so it is fine to use OpenEars. I have heard of two apps in the last three years being rejected that linked to OpenEars, but they were not rejected because they linked to OpenEars or because of anything related to OpenEars, but because of other details of the apps that did not originate with OpenEars.
Something that is quite important as of iOS 7 for easy, painless app acceptance is that when you obtain a device capability permission, it is necessary to make it clear to the user what the permission is being used for — there can’t be any “stealth” usage of a device capability happening without it being transparent to the user. This is a great, positive development since we want to be building a user-respecting, forthright platform where users have a basis for trusting their apps. What that means in practice is if you perform speech recognition, and the user is asked to give microphone permission, there has to be some kind of explanation or indication in the app UI or description or introductory text that speech recognition is performed in the app. If you ask for mic permission and then perform speech recognition but there is nothing in the UI that would indicate that recognition is being performed, Apple will probably ask you to improve that so that the user knows what the mic stream is used for. OpenEars gives you UI hooks such as the decibel levels of incoming and outgoing speech so that it is easy for you to build a UI, but it isn’t a UI framework, so questions like how to best show the user what is being done with the mic stream are outside of the support that is given here, but I wanted to mention that this is something that you need to consider for your app now that there is a permission system and an Apple UI guideline for use of capabilities with permission.
Q: I still have a question, how do I get more support?
A: You can always ask for help in the forums and I’ll do my best to answer your question. Please turn on OELogging and (if the issue relates to recognition) OEPocketsphinxController’s verbosePocketsphinx property before posting an issue so I have some information to troubleshoot from. Free private email support is not given for OpenEars, but you can purchase a support incident if you would like to discuss a question via private email. Forum support is free. Other emails regarding OpenEars (i.e. not support requests) can be sent via the contact form.
Q: What kinds of questions can I ask in the forums?
A: Questions about using the APIs available in OpenEars and its plugins to implement speech recognition or speech synthesis in native Objective-C apps in ways that are currently possible with their use (or questions about whether an application is currently possible), in order to get the best results with OpenEars and its plugins as they currently function. Sorry, it isn’t possible to help with the development of novel software or hardware techniques in the course of tech support for the OpenEars SDK. Relatedly, just like with the compiled software products of other firms, tech support doesn’t supply information about how the binary products were implemented and/or their internal implementation details.
Q: Why was a question, or a reply, or an account removed from the forums?
A: Posts and replies might be closed or removed if they are off-topic, too unclear to be able to help with, lacking information required for debugging after it has been requested a couple of times, or bringing an unconstructive tone to the troubleshooting process. Accounts will be removed if their posts habitually have these characteristics.
Q: Can I hire you to create an OpenEars-enabled app for me or adapt OpenEars, or consult on a speech project using OpenEars?
A: Sorry, Politepix does not do any consulting or contracting.
Q: Anything else?