February 26, 2015 at 8:16 pm #1024998
I’m stress-testing my app to see if it can handle a noisy environment on an iPad. I’m finding that when I have more than one long period of ‘listening’ back to back (listening defined as the time from when the recognizer detects speech until it detects a period of silence), the recognizer seems to stop working altogether. I have to stop and start it to get it working again. It also results in memory warnings. Is there anything I can do to prevent this?
February 26, 2015 at 8:17 pm #1024999
And by long, I’m talking about over 20 seconds of just noise when it thinks someone is speaking.
February 26, 2015 at 8:20 pm #1025000
This is the first report I’ve received of this so I would need a bit more info to investigate it further. For further troubleshooting, the next step would be for you to create a minimal replication case to share with me, so I can see the exact thing you are seeing in your local setup:
Would that be possible?
February 26, 2015 at 8:27 pm #1025001
Dang, I was hoping for an easy answer. I’ll do my best to see if I can get that to you.
February 26, 2015 at 8:37 pm #1025002
Thanks!
February 26, 2015 at 8:40 pm #1025003
Meh, I can’t get it to fail using the sample app. If/when I do, I’ll send it. Thanks for responding.
February 26, 2015 at 8:45 pm #1025004
OK, that’s part of the purpose of the replication case – it can also indicate when it is an interaction with a different part of the app, so you can find out whether the troubleshooting should be directed at an interaction within the app rather than something related to the library. Do you have other audio (or video) objects operating at the same time?
February 26, 2015 at 8:54 pm #1025005
Yea I have all kinds of stuff going on, but not when the recognizer is active. I definitely narrowed the problem down to the recognizer itself but I also can’t get it to fail every time, even in my app. Overall it works great, but occasionally it will totally crash out on me. I’m going to add some timeouts and features to detect excessive noise and leave it at that until I can pinpoint the situation where it happens.
February 26, 2015 at 8:58 pm #1025006
OK, I’ll take a replication case whenever you have one for me.
February 26, 2015 at 9:07 pm #1025007
In case you were looking for some unsolicited advice, a paid plugin that is able to detect when the recognizer is listening but the user isn’t trying to speak to the device would be very useful.
February 26, 2015 at 9:10 pm #1025008
Sure, for what UI/UX purpose?
February 26, 2015 at 9:28 pm #1025009
My app uses speech recognition to have a conversation with a simulated person for training purposes, like training a person how to interview for a job. The user has possibly dozens of prompts to choose from, and I break each one into separate words and put each word into the language model. Then when the recognizer returns, I analyze the results and figure out which prompt they were actually trying to say. I did it this way (as opposed to entering each prompt in its whole form into the model) for lots of reasons, the main one being that people often don’t read exactly what’s on the screen. It works great.
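As an illustration of the approach described above, here is a minimal sketch of splitting prompts into a word list and then mapping a returned hypothesis back to the most likely prompt. It is not the poster's actual code: the function names, the two extra prompts, the word-overlap (Dice) scoring, and the `min_score` cutoff of 0.5 are all assumptions chosen for the example, and it is written in Python rather than the app's own language.

```python
import re
from collections import Counter

# Sample prompts. The first comes from this thread; the other two are hypothetical.
PROMPTS = [
    "Hello Molly, how are you doing today?",
    "Tell me about your last job",
    "Why do you want to work here?",
]


def tokenize(text):
    """Uppercase the text and split it into words, dropping punctuation."""
    return re.findall(r"[A-Z']+", text.upper())


def build_vocabulary(prompts):
    """Unique words across all prompts, e.g. as input for a language model."""
    return sorted({word for prompt in prompts for word in tokenize(prompt)})


def best_prompt(hypothesis, prompts, min_score=0.5):
    """Return the prompt sharing the most words with the hypothesis, or None.

    Word order is ignored, so a reordered or partial reading can still match.
    The score is a Dice coefficient over word counts (an assumed choice).
    """
    heard = Counter(tokenize(hypothesis))
    best, best_score = None, 0.0
    for prompt in prompts:
        wanted = Counter(tokenize(prompt))
        total = sum(heard.values()) + sum(wanted.values())
        if total == 0:
            continue  # nothing to compare
        overlap = sum((heard & wanted).values())
        score = 2 * overlap / total
        if score > best_score:
            best, best_score = prompt, score
    return best if best_score >= min_score else None
```

Because the comparison ignores word order, a hypothesis with swapped or dropped words can still land on the intended prompt, while a hypothesis that shares only a stray word with every prompt falls below the cutoff and is rejected.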
This means that there are a lot of possible things the recognizer can return as a hypothesis. Since it’s a training tool, there is often more than one person using it at once. It’s also common for it to be used in a room with lots of other people talking. It would be nice if there was, for example, an input level threshold for when the recognizer thinks it’s being spoken to, so it can tell whether the user is speaking right at the device or whether it’s trying to listen to someone across the room.
Or if there are a lot of words spoken that aren’t in the language model, interspersed between words that ARE in the model, it will know that.
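The input level threshold idea could look roughly like the sketch below. It is only an illustration of the concept, not an OpenEars feature: it assumes you already have a buffer of 16-bit PCM samples for the utterance, and the function names and the -30 dBFS default are hypothetical values that would need tuning per device and room.

```python
import math


def level_dbfs(samples):
    """RMS level of 16-bit PCM samples, in dB relative to full scale."""
    if not samples:
        return float("-inf")
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    if rms == 0:
        return float("-inf")
    return 20 * math.log10(rms / 32768.0)


def is_addressed_to_device(samples, threshold_dbfs=-30.0):
    """True if the utterance is loud enough to count as spoken at the device.

    threshold_dbfs is an assumed placeholder, not a recommended value.
    """
    return level_dbfs(samples) >= threshold_dbfs
```

A hypothesis whose audio falls below the threshold could then be discarded as background conversation instead of being matched against the prompts.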
I tried Rejecto, but it was too strict and it wouldn’t return recognition events when it should have.
February 26, 2015 at 9:36 pm #1025010
Did you know that you can turn down Rejecto’s weighting so it’s a bit less aggressive? Other options which don’t involve having to use a paid plugin necessarily are raising the vadThreshold value (this will exclude more incidental sounds) or using a grammar instead of a language model, which will only allow recognition on utterances that fit the grammar rules:
This can be done with either stock OpenEars, or RuleORama for much faster grammars. Maybe one of these options can help you with getting a better user experience for your current app.
February 26, 2015 at 9:50 pm #1025011
Oh, I did not know about the vadThreshold. I will definitely play around with that.
I did play around with Rejecto’s weighting but I could never get it quite right. Since I’m doing a lot of processing on my end to determine whether a user is actually speaking to the software, I felt more comfortable having control than not receiving the event at all.
I’d love to use grammar rules, but, like I said, people rarely speak exactly what’s on the screen. So if the prompt is this:
Hello Molly, how are you doing today?
Users might actually say this (this happens way more often than you might think):
Molly Hello, how you doing today?
If I were to use grammar rules, this wouldn’t be recognized.
Thanks for the tip on the vadThreshold, that might be just what I need.
February 26, 2015 at 9:55 pm #1025012
You’re welcome, I hope it’s a helpful addition to your toolkit.
February 26, 2015 at 10:01 pm #1025013
Yea, that worked great. I’m really surprised you don’t charge at all for OpenEars and the support you give on these forums. Do you have a ‘donate’ button anywhere?
February 27, 2015 at 9:47 am #1025053
I’m glad that helped! No worries – other people with similar questions will find this discussion or I can point them to it, so it provides a useful support resource, and it also might be the way they find out about a plugin that solves a problem for them, so it’s fine. If you want to do something nice, it’s always very helpful to get a shoutout here or there (Twitter, blog posts, whatever) so folks know that you’re having a good experience with the SDK.