Recognizer must be restarted after long utterances

Viewing 17 posts - 1 through 17 (of 17 total)

    #1024998
    jugg1es
    Participant

    I’m stress-testing my app to see if it can handle a noisy environment on an iPad. I’m finding that when I have more than one long period of ‘listening’ in a row (listening defined as the time from when the recognizer detects speech until it detects a period of silence), the recognizer seems to stop working altogether. I have to stop and restart it to get it working again. It also results in memory warnings. Is there anything I can do to prevent this?

    #1024999
    jugg1es
    Participant

    And by long, I’m talking about over 20 seconds of just noise during which it thinks someone is speaking.

    #1025000
    Halle Winkler
    Politepix

    Hello,

    This is the first report I’ve received of this so I would need a bit more info to investigate it further. For further troubleshooting, the next step would be for you to create a minimal replication case to share with me, so I can see the exact thing you are seeing in your local setup:

    https://www.politepix.com/forums/topic/how-to-create-a-minimal-case-for-replication/

    Would that be possible?

    #1025001
    jugg1es
    Participant

    Dang, I was hoping for an easy answer. I’ll do my best to see if I can get that to you.

    #1025002
    Halle Winkler
    Politepix

    Thanks!

    #1025003
    jugg1es
    Participant

    Meh, I can’t get it to fail using the sample app. If/when I do, I’ll send it. Thanks for responding.

    #1025004
    Halle Winkler
    Politepix

    OK, that’s part of the purpose of the replication case – it can also indicate when it is an interaction with a different part of the app, so you can find out whether the troubleshooting should be directed at an interaction within the app rather than something related to the library. Do you have other audio (or video) objects operating at the same time?

    #1025005
    jugg1es
    Participant

    Yea I have all kinds of stuff going on, but not when the recognizer is active. I definitely narrowed the problem down to the recognizer itself but I also can’t get it to fail every time, even in my app. Overall it works great, but occasionally it will totally crash out on me. I’m going to add some timeouts and features to detect excessive noise and leave it at that until I can pinpoint the situation where it happens.
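The timeout idea mentioned above can be sketched in a language-neutral way. This is only an illustration of the logic, written in Python rather than against the OpenEars API: the class name, method names, and the way the app would be told that speech started or ended are all hypothetical, and the 20-second limit is simply the figure quoted earlier in the thread.

```python
import time


class UtteranceWatchdog:
    """Tracks how long the recognizer has been in a 'speech detected' state.

    Hypothetical helper: the app is assumed to call speech_started() and
    speech_ended() from whatever callbacks its recognizer provides.
    """

    def __init__(self, max_seconds=20.0, clock=time.monotonic):
        self.max_seconds = max_seconds
        self._clock = clock        # injectable so the logic can be tested
        self._started_at = None    # None means "not currently listening"

    def speech_started(self):
        self._started_at = self._clock()

    def speech_ended(self):
        self._started_at = None

    def should_restart(self):
        """True once the current utterance has run longer than max_seconds."""
        if self._started_at is None:
            return False
        return self._clock() - self._started_at > self.max_seconds
```

The app would poll `should_restart()` on a timer and, when it returns true, stop and restart listening itself instead of waiting for the recognizer to detect silence.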

    #1025006
    Halle Winkler
    Politepix

    OK, I’ll take a replication case whenever you have one for me.

    #1025007
    jugg1es
    Participant

    In case you were looking for some unsolicited advice, a paid plugin that is able to detect when the recognizer is listening but the user isn’t trying to speak to the device would be very useful.

    #1025008
    Halle Winkler
    Politepix

    Sure, for what UI/UX purpose?

    #1025009
    jugg1es
    Participant

    My app involves using speech recognition to have a conversation with a simulated person, to be used for training purposes, like training a person how to interview for a job. The user has possibly dozens of prompts to choose from, and I break each one into separate words and put each word into the language model. Then when the recognizer returns, I analyze the results and figure out which prompt they were actually trying to say. I did it this way (as opposed to entering each prompt in its whole form into the model) for lots of reasons, the main one being that people often don’t read exactly what’s on the screen. It works great.
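As a rough sketch of the word-splitting step described above (not the poster's actual code, and not tied to any OpenEars call), the prompts can be reduced to a list of unique words before being handed to whatever builds the language model. The function names and the tokenizing rule are assumptions made for illustration.

```python
import re


def words_of(prompt):
    """Lowercase a prompt and split it into words, dropping punctuation."""
    normalized = prompt.lower().replace("\u2019", "'")  # curly to plain apostrophe
    return re.findall(r"[a-z']+", normalized)


def build_vocabulary(prompts):
    """Return the sorted set of unique words across all prompts."""
    vocabulary = set()
    for prompt in prompts:
        vocabulary.update(words_of(prompt))
    return sorted(vocabulary)
```

For example, the prompts "Hello Molly, how are you doing today?" and "How are you?" share three words, so the combined list has seven entries rather than ten.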

    This means that there are a lot of possible things the recognizer can return as a hypothesis. Since it’s a training tool, there is often more than one person using it at once. It’s also common for it to be used in a room with lots of other people talking. It would be nice if there were, for example, an input level threshold for when the recognizer thinks it’s being spoken to, so it can tell whether the user is speaking right at the device or whether it’s trying to listen to someone across the room.

    Or, if there are a lot of words spoken that aren’t in the language model, interspersed between words that ARE in the model, it would be nice if it could detect that.
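The input-level threshold idea can be illustrated with a small level check done in app code. This is a sketch under stated assumptions: it presumes the app can get at a buffer of audio samples normalized to the range -1.0 to 1.0, the function names are hypothetical, and the -30 dBFS threshold is an arbitrary placeholder that would need tuning per device and room.

```python
import math


def rms_dbfs(samples):
    """RMS level of normalized samples in dB relative to full scale."""
    if not samples:
        return float("-inf")
    mean_square = sum(s * s for s in samples) / len(samples)
    if mean_square == 0:
        return float("-inf")  # pure silence
    return 10 * math.log10(mean_square)


def is_near_field(samples, threshold_dbfs=-30.0):
    """Guess whether the speaker is close to the device.

    Assumption: someone speaking right at the microphone produces a
    higher level than someone talking across the room.
    """
    return rms_dbfs(samples) >= threshold_dbfs
```

A hypothesis that arrives while `is_near_field` is false could then be discarded or given less weight when deciding whether the user was addressing the device.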

    I tried Rejecto, but it was too strict and it wouldn’t return recognition events when it should have.

    #1025010
    Halle Winkler
    Politepix

    Did you know that you can turn down Rejecto’s weighting so it’s a bit less aggressive? Other options which don’t involve having to use a paid plugin necessarily are raising the vadThreshold value (this will exclude more incidental sounds) or using a grammar instead of a language model, which will only allow recognition on utterances that fit the grammar rules:

    https://www.politepix.com/2014/04/10/openears-1-7-introducing-dynamic-grammar-generation/

    This can be done with either stock OpenEars, or RuleORama for much faster grammars. Maybe one of these options can help you with getting a better user experience for your current app.

    #1025011
    jugg1es
    Participant

    Oh, I did not know about the vadThreshold. I will definitely play around with that.

    I did play around with Rejecto’s weighting but I could never get it quite right. Since I’m doing a lot of processing on my end to determine whether a user is actually speaking to the software, I felt more comfortable having control than not receiving the event at all.

    I’d love to use grammar rules, but, like I said, people rarely speak exactly what’s on the screen. So if the prompt is this:

    Hello Molly, how are you doing today?

    Users might actually say this (this happens way more often than you might think):

    Molly Hello, how you doing today?

    If I were to use grammar rules, this wouldn’t be recognized.
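To make the point concrete, here is a minimal sketch of order-insensitive matching, which tolerates the reordered and dropped words in the example above. It is one possible approach, not necessarily what the poster's app does: the scoring rule (shared words divided by prompt length) and the 0.6 cutoff are assumptions chosen for illustration.

```python
import re
from collections import Counter


def words_of(text):
    """Lowercase text and split it into words, dropping punctuation."""
    normalized = text.lower().replace("\u2019", "'")
    return re.findall(r"[a-z']+", normalized)


def overlap_score(hypothesis, prompt):
    """Fraction of the prompt's words that also appear in the hypothesis."""
    hyp = Counter(words_of(hypothesis))
    ref = Counter(words_of(prompt))
    if not ref:
        return 0.0
    shared = sum((hyp & ref).values())  # multiset intersection ignores order
    return shared / sum(ref.values())


def best_prompt(hypothesis, prompts, min_score=0.6):
    """Return the best-matching prompt, or None if nothing scores high enough."""
    if not prompts:
        return None
    score, prompt = max(
        ((overlap_score(hypothesis, p), p) for p in prompts),
        key=lambda pair: pair[0],
    )
    return prompt if score >= min_score else None
```

With this scoring, "Molly Hello, how you doing today?" shares six of the seven words in "Hello Molly, how are you doing today?", so it still matches even though a strict word-for-word comparison would reject it.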

    Thanks for the tip on the vadThreshold, that might be just what I need.

    #1025012
    Halle Winkler
    Politepix

    You’re welcome, I hope it’s a helpful addition to your toolkit.

    #1025013
    jugg1es
    Participant

    Yea, that worked great. I’m really surprised you don’t charge at all for OpenEars and the support you give on these forums. Do you have a ‘donate’ button anywhere?

    #1025053
    Halle Winkler
    Politepix

    I’m glad that helped! No worries – other people with similar questions will find this discussion or I can point them to it, so it provides a useful support resource, and it also might be the way they find out about a plugin that solves a problem for them, so it’s fine. If you want to do something nice, it’s always very helpful to get a shoutout here or there (Twitter, blog posts, whatever) so folks know that you’re having a good experience with the SDK.
