Reply To: Recognizer must be restarted after long utterances

February 26, 2015 at 9:28 pm #1025009

Participant

My app uses speech recognition to let the user hold a conversation with a simulated person for training purposes, such as training someone to interview for a job. The user has possibly dozens of prompts to choose from; I break each one into separate words and put each word into the language model. When the recognizer returns, I analyze the hypothesis and work out which prompt the user was actually trying to say. I did it this way (as opposed to entering each prompt in its whole form into the model) for lots of reasons, the main one being that people often don’t read exactly what’s on the screen. It works great.
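As a rough illustration of the matching step described above (a sketch under assumptions, not the actual app code and not an OpenEars API): the function names, the word-overlap score, and the `min_score` cutoff are all hypothetical choices made for the example.

```python
import re

def tokenize(text):
    """Lowercase a string and split it into words, dropping punctuation."""
    return re.findall(r"[a-z']+", text.lower())

def build_vocabulary(prompts):
    """Collect the unique words across all prompts (the words that would go into the language model)."""
    return sorted({word for prompt in prompts for word in tokenize(prompt)})

def best_prompt(hypothesis, prompts, min_score=0.5):
    """Return (prompt, score) for the prompt sharing the most words with the hypothesis.

    The score is the fraction of the prompt's distinct words that were heard.
    Returns (None, score) when no prompt reaches min_score (an assumed cutoff).
    """
    heard = set(tokenize(hypothesis))
    best, best_score = None, 0.0
    for prompt in prompts:
        words = set(tokenize(prompt))
        if not words:
            continue
        score = len(words & heard) / len(words)
        if score > best_score:
            best, best_score = prompt, score
    if best_score < min_score:
        return None, best_score
    return best, best_score

prompts = [
    "Tell me about your last job",
    "Why do you want to work here",
    "What are your strengths",
]
vocabulary = build_vocabulary(prompts)
match, score = best_prompt("uh why you want work here", prompts)
```

Scoring by word overlap rather than exact string match is what tolerates users who skip or reorder words, at the cost of accepting more hypotheses.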

This means that there are a lot of possible things the recognizer can return as a hypothesis. Since it’s a training tool, there is often more than one person using it at once, and it’s also common for it to be used in a room with lots of other people talking. It would be nice if there were, for example, an input level threshold for when the recognizer thinks it’s being spoken to, so it could tell whether the user is speaking right at the device or whether it’s picking up someone across the room.

Or, if a lot of the spoken words aren’t in the language model and are interspersed between words that ARE in the model, it would be nice if the recognizer could detect that.

I tried Rejecto, but it was too strict and it wouldn’t return recognition events when it should have.
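To make the input level threshold idea concrete, here is a minimal sketch of what such a gate might look like. It assumes access to raw 16-bit PCM samples for an utterance, which may or may not be available from the recognizer; the function names and the -30 dBFS threshold are hypothetical and would need tuning per device and room.

```python
import math

def rms_dbfs(samples, full_scale=32768.0):
    """RMS level of 16-bit PCM samples in dB relative to full scale (0 dBFS is the maximum)."""
    if not samples:
        return float("-inf")
    mean_square = sum(s * s for s in samples) / len(samples)
    if mean_square == 0:
        return float("-inf")
    return 20.0 * math.log10(math.sqrt(mean_square) / full_scale)

def is_addressed_to_device(samples, threshold_dbfs=-30.0):
    """Treat an utterance as directed at the device only if its RMS level reaches the threshold.

    threshold_dbfs is an assumed value for illustration, not a recommended setting.
    """
    return rms_dbfs(samples) >= threshold_dbfs

near_speaker = [16384, -16384] * 50   # loud signal, roughly -6 dBFS
far_speaker = [100, -100] * 50        # quiet signal, roughly -50 dBFS
```

With a gate like this, `is_addressed_to_device(near_speaker)` passes and `is_addressed_to_device(far_speaker)` does not, so a hypothesis from someone across the room could be discarded before any prompt matching happens.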