Live audio match and scoring

Tagged: match recognition live comparison to wav file recording

This topic has 5 replies, 2 voices, and was last updated 8 years, 5 months ago by Halle Winkler.



Author

Posts
November 13, 2015 at 4:43 am #1027302

jkauffman24
Participant

I am looking to build an application that takes a recording created by the user (presumably with SaveThatWave) and then compares it in real time against live speech (presumably with RapidEars) to find a match. When a match is found, I want to increment a score.

Looking through the forum, I saw that I can use runRecognitionOnWavFileAtPath:usingLanguageModelAtPath:dictionaryAtPath:languageModelIsJSGF: – but how do I do the comparison? Or does that method itself run the real-time comparison of the WAV file against the live speech? Just looking for some more context or directional help.

Thanks!

November 13, 2015 at 11:21 am #1027304

Halle Winkler
Politepix

Welcome,

Can you describe the intended implementation and results in a bit more detail? OpenEars isn’t designed to do comparisons between audio files so I think I’m not quite understanding the goal yet.

The aspect that I’m least clear about is the time or ordering relationship between 1) using SaveThatWave to make a recording (by definition not occurring in real time since the recording can’t exist until the utterance is fully complete, and presumably not the recording from the realtime session because that one is intended to run a comparison with WAV produced from this phase) and 2) the realtime recognition (happening while a mic utterance is in progress, so not using the WAV recording) and 3) running WAV-based recognition on the original WAV (happening in a third time period altogether since it requires the existence of a fully-recorded WAV and requires that realtime recognition isn’t in progress).

I’ll be happy to give some advice if I can, once I get the goal a little better. Or maybe there is a simpler way to accomplish the underlying speech interface task, if you want to elaborate on that.

November 13, 2015 at 8:22 pm #1027306

jkauffman24
Participant

The intent is to use SaveThatWave to pre-record a specific word. Then, the app would “listen” to a live conversation and increment a score each time it hears a match.

For example, the app prompts me to record myself saying ‘openears’, and then during a subsequent conversation (which starts when the user taps a Start button), the app gives me a point every time I say ‘openears’.

November 13, 2015 at 8:28 pm #1027307

Halle Winkler
Politepix

Why not detect whether the user has said “OpenEars” by creating a language model and dictionary containing the word “OpenEars”?

November 15, 2015 at 3:07 am #1027315

jkauffman24
Participant

I didn’t realize there was an option that was that simple. So you’re saying that I can create a custom dictionary for whatever words I want to match against, and the user doesn’t have to provide an audio sample of their own? What are the accuracy rates across different accents?

After it finds a match, how do I programmatically track when the words are said?

November 15, 2015 at 11:12 am #1027316

Halle Winkler
Politepix

Hi,

Correct, easily making statistical speech models and grammars for you is one of the most important functions of the framework. Take a look at the sample app and run through the tutorial here, which should answer your questions as well as provide you with a starting custom implementation: https://politepix.com/openears/tutorial

If you have the same (or new) questions afterwards I’ll be happy to answer them.
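
For readers wondering about the scoring question above, here is a minimal, language-agnostic sketch of the counting step, written in Python rather than against the OpenEars API. It assumes the app receives each recognized utterance as a plain text hypothesis string, one final string per utterance. The names `count_matches`, `ScoreKeeper`, and `on_hypothesis` are hypothetical and are not part of OpenEars or RapidEars. If the hypotheses are cumulative partial results instead, counting this way would score the same word more than once.

```python
import re


def count_matches(hypothesis: str, target: str) -> int:
    """Count whole-word, case-insensitive occurrences of target in one hypothesis."""
    # Split the hypothesis into word tokens so "OPENEARSX" does not match "OPENEARS".
    words = re.findall(r"[A-Za-z0-9']+", hypothesis.upper())
    return words.count(target.upper())


class ScoreKeeper:
    """Hypothetical helper that keeps a running score for one target word."""

    def __init__(self, target: str) -> None:
        self.target = target
        self.score = 0

    def on_hypothesis(self, hypothesis: str) -> int:
        # Assumption: called once per finished utterance with its final text.
        self.score += count_matches(hypothesis, self.target)
        return self.score


keeper = ScoreKeeper("openears")
for text in ["HELLO OPENEARS", "OPENEARS AND OPENEARS", "NOTHING HERE"]:
    keeper.on_hypothesis(text)
print(keeper.score)
```

In an actual app, the equivalent of `on_hypothesis` would be called from whatever callback delivers the recognized text; the tutorial linked above covers that part.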