Pronunciation Scoring algorithm comparison


Viewing 5 posts - 1 through 5 (of 5 total)

  • #1015785
    tmtariq
    Participant

    Hello, I am just curious to know whether the pronunciation scoring algorithm used in OpenEars is the same as the one in the following Python project.

    http://cmusphinx.sourceforge.net/2012/08/gsoc-2012-pronunciation-evaluation-using-cmusphinx-%E2%80%93-project-conclusions/
    http://cmusphinx.svn.sourceforge.net/viewvc/cmusphinx/branches/speecheval/ronanki/

    But the scoring results from the Python project seem more accurate. Can I use this Python project with OpenEars in an iOS application?

    Thanks!

    #1015799
    Halle Winkler
    Politepix

    Welcome,

    Sorry, I don’t know which algorithm they used or whether it’s the same. Without a direct comparison on identical data, and information on what the actual results are to compare it with, I also can’t say whether there is a difference in scoring accuracy. There is no Python support in iOS, so I think your only option would be to port it to C/C++/Objective-C if you discover that it is a better algorithm.

    #1015800
    tmtariq
    Participant

    Thank you,

    The Python code computes the pronunciation score as described below. I just want to know whether your scoring routines work in a similar way (they don’t need to be exactly the same). If they do, I will not need to port the Python code (not even a single line), which will save me a lot of time.

    Python scoring routines (text-dependent):
    This method is based on exemplars for each phrase. First, the mean acoustic score and mean duration, along with their deviations, are calculated for each phone in the phrase from the exemplar recordings (the acoustic score for each phoneme in an audio file can be calculated using pocketsphinx). Then, given the test recording, each phone in the phrase is compared with the exemplar statistics: z-scores are calculated and then normalized using the maximum and minimum z-scores from the exemplar recordings. All phone scores are aggregated to get a word score, and all word scores are then aggregated with a POS (part-of-speech) weight to get the complete phrase score.
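The steps above can be sketched roughly as follows. This is only an illustration of the description, not the project's actual code: the function names, the `(acoustic_score, duration)` tuple layout, the use of plain averages for aggregation, and the choice to penalize duration deviations in both directions are all assumptions made for the example.

```python
from statistics import mean, pstdev


def z_score(value, values):
    """Standard score of `value` relative to the exemplar `values`."""
    sd = pstdev(values)
    return 0.0 if sd == 0 else (value - mean(values)) / sd


def normalized_score(value, exemplar_values, two_sided=False):
    """Min-max normalize a z-score against the exemplars' own z-scores.

    Assumption: the result is clamped to [0, 1]. With two_sided=True,
    deviations in either direction lower the score (used for duration).
    """
    transform = (lambda z: -abs(z)) if two_sided else (lambda z: z)
    z = transform(z_score(value, exemplar_values))
    ex = [transform(z_score(v, exemplar_values)) for v in exemplar_values]
    lo, hi = min(ex), max(ex)
    if hi == lo:  # all exemplars identical: no range to normalize against
        return 1.0 if z >= hi else 0.0
    return min(1.0, max(0.0, (z - lo) / (hi - lo)))


def phone_score(test_phone, exemplar_phones):
    """Score one phone; each phone is an (acoustic_score, duration) tuple."""
    ac = normalized_score(test_phone[0], [p[0] for p in exemplar_phones])
    du = normalized_score(test_phone[1], [p[1] for p in exemplar_phones],
                          two_sided=True)
    return (ac + du) / 2  # assumption: equal weight for both features


def word_score(test_phones, exemplars_per_phone):
    """Assumption: a word score is the plain mean of its phone scores."""
    return mean([phone_score(t, e)
                 for t, e in zip(test_phones, exemplars_per_phone)])


def phrase_score(word_scores, pos_weights):
    """Weighted mean of word scores using (hypothetical) POS weights."""
    return sum(s * w for s, w in zip(word_scores, pos_weights)) / sum(pos_weights)


# Hypothetical data: one phone seen in three exemplar recordings,
# as (acoustic score, duration in frames).
exemplars = [(-100, 10), (-120, 12), (-110, 11)]
score = phone_score((-110, 11), exemplars)
```

In this sketch the per-phone acoustic scores and durations would come from a forced alignment produced by the recognizer; how they are extracted is outside the example.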

    You can find more details here:
    http://goo.gl/cvaAH

    Regards,

    #1015801
    tmtariq
    Participant

    Also, please let me know whether OpenEars does word-based recognition or phoneme-based recognition.

    Regards,

    #1015802
    Halle Winkler
    Politepix

    I don’t have time right now to read the full project notes, but that project refers to new ground for the engine: pronunciation scoring is a different task than word scoring. I don’t know of any recognition engines that do phoneme recognition with pronunciation scoring as a default function; the ones I know of do word recognition, which can be repurposed for pronunciation scoring. My understanding is that Sphinx 3 has the most functionality for delivering recognition as phonemes rather than words.
