NeatSpeech British pronunciations



  • Author
    Posts
  • #1031975

    zammer
    Participant

    I’m trying to use NeatSpeech for a UK-oriented product but I keep finding pronunciations that are miles away from the real word. My guess is that the CMU phonemes don’t translate brilliantly in some cases. I also noticed that the cmudict is version 0.4 rather than the current 0.7. Do you know how to update the dictionary or where to source a British-accented one?

    I found https://github.com/rhdunn/cmudict-tools – would it be a Festival format?

    Thanks

    #1031976

    Halle Winkler
    Politepix

    Hi,

    There are two issues – one is how the phonemes are said (this should be correctly handled by the UK voices) and the other is which phonemes the local pronunciation contains and/or which syllables are accented (this can be quite different, for instance in the words aluminum or garage). The CMU dictionary is a US speech dictionary, so as far as I know there is no version of it which will prefer UK pronunciations over US ones. It sounds to me like your issue is with the latter case, is that correct?

    #1031977

    zammer
    Participant

    Hi,

    Yes, I think you are correct that it is the second issue. I edited a couple of particularly bad words:

    E.g. Awkward went from:
    ("awkward" nil (((ax) 0) ((k w er d) 0)))
    to
    ("awkward" nil (((ao r) 1) ((k w er d) 0)))
    and was much improved.

    Do you know of a version of the CMU dictionary with UK pronunciations at all?

    Thanks,

    Martin

    #1031978

    zammer
    Participant

    Sorry, you answered my question already…

    My only clue is that apparently there is a conversion table to “en-GB-x-rp” for CMU but I couldn’t get any further than that.

    Martin

    #1031981

    Halle Winkler
    Politepix

    My strong suspicion is that that table is for converting a voice that uses US phonemes to sound like received pronunciation, because that could be done tolerably by a table (e.g. “er” at the end of a word always sounds like a US “ah”), while converting words which actually have different pronunciations would have to be a long list of exceptional cases, including different accented syllables.
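
    To illustrate the distinction drawn above, here is a minimal Python sketch of what a table-driven accent conversion could look like. It is only an assumption about the general shape of such a table: the mapping of word-final "er" to "ax" and the function name apply_accent_table are hypothetical and are not taken from any real "en-GB-x-rp" table or from NeatSpeech.

```python
# Hypothetical rule table: remap a word-final phoneme to an accent-specific one.
# The "er" -> "ax" entry is illustrative only, not from a real conversion table.
FINAL_PHONEME_MAP = {"er": "ax"}


def apply_accent_table(phonemes, final_map=FINAL_PHONEME_MAP):
    """Return a copy of `phonemes` with only the word-final phoneme remapped."""
    if not phonemes:
        return []
    converted = list(phonemes)
    last = converted[-1]
    # Phonemes not listed in the table are left unchanged.
    converted[-1] = final_map.get(last, last)
    return converted


if __name__ == "__main__":
    # Word-final "er" is remapped; a non-final "er" is untouched.
    print(apply_accent_table(["b", "ah", "t", "er"]))
    print(apply_accent_table(["er", "n"]))
```

    A rule like this can only change how an existing phoneme sequence is rendered. It cannot produce a different phoneme sequence or a different stressed syllable for a particular word, which is why word-level differences would need a separate list of exceptions.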

    #1031982

    Halle Winkler
    Politepix

    When I’ve had problems like these (things I wanted to fix by hand which were too many and too distributed across the language), this is how I’ve handled it. 1) I’ve searched for some canonical list of $WORDS, where in this case they are the list of words pronounced differently in US and UK English at the word level, 2) got a list of the 5,000 most-used words in the language overall, and 3) taken the intersection of these two lists. At that point you may have a list that is short enough, yet relevant enough, that changing the entries manually is not too terrible a job. If it’s still too much you can reduce 5,000 to something smaller, or increase it if you discover the intersection contains fewer common words than you expected.
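
    The three steps above can be sketched in Python roughly as follows. This is a sketch under assumptions: the function names, the one-word-per-line input format, and the sample words are all hypothetical, and you would need to source the two real word lists yourself.

```python
def load_words(lines):
    """Normalise lines into a set of lowercase words, skipping blank lines."""
    return {line.strip().lower() for line in lines if line.strip()}


def words_to_fix(differing, frequency_ranked, top_n=5000):
    """Intersect the US/UK-differing words with the top_n most frequent words.

    `differing` is an iterable of words pronounced differently in the UK.
    `frequency_ranked` is an iterable of words, most frequent first.
    The result keeps frequency order, so the most common words come first.
    """
    differing_set = load_words(differing)
    ranked, seen = [], set()
    for line in frequency_ranked:
        word = line.strip().lower()
        if word and word not in seen:
            seen.add(word)
            ranked.append(word)
    return [word for word in ranked[:top_n] if word in differing_set]


if __name__ == "__main__":
    # Tiny made-up inputs, for illustration only.
    differing = ["Garage", "aluminum", "schedule"]
    ranked = ["the", "garage", "of", "schedule", "zebra"]
    print(words_to_fix(differing, ranked, top_n=4))
```

    Lowering top_n shrinks the list of entries to edit by hand, and raising it widens the list if the intersection turns out smaller than expected.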
