Large Number Grammar

Viewing 12 posts - 1 through 12 (of 12 total)

  • Author
    Posts
  • #13072
    radox1
    Participant

    I am working on a financial application where I would like the user to be able to input large numbers by voice. For example, I would like a user to be able to input their salary as “twenty eight thousand five hundred” rather than “two eight five zero zero”.

    I have looked around online for a number grammar which can support this but I have been unable to find one. As I imagine this is a common requirement I thought a grammar for this would be readily available. Could someone please point me in the right direction?

    Thanks in advance.

    #13073
    Halle Winkler
    Politepix

    Hello,

    I’m not aware of a pre-rolled grammar for large numbers, sorry. I generally recommend not using JSGF due to slow performance and what seems like slightly buggy recognition in the engine. Have you tried generating a text corpus of number words and creating your own ARPA language model (like in this blog post: https://www.politepix.com/2012/11/02/openears-tips-1-create-a-language-model-before-runtime-from-a-text-file/)?
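
A text corpus of number words does not have to be typed by hand. The following is a minimal Python sketch, not part of OpenEars, that spells out integers from 1 to 999,999 and samples some of them into corpus lines. The names `spell` and `build_corpus`, the upper-case output, the optional “AND”, and the sample size are all assumptions made for illustration; the resulting lines would still need to go through whatever language model tool you use.

```python
import random

ONES = ["", "ONE", "TWO", "THREE", "FOUR", "FIVE", "SIX", "SEVEN", "EIGHT",
        "NINE", "TEN", "ELEVEN", "TWELVE", "THIRTEEN", "FOURTEEN", "FIFTEEN",
        "SIXTEEN", "SEVENTEEN", "EIGHTEEN", "NINETEEN"]
TENS = ["", "", "TWENTY", "THIRTY", "FORTY", "FIFTY", "SIXTY", "SEVENTY",
        "EIGHTY", "NINETY"]


def below_thousand(n, use_and=True):
    """Return the words for 1..999 as a list."""
    words = []
    hundreds, rest = divmod(n, 100)
    if hundreds:
        words += [ONES[hundreds], "HUNDRED"]
        if rest and use_and:
            words.append("AND")  # British style: "NINE HUNDRED AND EIGHTY ONE"
    if rest >= 20:
        tens, ones = divmod(rest, 10)
        words.append(TENS[tens])
        if ones:
            words.append(ONES[ones])
    elif rest:
        words.append(ONES[rest])
    return words


def spell(n, use_and=True):
    """Spell out an integer in the range 1..999,999 as upper-case words."""
    if not 0 < n < 1_000_000:
        raise ValueError("only 1..999,999 is handled in this sketch")
    thousands, rest = divmod(n, 1000)
    words = []
    if thousands:
        words += below_thousand(thousands, use_and) + ["THOUSAND"]
        if 0 < rest < 100 and use_and:
            words.append("AND")  # "ONE THOUSAND AND FIVE"
    if rest:
        words += below_thousand(rest, use_and)
    return " ".join(words)


def build_corpus(sample_size, seed=0):
    """Return spelled-out lines for a random sample of numbers."""
    rng = random.Random(seed)
    numbers = rng.sample(range(1, 1_000_000), sample_size)
    return [spell(n) for n in numbers]
```

Writing `"\n".join(build_corpus(5000))` to a text file would give one number phrase per line; whether a sample or the full range works better for recognition is something to test rather than assume.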

    #13074
    radox1
    Participant

    Hi Halle,

    Thanks for the link. The text corpus to detect all of the possible numbers is going to be fairly large. Do you have any advice on then going back from the recognised strings to numbers?

    Ben

    #13075
    Halle Winkler
    Politepix

    I’ve never thought about this task, so this is not coming from a position of experience with it. If the maximum is (for instance) 999,999, it seems to me that it would need [0-9], a set of tens incrementing by ten up to “90”, a set of hundreds incrementing by 100 up to “900”, and a set of thousands incrementing by 1000 up to “9000”: a model with a base set of roughly 40 unigrams, each with an equal probability of being found in a particular bigram or trigram. Out of that you can make 999,999 with the available words “nine hundred”, “ninety”, “nine thousand”, “nine hundred”, “ninety”, “nine”. It seems that it should be possible to construct a ruleset for interpreting this back into digits, since there are only a few variations on the correct statement of a number in English. I can also see why you would want a grammar, however: rules-based recognition is something you can be more confident about processing backwards into digits.

    #13077
    radox1
    Participant

    I have tried to implement something similar and it seems to be working fairly well.

    I have included “and” as this is often used within numbers, e.g. “nine hundred and eighty one”.

    One issue I am having is that “thirty”, “fifty” and “eighty” are often wrongly identified as each other.

    I will try adding “one hundred”, “two hundred” … into the grammar as this should make it slightly easier to parse.

    –Current grammar—

    ONE
    TWO
    THREE
    FOUR
    FIVE
    SIX
    SEVEN
    EIGHT
    NINE
    TEN
    ELEVEN
    TWELVE
    THIRTEEN
    FOURTEEN
    FIFTEEN
    SIXTEEN
    SEVENTEEN
    EIGHTEEN
    NINETEEN
    TWENTY
    THIRTY
    FORTY
    FIFTY
    SIXTY
    SEVENTY
    EIGHTY
    NINETY
    HUNDRED
    THOUSAND
    MILLION
    POUND
    PEE
    PENCE
    AND
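
Going back from a recognised string to a number can be done with a small ruleset over the vocabulary above. This is a rough Python sketch under a few assumptions: the hypothesis arrives as space-separated words, “AND” carries no value and can be skipped, and the currency words (POUND, PEE, PENCE) have already been stripped. The function name `words_to_int` is made up for illustration, and the sketch does not check that the words come in a sensible order.

```python
UNITS = {
    "ONE": 1, "TWO": 2, "THREE": 3, "FOUR": 4, "FIVE": 5, "SIX": 6,
    "SEVEN": 7, "EIGHT": 8, "NINE": 9, "TEN": 10, "ELEVEN": 11,
    "TWELVE": 12, "THIRTEEN": 13, "FOURTEEN": 14, "FIFTEEN": 15,
    "SIXTEEN": 16, "SEVENTEEN": 17, "EIGHTEEN": 18, "NINETEEN": 19,
}
TENS = {
    "TWENTY": 20, "THIRTY": 30, "FORTY": 40, "FIFTY": 50,
    "SIXTY": 60, "SEVENTY": 70, "EIGHTY": 80, "NINETY": 90,
}
SCALES = {"THOUSAND": 1_000, "MILLION": 1_000_000}
IGNORED = {"AND"}


def words_to_int(phrase):
    """Convert e.g. 'TWENTY EIGHT THOUSAND FIVE HUNDRED' to 28500."""
    total = 0     # value of the groups already closed by THOUSAND / MILLION
    current = 0   # value of the group being built (stays below 1000)
    seen_number_word = False
    for word in phrase.upper().split():
        if word in IGNORED:
            continue
        if word in UNITS:
            current += UNITS[word]
        elif word in TENS:
            current += TENS[word]
        elif word == "HUNDRED":
            # A bare "HUNDRED" is read as one hundred.
            current = max(current, 1) * 100
        elif word in SCALES:
            # Close the current group and scale it up.
            total += max(current, 1) * SCALES[word]
            current = 0
        else:
            raise ValueError(f"unexpected word: {word}")
        seen_number_word = True
    if not seen_number_word:
        raise ValueError("no number words found")
    return total + current
```

Because nothing here validates word order, a misrecognition such as “TWENTY THIRTY” would silently come out as 50, so it is probably worth rejecting hypotheses that do not match an expected pattern before trusting the result.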

    #13078
    Halle Winkler
    Politepix

    Looks like a good start. There might be an accent bias hurting accuracy, since the default acoustic model is built from US speech. You might want to adapt the model to a variety of UK accents using your number set as the speech corpus. This may get you some improvement with the thirty/fifty/eighty issue.

    #13080
    radox1
    Participant

    Halle, how would I go about using my number set as a speech corpus?

    #13081
    Halle Winkler
    Politepix

    To learn how an acoustic model is adapted, you probably want to check out the CMU Sphinx project. Since adaptation isn’t part of OpenEars, it isn’t something I can support from here beyond pointing you to the docs at the CMU project: http://cmusphinx.sourceforge.net/wiki/tutorialadapt

    The corpus of speech you would want to use in order to adapt to a UK accent for your particular application would have a number of different speakers with the desired UK accents saying the words for which you want more accuracy (I would have them say all of the words in your language model). Basically you will want to make recordings of your speakers saying the words and then you will use the acoustic model adaptation method linked above to integrate their speech into the acoustic model. The result ought to be that your adapted acoustic model will get better at recognizing/distinguishing between those words in the accents you include. The acoustic model you end up with can be used with OpenEars just like the default acoustic model.

    #13082
    radox1
    Participant

    Thanks for the link. I will definitely look into that!

    One more thing. Is there a way to queue things to be spoken?

    Currently, if I request the fliteController to say something whilst it is already talking, it ignores it. Ideally I’d like it to queue the request and start it when the previous speech has stopped. Will I need to manually implement this behaviour?

    #13083
    Halle Winkler
    Politepix

    This isn’t a feature of FliteController, but NeatSpeech operates with a queue and it renders the new speech in the background so that it generally starts playing instantly when the previous speech is complete, and it has a male and female UK voice.
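
For anyone who does want to implement the queueing manually, the pattern is roughly the following. This is a language-neutral sketch written in Python, not OpenEars code: `speak` is a hypothetical stand-in for whatever call starts speech and returns immediately, and `on_finished` is assumed to be invoked from whatever finished-speaking callback the speech engine provides. It also assumes all calls happen on the same thread.

```python
from collections import deque


class SpeechQueue:
    """Queue utterances and start each one when the previous one ends."""

    def __init__(self, speak):
        self._speak = speak        # hypothetical: starts speech, returns at once
        self._pending = deque()    # utterances waiting for their turn
        self._speaking = False

    def say(self, text):
        """Speak now if idle, otherwise queue the request."""
        if self._speaking:
            self._pending.append(text)
        else:
            self._start(text)

    def on_finished(self):
        """Call this when the engine reports that speech has finished."""
        if self._pending:
            self._start(self._pending.popleft())
        else:
            self._speaking = False

    def _start(self, text):
        self._speaking = True
        self._speak(text)
```

Unlike the behaviour described for NeatSpeech, this does nothing to render the next utterance in the background, so there may be a gap between items.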

    #1020916
    Halle Winkler
    Politepix

    Please check out the new dynamic grammar generation feature for OpenEars added with version 1.7: https://www.politepix.com/2014/04/10/openears-1-7-introducing-dynamic-grammar-generation/

    #1021025
    Halle Winkler
    Politepix

    In addition to the dynamic grammar generation that has been added to stock OpenEars in version 1.7, there is also a new plugin called RuleORama which can use the same API in order to generate grammars which are a bit faster and compatible with RapidEars: https://www.politepix.com/ruleorama
