Can utterances only bring back what is in dictionary?

  • #1032011
    jeffbonasso
    Participant

Hello. It seems that if I have a dictionary like the following:

[
"GO BACK",
"READ CURRENT ITEM",
"READ ITEM",
"CHECK",
"CHECK ITEM",
"NEXT",
"NEXT ITEM",
"SKIP",
"SKIP ITEM",
"LOCATE",
"LOCATE ITEM",
"EMERGENCY",
"EMERGENCY LIST",
"OPEN COMMENTS",
"CLOSE COMMENTS",
"SHOW COMMENTS",
"HIDE COMMENTS",
"OPEN NOTES",
"CLOSE NOTES",
"SHOW NOTES",
"HIDE NOTES",
"OPEN LOGBOOK",
"CLOSE LOGBOOK",
"SHOW LOGBOOK",
"HIDE LOGBOOK",
"OPEN CLOCK",
"CLOSE CLOCK",
"SHOW CLOCK",
"HIDE CLOCK",
"NEXT LIST",
"PREVIOUS LIST",
"NEXT SECTION",
"PREVIOUS SECTION",
"SPEAK",
"SILENCE",
"READ SECTION",
"READ CURRENT SECTION",
"CHECK SECTION",
"CHECK CURRENT SECTION",
"OPEN NAV OVERLAY",
"CLOSE NAV OVERLAY",
"SHOW NAV OVERLAY",
"HIDE NAV OVERLAY",
"QUIDNUNC"
];

…then when it is listening, it returns utterances that aren't in this dictionary, like "BACK", "CURRENT", "NAV NOTES", and "NAV ITEM NO OUT".

Is there a way to have it return only utterances that are in the dictionary, or to weight the ones in the dictionary higher than ones that aren't? In my case it returns an utterance like "BACK" when I say "CHECK", but I don't have "BACK" in the dictionary by itself; I specifically added "GO BACK" to try to keep it from making the wrong choice.

    #1032012
    Halle Winkler
    Politepix

    Welcome,

Yes, take a look in the docs for information about grammars in OpenEars (versus language models, which you are using above). After looking into that and trying it out, you may possibly also want to investigate RuleORama in case you need the grammar approach in realtime.
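As a very rough sketch of the difference (abbreviated; the grammar keys and generator methods are covered in the docs):

OELanguageModelGenerator *generator = [[OELanguageModelGenerator alloc] init];
NSString *acousticModelPath = [OEAcousticModel pathToModel:@"AcousticModelEnglish"];

// A statistical language model (what you are using above) is generated from a
// flat array and is permissive: it can hypothesize combinations you never listed.
NSError *lmError = [generator generateLanguageModelFromArray:@[@"GO BACK", @"CHECK ITEM"] withFilesNamed:@"MyLanguageModel" forAcousticModelAtPath:acousticModelPath];

// A grammar is a ruleset that only permits the phrases you define.
NSError *grammarError = [generator generateGrammarFromDictionary:@{ThisWillBeSaidOnce : @[@{OneOfTheseWillBeSaidOnce : @[@"GO BACK", @"CHECK ITEM", @"NEXT ITEM"]}]} withFilesNamed:@"MyGrammar" forAcousticModelAtPath:acousticModelPath];

When you start listening with a grammar, pass languageModelIsJSGF:TRUE to startListeningWithLanguageModelAtPath:dictionaryAtPath:acousticModelAtPath:languageModelIsJSGF:.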

    #1032013
    jeffbonasso
    Participant

Thanks! I am using Rejecto, which seems to have only the method…

    generateRejectingLanguageModelFromArray

    Is there a similar one for generating a grammar while using Rejecto?

    #1032014
    Halle Winkler
    Politepix

    No, Rejecto works with language models only. I would first start with the stock OpenEars grammar methods and then check out RuleORama if you need to use that grammar approach in realtime.

    #1032022
    jeffbonasso
    Participant

Thanks Halle. I am now using a grammar instead of a language model, and when I say the phrases in the dictionary it is far more accurate than before, which is great. One thing I am seeing, though, is that without Rejecto it seems overly aggressive about returning something in the dictionary even when what is being said is not even close to any of the phrases. Even when I say a single one-syllable word it will return phrases. Are there any strategies to mimic what Rejecto does, or am I missing something that would keep speech outside the dictionary from always tripping an utterance?

    #1032023
    Halle Winkler
    Politepix

    Hi,

    Make sure that the vadThreshold is set correctly.
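For example, before starting listening:

[OEPocketsphinxController sharedInstance].vadThreshold = 3.2; // the stock value is 2.3 (see -vad_threshold in logging output); raise it in small steps

There is no single correct value – it needs to be tested against your app's devices and environments.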

    #1032024
    jeffbonasso
    Participant

Yes. I put a slider in my app so I can change it at any time from 1 to 5 in 0.1 increments. At 4.2 or above it doesn't recognize most words. At 4 it usually recognizes words, but it still trips up a lot on words that are nowhere near any of the phrases.

    #1032025
    Halle Winkler
    Politepix

    OK, that’s surprising, but vadThreshold would be the available way to address this. If the utterances you are using in the grammar are particularly short, you may wish to make them a bit longer so they are more distinct from each other and less easily substituted for other utterances.

    #1032026
    jeffbonasso
    Participant

Almost every phrase is two or more words. This is one dictionary…

"AFFIRMATIVE",
"NEGATIVE",
"GO BACK",
"READ CURRENT ITEM",
"READ ITEM",
"CHECK",
"CHECK ITEM",
"NEXT",
"NEXT ITEM",
"SKIP",
"SKIP ITEM",
"LOCATE",
"LOCATE ITEM",
"EMERGENCY",
"EMERGENCY LIST",
"OPEN COMMENTS",
"CLOSE COMMENTS",
"SHOW COMMENTS",
"HIDE COMMENTS",
"OPEN NOTES",
"CLOSE NOTES",
"SHOW NOTES",
"HIDE NOTES",
"OPEN LOGBOOK",
"CLOSE LOGBOOK",
"SHOW LOGBOOK",
"HIDE LOGBOOK",
"OPEN CLOCK",
"CLOSE CLOCK",
"SHOW CLOCK",
"HIDE CLOCK",
"NEXT LIST",
"PREVIOUS LIST",
"NEXT SECTION",
"PREVIOUS SECTION",
"MIRA SILENCE",
"MIRA SHUT UP",
"MIRA BE QUIET",
"READ SECTION",
"READ CURRENT SECTION",
"CHECK SECTION",
"CHECK CURRENT SECTION",
"RESET CHECKLIST",
"RESET CURRENT CHECKLIST",
"RESET LIST",
"RESET CURRENT LIST",
"RESET SECTION",
"RESET CURRENT SECTION",
"OPEN NAV OVERLAY",
"CLOSE NAV OVERLAY",
"SHOW NAV OVERLAY",
"HIDE NAV OVERLAY",
"OPEN BIG CHECK OVERLAY",
"CLOSE BIG CHECK OVERLAY",
"SHOW BIG CHECK OVERLAY",
"HIDE BIG CHECK OVERLAY",
"QUIDNUNC",
"MIRA PREFLIGHT",
"MIRA INITIAL",
"MIRA EXTERIOR",
"MIRA INTERIOR",
"MIRA START",
"MIRA TAXI",
"MIRA RUN-UP",
"MIRA INFLIGHT",
"MIRA PRE-TAKEOFF",
"MIRA TAKEOFF",
"MIRA CLIMB",
"MIRA CRUISE",
"MIRA DESCENT",
"MIRA PRE-LANDING",
"MIRA LANDING",
"MIRA GO-AROUND",
"MIRA POSTFLIGHT",
"MIRA AFTER LANDING",
"MIRA SECURING",
"MIRA SPEEDS",
"MIRA QUICK SPEEDS",
"MIRA NORMAL OPERATION",
"MIRA REFERENCE",
"MIRA SPECS",
"MIRA FREQUENCIES",
"MIRA EMERGENCY",
"MIRA HELP",
"MIRA MAYDAY",
"MIRA POWER LOSS ON TAKEOFF",
"MIRA TAKEOFF POWER LOSS",
"MIRA POWER LOSS INFLIGHT",
"MIRA INFLIGHT POWER LOSS",
"MIRA NO RESTART WITH TIME",
"MIRA ELECTRICAL FIRE",
"MIRA ENGINE FIRE ON STARTUP",
"MIRA STARTUP ENGINE FIRE",
"MIRA ENGINE FIRE INFLIGHT",
"MIRA INFLIGHT ENGINE FIRE",
"MIRA ICING",
"MIRA EXCESS CHARGE",
"MIRA LOW VOLTAGE",
"MIRA RADIO OUT"

I do understand I could use the following format to build more of a hierarchy, but I am not sure whether doing that will improve the results…

@{
    ThisWillBeSaidOnce : @[
        @{OneOfTheseCanBeSaidOnce : @[@"HELLO COMPUTER", @"GREETINGS ROBOT"]},
        @{OneOfTheseWillBeSaidOnce : @[@"DO THE FOLLOWING", @"INSTRUCTION"]},
        @{OneOfTheseWillBeSaidOnce : @[@"GO", @"MOVE"]},
        @{ThisWillBeSaidWithOptionalRepetitions : @[
            @{OneOfTheseWillBeSaidOnce : @[@"10", @"20", @"30"]},
            @{OneOfTheseWillBeSaidOnce : @[@"LEFT", @"RIGHT", @"FORWARD"]}
        ]},
        @{OneOfTheseWillBeSaidOnce : @[@"EXECUTE", @"DO IT"]},
        @{ThisCanBeSaidOnce : @[@"THANK YOU"]}
    ]
};

    #1032027
    Halle Winkler
    Politepix

    Hi,

    Only the part at the end of your previous post beginning with ThisWillBeSaidOnce is a grammar; the other is still a language model.

    #1032028
    jeffbonasso
    Participant

I just copied the strings from the debug output. I am passing those to the grammar-generation method in an NSDictionary under a ThisWillBeSaidOnce key, and setting the JSGF flag to YES when listening.
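Roughly like this, abbreviated (MyGrammar is just a placeholder name here; the full phrase list is the one above):

NSDictionary *grammar = @{ThisWillBeSaidOnce : @[
    @{OneOfTheseWillBeSaidOnce : @[@"AFFIRMATIVE", @"NEGATIVE", @"GO BACK" /* …the rest of the phrases above… */]}
]};
OELanguageModelGenerator *generator = [[OELanguageModelGenerator alloc] init];
NSError *error = [generator generateGrammarFromDictionary:grammar withFilesNamed:@"MyGrammar" forAcousticModelAtPath:[OEAcousticModel pathToModel:@"AcousticModelEnglish"]];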

    #1032029
    Halle Winkler
    Politepix

    OK, I would get rid of the one-syllable single-word entries and see if it improves.

    #1032032
    jeffbonasso
    Participant

OK, I have done that. Still very weird behavior. I am testing in pretty ideal conditions, with no background noise and a very good headset with a noise-cancelling mic. When I say the words that appear in the dictionary it works flawlessly. When I am just saying other long phrases, it trips almost every time.

    “THAT IS NOT VERY GOOD” and it came back with “NEGATIVE”
    “THIS IS A TEST” and it came back with “MIRA SECURING”
    “THIS IS A TEST” and it came back with “MIRA SPEEDS”
    “THERE REALLY IS SOMETHING WRONG” and it came back with “MIRA RADIO-OUT”
    “THAT IS WEIRD” and it came back with “READ ITEM”

    #1032033
    Halle Winkler
    Politepix

    That is unexpected, but I’m afraid I don’t have more suggestions.

    #1032034
    jeffbonasso
    Participant

Since there is no Rejecto for grammars, are there any strategies that could simulate Rejecto? One thing I just tried was adding every letter of the alphabet to the dictionary, and now almost any time it hears anything it picks one of those letters unless I say a phrase that is specifically in the dictionary, which is definitely helping a lot. A sketch of the workaround is below.

I am still confused why, when using a grammar, it always seems to want to match something in the dictionary. When I say almost anything, it now comes back with one of the letters. It seems like there should be a mechanism so that if what was said between the start of recognition and the silence delay clearly contained many syllables and words, it wouldn't match something with one syllable.
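The workaround looks roughly like this (commandPhrases here is a placeholder for the phrase list above; the letters act as catch-all entries that absorb out-of-grammar speech):

NSMutableArray *phrases = [commandPhrases mutableCopy];
for (char letter = 'A'; letter <= 'Z'; letter++) {
    [phrases addObject:[NSString stringWithFormat:@"%c", letter]]; // catch-all entry
}
NSDictionary *grammar = @{ThisWillBeSaidOnce : @[@{OneOfTheseWillBeSaidOnce : phrases}]};

…and then in the hypothesis callback I discard any single-letter result:

if (hypothesis.length == 1) return; // treat a catch-all letter as rejected speech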

    #1032035
    Halle Winkler
    Politepix

Sorry, as I said, that is an unexpected result and I don't have further suggestions for it. Take a look at the post "Please read before you post – how to troubleshoot and provide logging info" here so you can see the info needed in order to request any more in-depth troubleshooting.

    #1032186
    olegnaumenko
    Participant

I am also currently using the grammar method and having the same problem (wrong results when a phrase that is not in the dictionary is said).

Would probability/score numbers help here? Is there any way to get a confidence value for the current hypothesis/utterance? That would help greatly.

Will any of your paid plugins help improve this?

    #1032187
    Halle Winkler
    Politepix

    Welcome,

As with the post before yours, I would need logging output in order to help with this – please take a look at the link I provided in my previous response, thanks.

    #1032188
    olegnaumenko
    Participant

OpenEars 2.506, iPhone 5s, speaking into the built-in mic from an 8–10 inch distance, using grammar mode with this vocabulary:

@{OneOfTheseWillBeSaidOnce : @[@"HELLO ROBOT",
                               @"HEY THERE",
                               @"GREEN CROCODILE",
                               @"HELLO PEOPLE",
                               @"HEY YOU NERD",
                               @"EMERGENCY SITUATION"]}

When I say "I didn't say that", I get "HEY THERE". Often when I pronounce "HELLO PEOPLE" I get "HEY THERE". I understand that these sound similar. Is there a way to get the probability for a detection so I can filter out hypotheses with low credibility? Or is probability unavailable in JSGF mode?
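For reference, I am reading the hypothesis and score in the standard OEEventsObserver delegate method, as in the sample app:

- (void)pocketsphinxDidReceiveHypothesis:(NSString *)hypothesis recognitionScore:(NSString *)recognitionScore utteranceID:(NSString *)utteranceID {
    // In my JSGF tests the score always arrives as 0, as in the log below.
    NSLog(@"The received hypothesis is %@ with a score of %@ and an ID of %@", hypothesis, recognitionScore, utteranceID);
}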

The log follows:

    2018-01-03 18:40:15.732414+0200 OpenEarsTest[2672:954570] Starting OpenEars logging for OpenEars version 2.506 on 64-bit device (or build): iPhone running iOS version: 10.300000
    2018-01-03 18:40:15.756330+0200 OpenEarsTest[2672:954570] Since there is no cached version, loading the language model lookup list for the acoustic model called AcousticModelEnglish
    2018-01-03 18:40:15.813194+0200 OpenEarsTest[2672:954570] I’m done running performDictionaryLookup and it took 0.039237 seconds
    2018-01-03 18:40:15.856756+0200 OpenEarsTest[2672:954570] Creating shared instance of OEPocketsphinxController
    2018-01-03 18:40:15.873651+0200 OpenEarsTest[2672:954570] Attempting to start listening session from startListeningWithLanguageModelAtPath:
    2018-01-03 18:40:15.880222+0200 OpenEarsTest[2672:954570] User gave mic permission for this app.
    2018-01-03 18:40:15.882161+0200 OpenEarsTest[2672:954570] setSecondsOfSilence wasn’t set, using default of 0.700000.
    2018-01-03 18:40:15.883578+0200 OpenEarsTest[2672:954626] Starting listening.
    2018-01-03 18:40:15.883754+0200 OpenEarsTest[2672:954626] About to set up audio session
    2018-01-03 18:40:16.130446+0200 OpenEarsTest[2672:954626] Creating audio session with default settings.
    2018-01-03 18:40:16.130549+0200 OpenEarsTest[2672:954626] Done setting audio session category.
    2018-01-03 18:40:16.131371+0200 OpenEarsTest[2672:954638] Audio route has changed for the following reason:
    2018-01-03 18:40:16.135606+0200 OpenEarsTest[2672:954638] There was a category change. The new category is AVAudioSessionCategoryPlayAndRecord
2018-01-03 18:40:16.200989+0200 OpenEarsTest[2672:954638] This is not a case in which OpenEars notifies of a route change. At the close of this method, the new audio route will be <Input route or routes: "MicrophoneBuiltIn". Output route or routes: "Speaker">. The previous route before changing to this route was "<AVAudioSessionRouteDescription: 0x174007e80,
inputs = (
"<AVAudioSessionPortDescription: 0x174008160, type = MicrophoneBuiltIn; name = iPhone Microphone; UID = Built-In Microphone; selectedDataSource = Front>"
);
outputs = (
"<AVAudioSessionPortDescription: 0x174007d40, type = Speaker; name = Speaker; UID = Speaker; selectedDataSource = (null)>"
)>".
    2018-01-03 18:40:16.208455+0200 OpenEarsTest[2672:954626] Done setting preferred sample rate to 16000.000000 – now the real sample rate is 16000.000000
    2018-01-03 18:40:16.209213+0200 OpenEarsTest[2672:954626] number of channels is already the preferred number of 1 so not setting it.
    2018-01-03 18:40:16.210498+0200 OpenEarsTest[2672:954626] Done setting session’s preferred I/O buffer duration to 0.128000 – now the actual buffer duration is 0.128000
    2018-01-03 18:40:16.210616+0200 OpenEarsTest[2672:954626] Done setting up audio session
    2018-01-03 18:40:16.212290+0200 OpenEarsTest[2672:954638] Audio route has changed for the following reason:
    2018-01-03 18:40:16.214772+0200 OpenEarsTest[2672:954626] About to set up audio IO unit in a session with a sample rate of 16000.000000, a channel number of 1 and a buffer duration of 0.128000.
    2018-01-03 18:40:16.234177+0200 OpenEarsTest[2672:954638] There was a category change. The new category is AVAudioSessionCategoryPlayAndRecord
2018-01-03 18:40:16.239235+0200 OpenEarsTest[2672:954638] This is not a case in which OpenEars notifies of a route change. At the close of this method, the new audio route will be <Input route or routes: "MicrophoneBuiltIn". Output route or routes: "Speaker">. The previous route before changing to this route was "<AVAudioSessionRouteDescription: 0x174007d20,
inputs = (
"<AVAudioSessionPortDescription: 0x174007e90, type = MicrophoneBuiltIn; name = iPhone Microphone; UID = Built-In Microphone; selectedDataSource = Bottom>"
);
outputs = (
"<AVAudioSessionPortDescription: 0x170008640, type = Receiver; name = Receiver; UID = Built-In Receiver; selectedDataSource = (null)>"
)>".
    2018-01-03 18:40:16.248024+0200 OpenEarsTest[2672:954626] Done setting up audio unit
    2018-01-03 18:40:16.248110+0200 OpenEarsTest[2672:954626] About to start audio IO unit
    2018-01-03 18:40:16.528044+0200 OpenEarsTest[2672:954626] Done starting audio unit
    INFO: pocketsphinx.c(145): Parsed model-specific feature parameters from /var/containers/Bundle/Application/A9C4195E-904F-44F1-906A-F193DC56484D/OpenEarsTest.app/AcousticModelEnglish.bundle/feat.params
    Current configuration:
    [NAME] [DEFLT] [VALUE]
    -agc none none
    -agcthresh 2.0 2.000000e+00
    -allphone
    -allphone_ci no no
    -alpha 0.97 9.700000e-01
    -ascale 20.0 2.000000e+01
    -aw 1 1
    -backtrace no no
    -beam 1e-48 1.000000e-48
    -bestpath yes yes
    -bestpathlw 9.5 9.500000e+00
    -ceplen 13 13
    -cmn current current
    -cmninit 8.0 40
    -compallsen no no
    -debug 0
    -dict /var/mobile/Containers/Data/Application/2DD76771-D5C9-4303-AE69-7DC22AB5849E/Library/Caches/FirstGrammarModel.dic
    -dictcase no no
    -dither no no
    -doublebw no no
    -ds 1 1
    -fdict /var/containers/Bundle/Application/A9C4195E-904F-44F1-906A-F193DC56484D/OpenEarsTest.app/AcousticModelEnglish.bundle/noisedict
    -feat 1s_c_d_dd 1s_c_d_dd
    -featparams /var/containers/Bundle/Application/A9C4195E-904F-44F1-906A-F193DC56484D/OpenEarsTest.app/AcousticModelEnglish.bundle/feat.params
    -fillprob 1e-8 1.000000e-08
    -frate 100 100
    -fsg
    -fsgusealtpron yes yes
    -fsgusefiller yes yes
    -fwdflat yes yes
    -fwdflatbeam 1e-64 1.000000e-64
    -fwdflatefwid 4 4
    -fwdflatlw 8.5 8.500000e+00
    -fwdflatsfwin 25 25
    -fwdflatwbeam 7e-29 7.000000e-29
    -fwdtree yes yes
    -hmm /var/containers/Bundle/Application/A9C4195E-904F-44F1-906A-F193DC56484D/OpenEarsTest.app/AcousticModelEnglish.bundle
    -input_endian little little
    -jsgf /var/mobile/Containers/Data/Application/2DD76771-D5C9-4303-AE69-7DC22AB5849E/Library/Caches/FirstGrammarModel.gram
    -keyphrase
    -kws
    -kws_delay 10 10
    -kws_plp 1e-1 1.000000e-01
    -kws_threshold 1 1.000000e+00
    -latsize 5000 5000
    -lda
    -ldadim 0 0
    -lifter 0 22
    -lm
    -lmctl
    -lmname
    -logbase 1.0001 1.000100e+00
    -logfn
    -logspec no no
    -lowerf 133.33334 1.300000e+02
    -lpbeam 1e-40 1.000000e-40
    -lponlybeam 7e-29 7.000000e-29
    -lw 6.5 1.000000e+00
    -maxhmmpf 30000 30000
    -maxwpf -1 -1
    -mdef /var/containers/Bundle/Application/A9C4195E-904F-44F1-906A-F193DC56484D/OpenEarsTest.app/AcousticModelEnglish.bundle/mdef
    -mean /var/containers/Bundle/Application/A9C4195E-904F-44F1-906A-F193DC56484D/OpenEarsTest.app/AcousticModelEnglish.bundle/means
    -mfclogdir
    -min_endfr 0 0
    -mixw
    -mixwfloor 0.0000001 1.000000e-07
    -mllr
    -mmap yes yes
    -ncep 13 13
    -nfft 512 512
    -nfilt 40 25
    -nwpen 1.0 1.000000e+00
    -pbeam 1e-48 1.000000e-48
    -pip 1.0 1.000000e+00
    -pl_beam 1e-10 1.000000e-10
    -pl_pbeam 1e-10 1.000000e-10
    -pl_pip 1.0 1.000000e+00
    -pl_weight 3.0 3.000000e+00
    -pl_window 5 5
    -rawlogdir
    -remove_dc no no
    -remove_noise yes yes
    -remove_silence yes yes
    -round_filters yes yes
    -samprate 16000 1.600000e+04
    -seed -1 -1
    -sendump /var/containers/Bundle/Application/A9C4195E-904F-44F1-906A-F193DC56484D/OpenEarsTest.app/AcousticModelEnglish.bundle/sendump
    -senlogdir
    -senmgau
    -silprob 0.005 5.000000e-03
    -smoothspec no no
    -svspec 0-12/13-25/26-38
    -tmat /var/containers/Bundle/Application/A9C4195E-904F-44F1-906A-F193DC56484D/OpenEarsTest.app/AcousticModelEnglish.bundle/transition_matrices
    -tmatfloor 0.0001 1.000000e-04
    -topn 4 4
    -topn_beam 0 0
    -toprule
    -transform legacy dct
    -unit_area yes yes
    -upperf 6855.4976 6.800000e+03
    -uw 1.0 1.000000e+00
    -vad_postspeech 50 69
    -vad_prespeech 20 10
    -vad_startspeech 10 10
    -vad_threshold 2.0 2.300000e+00
    -var /var/containers/Bundle/Application/A9C4195E-904F-44F1-906A-F193DC56484D/OpenEarsTest.app/AcousticModelEnglish.bundle/variances
    -varfloor 0.0001 1.000000e-04
    -varnorm no no
    -verbose no no
    -warp_params
    -warp_type inverse_linear inverse_linear
    -wbeam 7e-29 7.000000e-29
    -wip 0.65 6.500000e-01
    -wlen 0.025625 2.562500e-02

    INFO: feat.c(715): Initializing feature stream to type: ‘1s_c_d_dd’, ceplen=13, CMN=’current’, VARNORM=’no’, AGC=’none’
    INFO: cmn.c(143): mean[0]= 12.00, mean[1..12]= 0.0
    INFO: acmod.c(164): Using subvector specification 0-12/13-25/26-38
    INFO: mdef.c(518): Reading model definition: /var/containers/Bundle/Application/A9C4195E-904F-44F1-906A-F193DC56484D/OpenEarsTest.app/AcousticModelEnglish.bundle/mdef
    INFO: mdef.c(531): Found byte-order mark BMDF, assuming this is a binary mdef file
    INFO: bin_mdef.c(336): Reading binary model definition: /var/containers/Bundle/Application/A9C4195E-904F-44F1-906A-F193DC56484D/OpenEarsTest.app/AcousticModelEnglish.bundle/mdef
    INFO: bin_mdef.c(516): 46 CI-phone, 168344 CD-phone, 3 emitstate/phone, 138 CI-sen, 6138 Sen, 32881 Sen-Seq
    INFO: tmat.c(206): Reading HMM transition probability matrices: /var/containers/Bundle/Application/A9C4195E-904F-44F1-906A-F193DC56484D/OpenEarsTest.app/AcousticModelEnglish.bundle/transition_matrices
    INFO: acmod.c(117): Attempting to use PTM computation module
    INFO: ms_gauden.c(198): Reading mixture gaussian parameter: /var/containers/Bundle/Application/A9C4195E-904F-44F1-906A-F193DC56484D/OpenEarsTest.app/AcousticModelEnglish.bundle/means
    INFO: ms_gauden.c(292): 1 codebook, 3 feature, size:
    INFO: ms_gauden.c(294): 512×13
    INFO: ms_gauden.c(294): 512×13
    INFO: ms_gauden.c(294): 512×13
    INFO: ms_gauden.c(198): Reading mixture gaussian parameter: /var/containers/Bundle/Application/A9C4195E-904F-44F1-906A-F193DC56484D/OpenEarsTest.app/AcousticModelEnglish.bundle/variances
    INFO: ms_gauden.c(292): 1 codebook, 3 feature, size:
    INFO: ms_gauden.c(294): 512×13
    INFO: ms_gauden.c(294): 512×13
    INFO: ms_gauden.c(294): 512×13
    INFO: ms_gauden.c(354): 0 variance values floored
    INFO: ptm_mgau.c(805): Number of codebooks doesn’t match number of ciphones, doesn’t look like PTM: 1 != 46
    INFO: acmod.c(119): Attempting to use semi-continuous computation module
    INFO: ms_gauden.c(198): Reading mixture gaussian parameter: /var/containers/Bundle/Application/A9C4195E-904F-44F1-906A-F193DC56484D/OpenEarsTest.app/AcousticModelEnglish.bundle/means
    INFO: ms_gauden.c(292): 1 codebook, 3 feature, size:
    INFO: ms_gauden.c(294): 512×13
    INFO: ms_gauden.c(294): 512×13
    INFO: ms_gauden.c(294): 512×13
    INFO: ms_gauden.c(198): Reading mixture gaussian parameter: /var/containers/Bundle/Application/A9C4195E-904F-44F1-906A-F193DC56484D/OpenEarsTest.app/AcousticModelEnglish.bundle/variances
    INFO: ms_gauden.c(292): 1 codebook, 3 feature, size:
    INFO: ms_gauden.c(294): 512×13
    INFO: ms_gauden.c(294): 512×13
    INFO: ms_gauden.c(294): 512×13
    INFO: ms_gauden.c(354): 0 variance values floored
    INFO: s2_semi_mgau.c(904): Loading senones from dump file /var/containers/Bundle/Application/A9C4195E-904F-44F1-906A-F193DC56484D/OpenEarsTest.app/AcousticModelEnglish.bundle/sendump
    INFO: s2_semi_mgau.c(928): BEGIN FILE FORMAT DESCRIPTION
    INFO: s2_semi_mgau.c(991): Rows: 512, Columns: 6138
    INFO: s2_semi_mgau.c(1023): Using memory-mapped I/O for senones
    INFO: s2_semi_mgau.c(1294): Maximum top-N: 4 Top-N beams: 0 0 0
    INFO: phone_loop_search.c(114): State beam -225 Phone exit beam -225 Insertion penalty 0
    INFO: dict.c(320): Allocating 4119 * 32 bytes (128 KiB) for word entries
    INFO: dict.c(333): Reading main dictionary: /var/mobile/Containers/Data/Application/2DD76771-D5C9-4303-AE69-7DC22AB5849E/Library/Caches/FirstGrammarModel.dic
    INFO: dict.c(213): Allocated 0 KiB for strings, 0 KiB for phones
    INFO: dict.c(336): 14 words read
    INFO: dict.c(358): Reading filler dictionary: /var/containers/Bundle/Application/A9C4195E-904F-44F1-906A-F193DC56484D/OpenEarsTest.app/AcousticModelEnglish.bundle/noisedict
    INFO: dict.c(213): Allocated 0 KiB for strings, 0 KiB for phones
    INFO: dict.c(361): 9 words read
    INFO: dict2pid.c(396): Building PID tables for dictionary
    INFO: dict2pid.c(406): Allocating 46^3 * 2 bytes (190 KiB) for word-initial triphones
    INFO: dict2pid.c(132): Allocated 51152 bytes (49 KiB) for word-final triphones
    INFO: dict2pid.c(196): Allocated 51152 bytes (49 KiB) for single-phone word triphones
    INFO: jsgf.c(691): Defined rule: <FirstGrammarModel.g00000>
    INFO: jsgf.c(691): Defined rule: PUBLIC <FirstGrammarModel.rule_0>
    INFO: fsg_model.c(215): Computing transitive closure for null transitions
    INFO: fsg_model.c(277): 0 null transitions added
    INFO: fsg_search.c(227): FSG(beam: -1080, pbeam: -1080, wbeam: -634; wip: -5, pip: 0)
    INFO: fsg_model.c(428): Adding silence transitions for <sil> to FSG
    INFO: fsg_model.c(448): Added 9 silence word transitions
    INFO: fsg_model.c(428): Adding silence transitions for <sil> to FSG
    INFO: fsg_model.c(448): Added 9 silence word transitions
    INFO: fsg_model.c(428): Adding silence transitions for [BREATH] to FSG
    INFO: fsg_model.c(448): Added 9 silence word transitions
    INFO: fsg_model.c(428): Adding silence transitions for [COUGH] to FSG
    INFO: fsg_model.c(448): Added 9 silence word transitions
    INFO: fsg_model.c(428): Adding silence transitions for [NOISE] to FSG
    INFO: fsg_model.c(448): Added 9 silence word transitions
    INFO: fsg_model.c(428): Adding silence transitions for [SMACK] to FSG
    INFO: fsg_model.c(448): Added 9 silence word transitions
    INFO: fsg_model.c(428): Adding silence transitions for [UH] to FSG
    INFO: fsg_model.c(448): Added 9 silence word transitions
    INFO: fsg_search.c(173): Added 4 alternate word transitions
    INFO: fsg_lextree.c(110): Allocated 846 bytes (0 KiB) for left and right context phones
    INFO: fsg_lextree.c(256): 134 HMM nodes in lextree (81 leaves)
    INFO: fsg_lextree.c(259): Allocated 19296 bytes (18 KiB) for all lextree nodes
    INFO: fsg_lextree.c(262): Allocated 11664 bytes (11 KiB) for lextree leafnodes
    2018-01-03 18:40:16.733045+0200 OpenEarsTest[2672:954626] There is no CMN plist so we are using the fresh CMN value 40.000000.
    2018-01-03 18:40:16.733680+0200 OpenEarsTest[2672:954626] Listening.
    2018-01-03 18:40:16.734432+0200 OpenEarsTest[2672:954626] Project has these words or phrases in its dictionary:
    CROCODILE
    EMERGENCY
    EMERGENCY(2)
    GREEN
    HELLO
    HELLO(2)
    HEY
    NERD
    PEOPLE
    ROBOT
    ROBOT(2)
    SITUATION
    THERE
    YOU
    2018-01-03 18:40:16.734542+0200 OpenEarsTest[2672:954626] Recognition loop has started
    2018-01-03 18:40:16.734957+0200 OpenEarsTest[2672:954570] Successfully started listening session from startListeningWithLanguageModelAtPath:
    2018-01-03 18:40:16.752577+0200 OpenEarsTest[2672:954570] Pocketsphinx is now listening.
    2018-01-03 18:40:17.022659+0200 OpenEarsTest[2672:954626] Speech detected…
    2018-01-03 18:40:17.023302+0200 OpenEarsTest[2672:954570] Pocketsphinx has detected speech.
    2018-01-03 18:40:18.126128+0200 OpenEarsTest[2672:954626] End of speech detected…
    2018-01-03 18:40:18.127059+0200 OpenEarsTest[2672:954570] Pocketsphinx has detected a period of silence, concluding an utterance.
    INFO: cmn_prior.c(131): cmn_prior_update: from < 40.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 >
    INFO: cmn_prior.c(149): cmn_prior_update: to < 45.87 15.93 -4.69 -5.94 -3.90 -3.12 -2.24 1.58 -1.77 -1.60 -4.82 -5.68 0.49 >
    INFO: fsg_search.c(843): 115 frames, 5325 HMMs (46/fr), 10570 senones (91/fr), 439 history entries (3/fr)

    ERROR: “fsg_search.c”, line 913: Final result does not match the grammar in frame 115
2018-01-03 18:40:18.130912+0200 OpenEarsTest[2672:954626] Pocketsphinx heard "" with a score of (0) and an utterance ID of 0.
2018-01-03 18:40:18.131202+0200 OpenEarsTest[2672:954626] Hypothesis was null so we aren't returning it. If you want null hypotheses to also be returned, set OEPocketsphinxController's property returnNullHypotheses to TRUE before starting OEPocketsphinxController.
    2018-01-03 18:40:20.958072+0200 OpenEarsTest[2672:954624] Speech detected…
    2018-01-03 18:40:20.959169+0200 OpenEarsTest[2672:954570] Pocketsphinx has detected speech.
    2018-01-03 18:40:22.609818+0200 OpenEarsTest[2672:954626] End of speech detected…
    2018-01-03 18:40:22.610542+0200 OpenEarsTest[2672:954570] Pocketsphinx has detected a period of silence, concluding an utterance.
    INFO: cmn_prior.c(131): cmn_prior_update: from < 45.87 15.93 -4.69 -5.94 -3.90 -3.12 -2.24 1.58 -1.77 -1.60 -4.82 -5.68 0.49 >
    INFO: cmn_prior.c(149): cmn_prior_update: to < 49.01 10.38 1.63 1.36 2.62 -2.81 -8.39 1.88 -3.63 -1.06 -3.82 -2.39 -1.53 >
    INFO: fsg_search.c(843): 167 frames, 11050 HMMs (66/fr), 17759 senones (106/fr), 1380 history entries (8/fr)

2018-01-03 18:40:22.614643+0200 OpenEarsTest[2672:954626] Pocketsphinx heard "HEY THERE" with a score of (0) and an utterance ID of 1.
    2018-01-03 18:40:22.634868+0200 OpenEarsTest[2672:954570] The received hypothesis is HEY THERE with a score of 0 and an ID of 1
    2018-01-03 18:40:25.315471+0200 OpenEarsTest[2672:954624] Speech detected…
    2018-01-03 18:40:25.316027+0200 OpenEarsTest[2672:954570] Pocketsphinx has detected speech.
    2018-01-03 18:40:27.083584+0200 OpenEarsTest[2672:954627] End of speech detected…
    2018-01-03 18:40:27.084296+0200 OpenEarsTest[2672:954570] Pocketsphinx has detected a period of silence, concluding an utterance.
    INFO: cmn_prior.c(131): cmn_prior_update: from < 49.01 10.38 1.63 1.36 2.62 -2.81 -8.39 1.88 -3.63 -1.06 -3.82 -2.39 -1.53 >
    INFO: cmn_prior.c(149): cmn_prior_update: to < 50.11 9.56 2.45 2.06 4.52 -5.13 -9.10 1.93 -4.76 -0.55 -3.99 -0.50 -2.36 >
    INFO: fsg_search.c(843): 189 frames, 14014 HMMs (74/fr), 21753 senones (115/fr), 1671 history entries (8/fr)

2018-01-03 18:40:27.087052+0200 OpenEarsTest[2672:954627] Pocketsphinx heard "HEY THERE" with a score of (0) and an utterance ID of 2.
    2018-01-03 18:40:27.093560+0200 OpenEarsTest[2672:954570] The received hypothesis is HEY THERE with a score of 0 and an ID of 2
    2018-01-03 18:40:29.268660+0200 OpenEarsTest[2672:954625] Speech detected…
    2018-01-03 18:40:29.269468+0200 OpenEarsTest[2672:954570] Pocketsphinx has detected speech.
    2018-01-03 18:40:31.318108+0200 OpenEarsTest[2672:954625] End of speech detected…
    2018-01-03 18:40:31.319683+0200 OpenEarsTest[2672:954570] Pocketsphinx has detected a period of silence, concluding an utterance.
    INFO: cmn_prior.c(131): cmn_prior_update: from < 50.11 9.56 2.45 2.06 4.52 -5.13 -9.10 1.93 -4.76 -0.55 -3.99 -0.50 -2.36 >
    INFO: cmn_prior.c(149): cmn_prior_update: to < 49.19 10.69 3.28 1.81 3.41 -5.30 -9.70 2.74 -3.79 -1.62 -3.66 -0.23 -1.29 >
    INFO: fsg_search.c(843): 215 frames, 4460 HMMs (20/fr), 9700 senones (45/fr), 587 history entries (2/fr)

2018-01-03 18:40:31.321215+0200 OpenEarsTest[2672:954625] Pocketsphinx heard "HELLO PEOPLE" with a score of (0) and an utterance ID of 3.
    2018-01-03 18:40:31.326913+0200 OpenEarsTest[2672:954570] The received hypothesis is HELLO PEOPLE with a score of 0 and an ID of 3
    2018-01-03 18:40:32.592207+0200 OpenEarsTest[2672:954627] Speech detected…
    2018-01-03 18:40:32.592810+0200 OpenEarsTest[2672:954570] Pocketsphinx has detected speech.
    INFO: cmn_prior.c(99): cmn_prior_update: from < 49.19 10.69 3.28 1.81 3.41 -5.30 -9.70 2.74 -3.79 -1.62 -3.66 -0.23 -1.29 >
    INFO: cmn_prior.c(116): cmn_prior_update: to < 50.16 10.24 3.44 2.45 4.35 -5.52 -9.80 2.17 -4.13 -1.48 -3.39 -0.10 -1.88 >
    2018-01-03 18:40:34.129945+0200 OpenEarsTest[2672:954627] End of speech detected…
    2018-01-03 18:40:34.130669+0200 OpenEarsTest[2672:954570] Pocketsphinx has detected a period of silence, concluding an utterance.
    INFO: cmn_prior.c(131): cmn_prior_update: from < 50.16 10.24 3.44 2.45 4.35 -5.52 -9.80 2.17 -4.13 -1.48 -3.39 -0.10 -1.88 >
    INFO: cmn_prior.c(149): cmn_prior_update: to < 48.14 10.65 3.09 2.29 3.56 -5.10 -9.35 2.25 -3.69 -1.62 -2.82 -0.10 -1.23 >
    INFO: fsg_search.c(843): 154 frames, 7536 HMMs (48/fr), 13619 senones (88/fr), 653 history entries (4/fr)

2018-01-03 18:40:34.133792+0200 OpenEarsTest[2672:954627] Pocketsphinx heard "HEY THERE" with a score of (0) and an utterance ID of 4.
    2018-01-03 18:40:34.138785+0200 OpenEarsTest[2672:954570] The received hypothesis is HEY THERE with a score of 0 and an ID of 4

    #1032191
    Halle Winkler
    Politepix

Thanks for the logging. This is a bit unusual in my experience, so I'm trying to pin down whether there are any contributing factors – pardon my questions. How close is your implementation to the sample app which ships with the distribution? Do you get the same results when just altering the sample app to support this grammar? Is there anything about the environment (or, I guess, even the speaker) which could contribute to the results here?

    #1032204
    olegnaumenko
    Participant

Thank you for the reply.
Everything is standard, as in the example, except that for this log I changed the mode to grammar and supplied several phrases I would like it to recognize (as at the top of the log). Xcode 9.2, iPhone 5s. I am not a native English speaker, but I speak it reasonably well. The point of the experiment is to fire only when the proper phrase is said, in proper English.
