Multiple "Audio route has changed" messages?


  • This topic has 11 replies, 2 voices, and was last updated 9 years ago by rikk.

  • #1025421
    rikk
    Participant

    (OpenEars 2.03, Obj-C/iPhone app)

My app is quite simple (based on the example code in your tutorial):
    – A button to startListening for n seconds.
    – OE is correctly identifying words and reporting.
    – At the end of the time period, stopListening.
    – User can repeat this as many times as they wish.
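In code, the flow is roughly this (just a sketch, not my exact code: only the two OEPocketsphinxController calls are the real framework API from the tutorial; `self.lmPath`, `self.dictPath`, `kListenDuration`, and the method names are placeholders):

```objc
// Sketch of the timed start/stop flow. startListeningWithLanguageModelAtPath:...
// and stopListening are the actual OEPocketsphinxController methods (they also
// appear in the logs below); everything else here is a placeholder.
- (IBAction)startTapped:(id)sender {
    [[OEPocketsphinxController sharedInstance]
        startListeningWithLanguageModelAtPath:self.lmPath
                             dictionaryAtPath:self.dictPath
                          acousticModelAtPath:[OEAcousticModel pathToModel:@"AcousticModelEnglish"]
                          languageModelIsJSGF:FALSE];
    // End the session after n seconds.
    [self performSelector:@selector(endSession)
               withObject:nil
               afterDelay:kListenDuration];
}

- (void)endSession {
    [[OEPocketsphinxController sharedInstance] stopListening];
}
```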

    Problems:
    1. I see many “Audio route has changed” messages in the logs. Seems weird.

    2. ERROR: [AVAudioSession Notify Thread] AVAudioSessionPortImpl.mm:52: ValidateRequiredFields: Unknown selected data source for Port iPhone Microphone (type: MicrophoneBuiltIn)

3. After the first start/stop session is complete and the user taps start again, I get LOTS of warnings that OE is already listening (even though I verify that stopListening executed properly).

The log below shows ONE start/stop session.

    Thanks!
    Rikk

    ———————————
    2015-04-15 12:52:20.562 Dict Shun[7247:2374238] DEBUG> Language model SUCCESSFULLY generated, for keywords: (
    DARN,
    DRAT,
    BUMMER,
    DAMN
    )!
    2015-04-15 12:52:20.563 Dict Shun[7247:2374238] DEBUG> Language model: path name to lang model file: /var/mobile/Containers/Data/Application/F3289E03-65B0-4A5C-84DC-89FEEDF92638/Library/Caches/MyLanguageModelFiles.DMP!
    2015-04-15 12:52:20.564 Dict Shun[7247:2374238] DEBUG> Language model: path name to lang model dict: /var/mobile/Containers/Data/Application/F3289E03-65B0-4A5C-84DC-89FEEDF92638/Library/Caches/MyLanguageModelFiles.dic!
    2015-04-15 12:52:20.665 Dict Shun[7247:2374238] Starting OpenEars logging for OpenEars version 2.03 on 64-bit device (or build): iPhone running iOS version: 8.100000
    2015-04-15 12:52:37.133 Dict Shun[7247:2374238] DEBUG> User has tapped start button ==> START LISTENING!
    2015-04-15 12:52:42.259 Dict Shun[7247:2374238] Attempting to start listening session from startListeningWithLanguageModelAtPath:
    2015-04-15 12:52:42.259 Dict Shun[7247:2374238] User gave mic permission for this app.
    2015-04-15 12:52:42.260 Dict Shun[7247:2374238] setSecondsOfSilence wasn’t set, using default of 0.700000.
    2015-04-15 12:52:42.260 Dict Shun[7247:2374238] Successfully started listening session from startListeningWithLanguageModelAtPath:
    2015-04-15 12:52:42.260 Dict Shun[7247:2374410] Starting listening.
    2015-04-15 12:52:42.260 Dict Shun[7247:2374410] about to set up audio session
    2015-04-15 12:52:42.467 Dict Shun[7247:2374301] Audio route has changed for the following reason:
    2015-04-15 12:52:42.472 Dict Shun[7247:2374301] There was a category change. The new category is AVAudioSessionCategoryPlayAndRecord
    2015-04-15 12:52:42.477 Dict Shun[7247:2374301] This is not a case in which OpenEars notifies of a route change. At the close of this function, the new audio route is —SpeakerMicrophoneBuiltIn—. The previous route before changing to this route was <AVAudioSessionRouteDescription: 0x174203ab0,
    inputs = (null);
    outputs = (
    “<AVAudioSessionPortDescription: 0x174203a80, type = Speaker; name = Speaker; UID = Speaker; selectedDataSource = (null)>”
    )>.
    2015-04-15 12:52:42.706 Dict Shun[7247:2374410] done starting audio unit
    INFO: cmd_ln.c(702): Parsing command line:
    \
    -lm /var/mobile/Containers/Data/Application/F3289E03-65B0-4A5C-84DC-89FEEDF92638/Library/Caches/MyLanguageModelFiles.DMP \
    -vad_prespeech 10 \
    -vad_postspeech 69 \
    -vad_threshold 2.000000 \
    -remove_noise yes \
    -remove_silence yes \
    -bestpath yes \
    -lw 6.500000 \
    -dict /var/mobile/Containers/Data/Application/F3289E03-65B0-4A5C-84DC-89FEEDF92638/Library/Caches/MyLanguageModelFiles.dic \
    -hmm /private/var/mobile/Containers/Bundle/Application/EDCA027A-F296-4F2A-97E2-EE3EA8332768/Dict Shun.app/AcousticModelEnglish.bundle

    Current configuration:
    [NAME] [DEFLT] [VALUE]
    -agc none none
    -agcthresh 2.0 2.000000e+00
    -allphone
    -allphone_ci no no
    -alpha 0.97 9.700000e-01
    -argfile
    -ascale 20.0 2.000000e+01
    -aw 1 1
    -backtrace no no
    -beam 1e-48 1.000000e-48
    -bestpath yes yes
    -bestpathlw 9.5 9.500000e+00
    -bghist no no
    -ceplen 13 13
    -cmn current current
    -cmninit 8.0 8.0
    -compallsen no no
    -debug 0
    -dict /var/mobile/Containers/Data/Application/F3289E03-65B0-4A5C-84DC-89FEEDF92638/Library/Caches/MyLanguageModelFiles.dic
    -dictcase no no
    -dither no no
    -doublebw no no
    -ds 1 1
    -fdict
    -feat 1s_c_d_dd 1s_c_d_dd
    -featparams
    -fillprob 1e-8 1.000000e-08
    -frate 100 100
    -fsg
    -fsgusealtpron yes yes
    -fsgusefiller yes yes
    -fwdflat yes yes
    -fwdflatbeam 1e-64 1.000000e-64
    -fwdflatefwid 4 4
    -fwdflatlw 8.5 8.500000e+00
    -fwdflatsfwin 25 25
    -fwdflatwbeam 7e-29 7.000000e-29
    -fwdtree yes yes
    -hmm /private/var/mobile/Containers/Bundle/Application/EDCA027A-F296-4F2A-97E2-EE3EA8332768/Dict Shun.app/AcousticModelEnglish.bundle
    -input_endian little little
    -jsgf
    -kdmaxbbi -1 -1
    -kdmaxdepth 0 0
    -kdtree
    -keyphrase
    -kws
    -kws_plp 1e-1 1.000000e-01
    -kws_threshold 1 1.000000e+00
    -latsize 5000 5000
    -lda
    -ldadim 0 0
    -lextreedump 0 0
    -lifter 0 0
    -lm /var/mobile/Containers/Data/Application/F3289E03-65B0-4A5C-84DC-89FEEDF92638/Library/Caches/MyLanguageModelFiles.DMP
    -lmctl
    -lmname
    -logbase 1.0001 1.000100e+00
    -logfn
    -logspec no no
    -lowerf 133.33334 1.333333e+02
    -lpbeam 1e-40 1.000000e-40
    -lponlybeam 7e-29 7.000000e-29
    -lw 6.5 6.500000e+00
    -maxhmmpf 10000 10000
    -maxnewoov 20 20
    -maxwpf -1 -1
    -mdef
    -mean
    -mfclogdir
    -min_endfr 0 0
    -mixw
    -mixwfloor 0.0000001 1.000000e-07
    -mllr
    -mmap yes yes
    -ncep 13 13
    -nfft 512 512
    -nfilt 40 40
    -nwpen 1.0 1.000000e+00
    -pbeam 1e-48 1.000000e-48
    -pip 1.0 1.000000e+00
    -pl_beam 1e-10 1.000000e-10
    -pl_pbeam 1e-5 1.000000e-05
    -pl_window 0 0
    -rawlogdir
    -remove_dc no no
    -remove_noise yes yes
    -remove_silence yes yes
    -round_filters yes yes
    -samprate 16000 1.600000e+04
    -seed -1 -1
    -sendump
    -senlogdir
    -senmgau
    -silprob 0.005 5.000000e-03
    -smoothspec no no
    -svspec
    -tmat
    -tmatfloor 0.0001 1.000000e-04
    -topn 4 4
    -topn_beam 0 0
    -toprule
    -transform legacy legacy
    -unit_area yes yes
    -upperf 6855.4976 6.855498e+03
    -usewdphones no no
    -uw 1.0 1.000000e+00
    -vad_postspeech 50 69
    -vad_prespeech 10 10
    -vad_threshold 2.0 2.000000e+00
    -var
    -varfloor 0.0001 1.000000e-04
    -varnorm no no
    -verbose no no
    -warp_params
    -warp_type inverse_linear inverse_linear
    -wbeam 7e-29 7.000000e-29
    -wip 0.65 6.500000e-01
    -wlen 0.025625 2.562500e-02

    INFO: cmd_ln.c(702): Parsing command line:
    \
    -nfilt 25 \
    -lowerf 130 \
    -upperf 6800 \
    -feat 1s_c_d_dd \
    -svspec 0-12/13-25/26-38 \
    -agc none \
    -cmn current \
    -varnorm no \
    -transform dct \
    -lifter 22 \
    -cmninit 40

    Current configuration:
    [NAME] [DEFLT] [VALUE]
    -agc none none
    -agcthresh 2.0 2.000000e+00
    -alpha 0.97 9.700000e-01
    -ceplen 13 13
    -cmn current current
    -cmninit 8.0 40
    -dither no no
    -doublebw no no
    -feat 1s_c_d_dd 1s_c_d_dd
    -frate 100 100
    -input_endian little little
    -lda
    -ldadim 0 0
    -lifter 0 22
    -logspec no no
    -lowerf 133.33334 1.300000e+02
    -ncep 13 13
    -nfft 512 512
    -nfilt 40 25
    -remove_dc no no
    -remove_noise yes yes
    -remove_silence yes yes
    -round_filters yes yes
    -samprate 16000 1.600000e+04
    -seed -1 -1
    -smoothspec no no
    -svspec 0-12/13-25/26-38
    -transform legacy dct
    -unit_area yes yes
    -upperf 6855.4976 6.800000e+03
    -vad_postspeech 50 69
    -vad_prespeech 10 10
    -vad_threshold 2.0 2.000000e+00
    -varnorm no no
    -verbose no no
    -warp_params
    -warp_type inverse_linear inverse_linear
    -wlen 0.025625 2.562500e-02

    INFO: acmod.c(252): Parsed model-specific feature parameters from /private/var/mobile/Containers/Bundle/Application/EDCA027A-F296-4F2A-97E2-EE3EA8332768/Dict Shun.app/AcousticModelEnglish.bundle/feat.params
    INFO: feat.c(715): Initializing feature stream to type: ‘1s_c_d_dd’, ceplen=13, CMN=’current’, VARNORM=’no’, AGC=’none’
    INFO: cmn.c(143): mean[0]= 12.00, mean[1..12]= 0.0
    INFO: acmod.c(171): Using subvector specification 0-12/13-25/26-38
    INFO: mdef.c(518): Reading model definition: /private/var/mobile/Containers/Bundle/Application/EDCA027A-F296-4F2A-97E2-EE3EA8332768/Dict Shun.app/AcousticModelEnglish.bundle/mdef
    INFO: mdef.c(531): Found byte-order mark BMDF, assuming this is a binary mdef file
    INFO: bin_mdef.c(336): Reading binary model definition: /private/var/mobile/Containers/Bundle/Application/EDCA027A-F296-4F2A-97E2-EE3EA8332768/Dict Shun.app/AcousticModelEnglish.bundle/mdef
    INFO: bin_mdef.c(516): 46 CI-phone, 168344 CD-phone, 3 emitstate/phone, 138 CI-sen, 6138 Sen, 32881 Sen-Seq
    INFO: tmat.c(206): Reading HMM transition probability matrices: /private/var/mobile/Containers/Bundle/Application/EDCA027A-F296-4F2A-97E2-EE3EA8332768/Dict Shun.app/AcousticModelEnglish.bundle/transition_matrices
    INFO: acmod.c(124): Attempting to use SCHMM computation module
    INFO: ms_gauden.c(198): Reading mixture gaussian parameter: /private/var/mobile/Containers/Bundle/Application/EDCA027A-F296-4F2A-97E2-EE3EA8332768/Dict Shun.app/AcousticModelEnglish.bundle/means
    INFO: ms_gauden.c(292): 1 codebook, 3 feature, size:
    INFO: ms_gauden.c(294): 512×13
    INFO: ms_gauden.c(294): 512×13
    INFO: ms_gauden.c(294): 512×13
    INFO: ms_gauden.c(198): Reading mixture gaussian parameter: /private/var/mobile/Containers/Bundle/Application/EDCA027A-F296-4F2A-97E2-EE3EA8332768/Dict Shun.app/AcousticModelEnglish.bundle/variances
    INFO: ms_gauden.c(292): 1 codebook, 3 feature, size:
    INFO: ms_gauden.c(294): 512×13
    INFO: ms_gauden.c(294): 512×13
    INFO: ms_gauden.c(294): 512×13
    INFO: ms_gauden.c(354): 0 variance values floored
    INFO: s2_semi_mgau.c(904): Loading senones from dump file /private/var/mobile/Containers/Bundle/Application/EDCA027A-F296-4F2A-97E2-EE3EA8332768/Dict Shun.app/AcousticModelEnglish.bundle/sendump
    INFO: s2_semi_mgau.c(928): BEGIN FILE FORMAT DESCRIPTION
    INFO: s2_semi_mgau.c(991): Rows: 512, Columns: 6138
    INFO: s2_semi_mgau.c(1023): Using memory-mapped I/O for senones
    INFO: s2_semi_mgau.c(1294): Maximum top-N: 4 Top-N beams: 0 0 0
    INFO: dict.c(320): Allocating 4109 * 32 bytes (128 KiB) for word entries
    INFO: dict.c(333): Reading main dictionary: /var/mobile/Containers/Data/Application/F3289E03-65B0-4A5C-84DC-89FEEDF92638/Library/Caches/MyLanguageModelFiles.dic
    INFO: dict.c(213): Allocated 0 KiB for strings, 0 KiB for phones
    INFO: dict.c(336): 4 words read
    INFO: dict.c(342): Reading filler dictionary: /private/var/mobile/Containers/Bundle/Application/EDCA027A-F296-4F2A-97E2-EE3EA8332768/Dict Shun.app/AcousticModelEnglish.bundle/noisedict
    INFO: dict.c(213): Allocated 0 KiB for strings, 0 KiB for phones
    INFO: dict.c(345): 9 words read
    INFO: dict2pid.c(396): Building PID tables for dictionary
    INFO: dict2pid.c(406): Allocating 46^3 * 2 bytes (190 KiB) for word-initial triphones
    INFO: dict2pid.c(132): Allocated 51152 bytes (49 KiB) for word-final triphones
    INFO: dict2pid.c(196): Allocated 51152 bytes (49 KiB) for single-phone word triphones
    INFO: ngram_model_arpa.c(79): No \data\ mark in LM file
    INFO: ngram_model_dmp.c(166): Will use memory-mapped I/O for LM file
    INFO: ngram_model_dmp.c(220): ngrams 1=6, 2=8, 3=4
    INFO: ngram_model_dmp.c(266): 6 = LM.unigrams(+trailer) read
    INFO: ngram_model_dmp.c(312): 8 = LM.bigrams(+trailer) read
    INFO: ngram_model_dmp.c(338): 4 = LM.trigrams read
    INFO: ngram_model_dmp.c(363): 3 = LM.prob2 entries read
    INFO: ngram_model_dmp.c(383): 3 = LM.bo_wt2 entries read
    INFO: ngram_model_dmp.c(403): 2 = LM.prob3 entries read
    INFO: ngram_model_dmp.c(431): 1 = LM.tseg_base entries read
    INFO: ngram_model_dmp.c(487): 6 = ascii word strings read
    INFO: ngram_search_fwdtree.c(99): 4 unique initial diphones
    INFO: ngram_search_fwdtree.c(148): 0 root, 0 non-root channels, 10 single-phone words
    INFO: ngram_search_fwdtree.c(186): Creating search tree
    INFO: ngram_search_fwdtree.c(192): before: 0 root, 0 non-root channels, 10 single-phone words
    INFO: ngram_search_fwdtree.c(326): after: max nonroot chan increased to 135
    INFO: ngram_search_fwdtree.c(339): after: 4 root, 7 non-root channels, 9 single-phone words
    INFO: ngram_search_fwdflat.c(157): fwdflat: min_ef_width = 4, max_sf_win = 25
    2015-04-15 12:52:42.783 Dict Shun[7247:2374410] Restoring SmartCMN value of 38.984131
    2015-04-15 12:52:42.783 Dict Shun[7247:2374410] Listening.
    2015-04-15 12:52:42.784 Dict Shun[7247:2374410] Project has these words or phrases in its dictionary:
    BUMMER
    DAMN
    DARN
    DRAT
    2015-04-15 12:52:42.784 Dict Shun[7247:2374410] Recognition loop has started
    2015-04-15 12:52:42.785 Dict Shun[7247:2374238] Pocketsphinx is now listening.
    2015-04-15 12:52:42.886 Dict Shun[7247:2374238] A request has been made to start a listening session using startListeningWithLanguageModelAtPath:dictionaryAtPath:acousticModelAtPath:languageModelIsJSGF:, however, there is already a listening session in progress which has not been stopped. Please stop this listening session first with [[OEPocketsphinxController sharedInstance] stopListening]; and wait to receive the OEEventsObserver callback pocketsphinxDidStopListening before starting a new session. You can still change models in the existing session by using OEPocketsphinxController’s method changeLanguageModelToFile:withDictionary:
    2015-04-15 12:52:42.987 Dict Shun[7247:2374238] A request has been made to start a listening session using startListeningWithLanguageModelAtPath:dictionaryAtPath:acousticModelAtPath:languageModelIsJSGF:, however, there is already a listening session in progress which has not been stopped. Please stop this listening session first with [[OEPocketsphinxController sharedInstance] stopListening]; and wait to receive the OEEventsObserver callback pocketsphinxDidStopListening before starting a new session. You can still change models in the existing session by using OEPocketsphinxController’s method changeLanguageModelToFile:withDictionary:
    2015-04-15 12:52:43.089 Dict Shun[7247:2374238] A request has been made to start a listening session using startListeningWithLanguageModelAtPath:dictionaryAtPath:acousticModelAtPath:languageModelIsJSGF:, however, there is already a listening session in progress which has not been stopped. Please stop this listening session first with [[OEPocketsphinxController sharedInstance] stopListening]; and wait to receive the OEEventsObserver callback pocketsphinxDidStopListening before starting a new session. You can still change models in the existing session by using OEPocketsphinxController’s method changeLanguageModelToFile:withDictionary:
    2015-04-15 12:52:43.912 Dict Shun[7247:2374353] Speech detected…
    2015-04-15 12:52:43.912 Dict Shun[7247:2374238] Pocketsphinx has detected speech.
    2015-04-15 12:52:44.886 Dict Shun[7247:2374353] End of speech detected…
    2015-04-15 12:52:44.887 Dict Shun[7247:2374238] Pocketsphinx has detected a period of silence, concluding an utterance.
    INFO: cmn_prior.c(131): cmn_prior_update: from < 38.98 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 >
    INFO: cmn_prior.c(149): cmn_prior_update: to < 44.36 16.43 -17.45 -4.13 -1.19 6.09 -1.74 3.96 -0.70 -0.51 0.38 3.79 -5.49 >
    INFO: ngram_search_fwdtree.c(1550): 735 words recognized (7/fr)
    INFO: ngram_search_fwdtree.c(1552): 8165 senones evaluated (79/fr)
    INFO: ngram_search_fwdtree.c(1556): 3320 channels searched (31/fr), 400 1st, 2328 last
    INFO: ngram_search_fwdtree.c(1559): 881 words for which last channels evaluated (8/fr)
    INFO: ngram_search_fwdtree.c(1561): 130 candidate words for entering last phone (1/fr)
    INFO: ngram_search_fwdtree.c(1564): fwdtree 0.07 CPU 0.072 xRT
    INFO: ngram_search_fwdtree.c(1567): fwdtree 1.80 wall 1.729 xRT
    INFO: ngram_search_fwdflat.c(302): Utterance vocabulary contains 2 words
    INFO: ngram_search_fwdflat.c(938): 611 words recognized (6/fr)
    INFO: ngram_search_fwdflat.c(940): 6232 senones evaluated (60/fr)
    INFO: ngram_search_fwdflat.c(942): 2397 channels searched (23/fr)
    INFO: ngram_search_fwdflat.c(944): 814 words searched (7/fr)
    INFO: ngram_search_fwdflat.c(947): 55 word transitions (0/fr)
    INFO: ngram_search_fwdflat.c(950): fwdflat 0.02 CPU 0.023 xRT
    INFO: ngram_search_fwdflat.c(953): fwdflat 0.03 wall 0.026 xRT
    INFO: ngram_search.c(1215): </s> not found in last frame, using [SMACK].102 instead
    INFO: ngram_search.c(1268): lattice start node <s>.0 end node [SMACK].60
    INFO: ngram_search.c(1294): Eliminated 132 nodes before end node
    INFO: ngram_search.c(1399): Lattice has 269 nodes, 524 links
    INFO: ps_lattice.c(1368): Normalizer P(O) = alpha([SMACK]:60:102) = -686508
    INFO: ps_lattice.c(1403): Joint P(O,S) = -686508 P(S|O) = 0
    INFO: ngram_search.c(890): bestpath 0.00 CPU 0.001 xRT
    INFO: ngram_search.c(893): bestpath 0.00 wall 0.002 xRT
    2015-04-15 12:52:44.917 Dict Shun[7247:2374353] Pocketsphinx heard “DARN” with a score of (0) and an utterance ID of 0.
    2015-04-15 12:52:44.918 Dict Shun[7247:2374238] The received hypothesis is DARN with a score of 0 and an ID of 0
    2015-04-15 12:52:45.796 Dict Shun[7247:2374410] Speech detected…
    2015-04-15 12:52:45.797 Dict Shun[7247:2374238] Pocketsphinx has detected speech.
    2015-04-15 12:52:46.803 Dict Shun[7247:2374410] End of speech detected…
    INFO: cmn_prior.c(131): cmn_prior_update: from < 44.36 16.43 -17.45 -4.13 -1.19 6.09 -1.74 3.96 -0.70 -0.51 0.38 3.79 -5.49 >
    INFO: cmn_prior.c(149): cmn_prior_update: to < 35.82 12.75 -8.41 0.75 2.41 2.39 -1.16 5.49 -2.92 -7.00 -3.80 0.23 -9.61 >
    INFO: ngram_search_fwdtree.c(1550): 912 words recognized (9/fr)
    2015-04-15 12:52:46.804 Dict Shun[7247:2374238] Pocketsphinx has detected a period of silence, concluding an utterance.
    INFO: ngram_search_fwdtree.c(1552): 6287 senones evaluated (59/fr)
    INFO: ngram_search_fwdtree.c(1556): 2033 channels searched (19/fr), 412 1st, 930 last
    INFO: ngram_search_fwdtree.c(1559): 930 words for which last channels evaluated (8/fr)
    INFO: ngram_search_fwdtree.c(1561): 128 candidate words for entering last phone (1/fr)
    INFO: ngram_search_fwdtree.c(1564): fwdtree 0.11 CPU 0.101 xRT
    INFO: ngram_search_fwdtree.c(1567): fwdtree 1.89 wall 1.763 xRT
    INFO: ngram_search_fwdflat.c(302): Utterance vocabulary contains 2 words
    INFO: ngram_search_fwdflat.c(938): 719 words recognized (7/fr)
    INFO: ngram_search_fwdflat.c(940): 2190 senones evaluated (20/fr)
    INFO: ngram_search_fwdflat.c(942): 755 channels searched (7/fr)
    INFO: ngram_search_fwdflat.c(944): 755 words searched (7/fr)
    INFO: ngram_search_fwdflat.c(947): 76 word transitions (0/fr)
    INFO: ngram_search_fwdflat.c(950): fwdflat 0.02 CPU 0.020 xRT
    INFO: ngram_search_fwdflat.c(953): fwdflat 0.02 wall 0.021 xRT
    INFO: ngram_search.c(1215): </s> not found in last frame, using <sil>.105 instead
    INFO: ngram_search.c(1268): lattice start node <s>.0 end node <sil>.2
    INFO: ngram_search.c(1294): Eliminated 298 nodes before end node
    INFO: ngram_search.c(1399): Lattice has 300 nodes, 1 links
    INFO: ps_lattice.c(1368): Normalizer P(O) = alpha(<sil>:2:105) = -5320591
    INFO: ps_lattice.c(1403): Joint P(O,S) = -5320591 P(S|O) = 0
    INFO: ngram_search.c(890): bestpath 0.00 CPU 0.002 xRT
    INFO: ngram_search.c(893): bestpath 0.00 wall 0.001 xRT
    2015-04-15 12:52:46.829 Dict Shun[7247:2374410] Pocketsphinx heard “” with a score of (0) and an utterance ID of 1.
    2015-04-15 12:52:46.830 Dict Shun[7247:2374410] Hypothesis was null so we aren’t returning it. If you want null hypotheses to also be returned, set OEPocketsphinxController’s property returnNullHypotheses to TRUE before starting OEPocketsphinxController.
    2015-04-15 12:52:48.091 Dict Shun[7247:2374410] Speech detected…
    2015-04-15 12:52:48.092 Dict Shun[7247:2374238] Pocketsphinx has detected speech.
    2015-04-15 12:52:49.492 Dict Shun[7247:2374410] End of speech detected…
    INFO: cmn_prior.c(131): cmn_prior_update: from < 35.82 12.75 -8.41 0.75 2.41 2.39 -1.16 5.49 -2.92 -7.00 -3.80 0.23 -9.61 >
    INFO: cmn_prior.c(149): cmn_prior_update: to < 37.00 8.96 -8.76 5.71 0.96 0.59 -0.07 5.11 -2.17 -5.44 -4.62 0.82 -10.20 >
    INFO: ngram_search_fwdtree.c(1550): 1055 words recognized (7/fr)
    INFO: ngram_search_fwdtree.c(1552): 10067 senones evaluated (69/fr)
    2015-04-15 12:52:49.493 Dict Shun[7247:2374238] Pocketsphinx has detected a period of silence, concluding an utterance.
    INFO: ngram_search_fwdtree.c(1556): 3670 channels searched (25/fr), 564 1st, 2225 last
    INFO: ngram_search_fwdtree.c(1559): 1185 words for which last channels evaluated (8/fr)
    INFO: ngram_search_fwdtree.c(1561): 97 candidate words for entering last phone (0/fr)
    INFO: ngram_search_fwdtree.c(1564): fwdtree 0.12 CPU 0.086 xRT
    INFO: ngram_search_fwdtree.c(1567): fwdtree 2.66 wall 1.836 xRT
    INFO: ngram_search_fwdflat.c(302): Utterance vocabulary contains 3 words
    INFO: ngram_search_fwdflat.c(938): 953 words recognized (7/fr)
    INFO: ngram_search_fwdflat.c(940): 6136 senones evaluated (42/fr)
    INFO: ngram_search_fwdflat.c(942): 2605 channels searched (17/fr)
    INFO: ngram_search_fwdflat.c(944): 1196 words searched (8/fr)
    INFO: ngram_search_fwdflat.c(947): 105 word transitions (0/fr)
    INFO: ngram_search_fwdflat.c(950): fwdflat 0.03 CPU 0.023 xRT
    INFO: ngram_search_fwdflat.c(953): fwdflat 0.04 wall 0.025 xRT
    INFO: ngram_search.c(1215): </s> not found in last frame, using <sil>.143 instead
    INFO: ngram_search.c(1268): lattice start node <s>.0 end node <sil>.79
    INFO: ngram_search.c(1294): Eliminated 248 nodes before end node
    INFO: ngram_search.c(1399): Lattice has 430 nodes, 873 links
    INFO: ps_lattice.c(1368): Normalizer P(O) = alpha(<sil>:79:143) = -928990
    INFO: ps_lattice.c(1403): Joint P(O,S) = -928990 P(S|O) = 0
    INFO: ngram_search.c(890): bestpath 0.00 CPU 0.003 xRT
    INFO: ngram_search.c(893): bestpath 0.01 wall 0.004 xRT
    2015-04-15 12:52:49.536 Dict Shun[7247:2374410] Pocketsphinx heard “DAMN” with a score of (0) and an utterance ID of 2.
    2015-04-15 12:52:49.537 Dict Shun[7247:2374238] The received hypothesis is DAMN with a score of 0 and an ID of 2
    2015-04-15 12:52:52.574 Dict Shun[7247:2374353] Speech detected…
    2015-04-15 12:52:52.575 Dict Shun[7247:2374238] Pocketsphinx has detected speech.
    INFO: cmn_prior.c(99): cmn_prior_update: from < 37.00 8.96 -8.76 5.71 0.96 0.59 -0.07 5.11 -2.17 -5.44 -4.62 0.82 -10.20 >
    INFO: cmn_prior.c(116): cmn_prior_update: to < 38.40 11.61 -11.22 -1.09 -4.36 2.96 0.25 0.91 -1.84 -3.39 -2.74 0.75 -10.08 >
    2015-04-15 12:52:57.157 Dict Shun[7247:2374238] DEBUG> Time over ==> STOP LISTENING!
    2015-04-15 12:52:57.158 Dict Shun[7247:2374238] Stopping listening.
    INFO: ngram_search_fwdtree.c(432): TOTAL fwdtree 0.31 CPU 0.087 xRT
    INFO: ngram_search_fwdtree.c(435): TOTAL fwdtree 6.35 wall 1.798 xRT
    INFO: ngram_search_fwdflat.c(176): TOTAL fwdflat 0.08 CPU 0.022 xRT
    INFO: ngram_search_fwdflat.c(179): TOTAL fwdflat 0.09 wall 0.024 xRT
    INFO: ngram_search.c(307): TOTAL bestpath 0.01 CPU 0.002 xRT
    INFO: ngram_search.c(310): TOTAL bestpath 0.01 wall 0.002 xRT
    2015-04-15 12:52:57.708 Dict Shun[7247:2374238] No longer listening.
    2015-04-15 12:52:57.709 Dict Shun[7247:2374238] DEBUG> Pocketxphinx stopListening = SUCCESSFUL!
    2015-04-15 12:52:57.709 Dict Shun[7247:2374238] DEBUG> Pocketsphinx has stopped listening.
    2015-04-15 12:52:57.724 Dict Shun[7247:2374301] Audio route has changed for the following reason:
    2015-04-15 12:52:57.725 Dict Shun[7247:2374301] There was a category change. The new category is AVAudioSessionCategoryPlayAndRecord
    2015-04-15 12:52:57.728 Dict Shun[7247:2374301] This is not a case in which OpenEars notifies of a route change. At the close of this function, the new audio route is —Speaker—. The previous route before changing to this route was <AVAudioSessionRouteDescription: 0x174203b30,
    inputs = (
    “<AVAudioSessionPortDescription: 0x174202a90, type = MicrophoneBuiltIn; name = iPhone Microphone; UID = Built-In Microphone; selectedDataSource = Bottom>”
    );
    outputs = (
    “<AVAudioSessionPortDescription: 0x1742034b0, type = Speaker; name = Speaker; UID = Speaker; selectedDataSource = (null)>”
    )>.
    2015-04-15 13:00:37.704 Dict Shun[7247:2374301] 13:00:37.703 ERROR: [AVAudioSession Notify Thread] AVAudioSessionPortImpl.mm:52: ValidateRequiredFields: Unknown selected data source for Port iPhone Microphone (type: MicrophoneBuiltIn)
    2015-04-15 13:00:37.705 Dict Shun[7247:2374301] Audio route has changed for the following reason:
    2015-04-15 13:00:37.708 Dict Shun[7247:2374301] There was a category change. The new category is AVAudioSessionCategoryPlayAndRecord
    2015-04-15 13:00:37.712 Dict Shun[7247:2374301] This is not a case in which OpenEars notifies of a route change. At the close of this function, the new audio route is —ReceiverMicrophoneBuiltIn—. The previous route before changing to this route was <AVAudioSessionRouteDescription: 0x1700177a0,
    inputs = (null);
    outputs = (
    “<AVAudioSessionPortDescription: 0x170017b30, type = Speaker; name = Speaker; UID = Speaker; selectedDataSource = (null)>”
    )>.
    2015-04-15 13:08:40.085 Dict Shun[7247:2374301] Audio route has changed for the following reason:
    2015-04-15 13:08:40.088 Dict Shun[7247:2374301] There was a category change. The new category is AVAudioSessionCategoryPlayAndRecord
    2015-04-15 13:08:40.091 Dict Shun[7247:2374301] This is not a case in which OpenEars notifies of a route change. At the close of this function, the new audio route is —Speaker—. The previous route before changing to this route was <AVAudioSessionRouteDescription: 0x174203a60,
    inputs = (
    “<AVAudioSessionPortDescription: 0x1742036d0, type = MicrophoneBuiltIn; name = iPhone Microphone; UID = Built-In Microphone; selectedDataSource = (null)>”
    );
    outputs = (
    “<AVAudioSessionPortDescription: 0x1742035d0, type = Receiver; name = Receiver; UID = Built-In Receiver; selectedDataSource = (null)>”
    )>.

    #1025422
    rikk
    Participant

    btw: I noticed that even in this single session I am seeing multiple:

“A request has been made to start a listening session using startListeningWithLanguageModelAtPath:…, however, there is already a listening session in progress which has not been stopped. Please stop this listening session first…”

    … excerpt from orig log posted above…

    2015-04-15 12:52:42.784 Dict Shun[7247:2374410] Recognition loop has started
    2015-04-15 12:52:42.785 Dict Shun[7247:2374238] Pocketsphinx is now listening.
    2015-04-15 12:52:42.886 Dict Shun[7247:2374238] A request has been made to start a listening session using startListeningWithLanguageModelAtPath:dictionaryAtPath:acousticModelAtPath:languageModelIsJSGF:, however, there is already a listening session in progress which has not been stopped. Please stop this listening session first with [[OEPocketsphinxController sharedInstance] stopListening]; and wait to receive the OEEventsObserver callback pocketsphinxDidStopListening before starting a new session. You can still change models in the existing session by using OEPocketsphinxController’s method changeLanguageModelToFile:withDictionary:
    2015-04-15 12:52:42.987 Dict Shun[7247:2374238] A request has been made to start a listening session using startListeningWithLanguageModelAtPath:dictionaryAtPath:acousticModelAtPath:languageModelIsJSGF:, however, there is already a listening session in progress which has not been stopped. Please stop this listening session first with [[OEPocketsphinxController sharedInstance] stopListening]; and wait to receive the OEEventsObserver callback pocketsphinxDidStopListening before starting a new session. You can still change models in the existing session by using OEPocketsphinxController’s method changeLanguageModelToFile:withDictionary:
    2015-04-15 12:52:43.089 Dict Shun[7247:2374238] A request has been made to start a listening session using startListeningWithLanguageModelAtPath:dictionaryAtPath:acousticModelAtPath:languageModelIsJSGF:, however, there is already a listening session in progress which has not been stopped. Please stop this listening session first with [[OEPocketsphinxController sharedInstance] stopListening]; and wait to receive the OEEventsObserver callback pocketsphinxDidStopListening before starting a new session. You can still change models in the existing session by using OEPocketsphinxController’s method changeLanguageModelToFile:withDictionary:
    2015-04-15 12:52:43.912 Dict Shun[7247:2374353] Speech detected…
    2015-04-15 12:52:43.912 Dict Shun[7247:2374238] Pocketsphinx has detected speech.
    2015-04-15 12:52:44.886 Dict Shun[7247:2374353] End of speech detected…
    2015-04-15 12:52:44.887 Dict Shun[7247:2374238] Pocketsphinx has detected a period of silence, concluding an utterance.

    #1025423
    Halle Winkler
    Politepix

    Hi,

    The route thing isn’t significant unless it is leading to peculiar outcomes with routing. It is normal to see a few route messages in a session. This design is unfortunately not supported:

    – A button to startListening for n seconds.
    [….]
    – At the end of the time period, stopListening.

    So I would assume that the warnings about stopping are related to this.
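For the repeated “already listening” warnings specifically, one possible guard is to gate the start call on the stop callback, roughly like this (a sketch only, assuming an OEEventsObserver delegate is already set up as in the tutorial; the `listening` flag and `startTapped:` name are placeholders, while pocketsphinxDidStopListening is the callback named in the warning itself):

```objc
// Sketch: ignore start requests until the previous session has fully
// stopped, as signaled by the OEEventsObserver callback.
@property (nonatomic) BOOL listening; // placeholder state flag

- (IBAction)startTapped:(id)sender {
    if (self.listening) return; // a session is still in progress; ignore the tap
    self.listening = YES;
    [[OEPocketsphinxController sharedInstance]
        startListeningWithLanguageModelAtPath:self.lmPath
                             dictionaryAtPath:self.dictPath
                          acousticModelAtPath:[OEAcousticModel pathToModel:@"AcousticModelEnglish"]
                          languageModelIsJSGF:FALSE];
}

- (void)pocketsphinxDidStopListening {
    self.listening = NO; // safe to start a new session now
}
```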

    #1025424
    rikk
    Participant

    Thanks for quick reply! I figured that the route changes were red herrings.

    Can you please elaborate on “This design is unfortunately not supported.”?

    My app simply starts and stops listening for periods of time.

    #1025425
    Halle Winkler
    Politepix

    Listening for arbitrary periods of time is unfortunately contrary to the utterance-based design of the framework. There is no testing done on using it in that way, so if there are any particular implications to setting up an app that way, they aren’t something I can help with.

    #1025426
    rikk
    Participant

    I apologize for my lack of knowledge in speech recognition science, but I feel like I’m missing something very important. Specifically, what does “utterance-based design” mean?

    Note that my time periods are on the order of 2-30 minutes (not seconds).

    #1025429
    rikk
    Participant

    Ok, you don’t need to explain what “utterance-based design” means. ;-)

    My real question is: Why is my use case “contrary to OE’s design”?

Starting, listening for a fixed period of time (minutes), and stopping seems like the simplest and most normal case imaginable.

    Am I missing something?

    Thanks again for your great support!

    #1025430
    rikk
    Participant

    fyi: I solved my issues with start/stop and the app appears to work as expected. :-D

    I’m still worried about your comment that my use case is not a good fit for OpenEars (see previous message). ;-)

    #1025435
    Halle Winkler
    Politepix

    Hi rikk,

    Very glad to hear it was just an issue with stopping. I misunderstood your application to be something like this, where a timed stop is being used to basically interrupt the user mid-utterance and force recognition (an utterance is a continuous period of user speech):

    https://www.politepix.com/forums/topic/stop-speech-recognition-in-desired-time-ex-2-3-sec/

    This design (the push-to-talk design mentioned in the linked thread) is trying to sort of fake OpenEars into not being a continuous listener, which would be better solved by using a much more basic Pocketsphinx implementation rather than trying to get OpenEars to be something it’s not with extra code that adds complexity.

    What you are talking about sounds a bit different – you are using the continuous listening capabilities, and the user can use the session to speak complete utterances, but you have some kind of arbitrary end to the overall listening session period. I’m going to assume that there is a strong rationale for that design in your app requirements, which is why you don’t set the period of listening purely based on user input. I wouldn’t expect your setup to be a problem, since your stop is essentially similar to the phone giving an interruption.

    #1025442
    rikk
    Participant

    Halle,

    Sincere thanks for the detailed followup.

    My app is trying to be a “bad word detector/trainer.” The idea is that the user chooses a session time (e.g. 2 mins) to practice speaking. As they are talking, my app will detect each time they say a “bad word”, and it will display visual feedback (e.g. unhappy face) and a score (i.e. “You said DAMN 3 times so far”). The app continues detecting and reporting as they talk throughout the session. If they say two “bad words” in succession, I’d like to get two responses from OE (e.g. “DAMN”, “DARN”).

    My current app kinda works, but often misses “bad words” or combines multiple bad words (said in sequence) into a single response (e.g. “DARN BUMMER”, instead of “DARN” and “BUMMER”).

    I’m wondering if I would benefit from your suggestion of recording .WAV (even though I don’t understand why that helps), or if RapidEars is the right choice.

    Thoughts?

    Thanks again,
    Rikk

    #1025444
    Halle Winkler
    Politepix

    Hi,

    I’m wondering if I would benefit from your suggestion of recording .WAV

    I didn’t recommend this and would not ever do so – that impression is due to a misunderstanding. I didn’t link to the other thread in order to show you advice for your design, I linked to it to explain a specific design that I think should _never_ be done with OpenEars, as a way of explaining to you that I didn’t consider your design to have the same problem. This was directly in response to your statement that you were worried that your design was a mis-fit for the library.

    I told the poster in that discussion that the only way it was possible for him to use OpenEars for his design was by recording a WAV file and submitting it to the WAV function, not because it has any advantages or because it is a good idea but because it is otherwise not possible with my support at all. None of this is your problem because you are using OpenEars for continuous listening as it is designed for, just with some kind of eventual end point that is a bit arbitrary, so please consider the question of whether your design is OK to be closed (it is) and please don’t take design advice from that thread.

    My current app kinda works, but often misses “bad words” or combines multiple bad words (said in sequence) into a single response (e.g. “DARN BUMMER”, instead of “DARN” and “BUMMER”).

    The job of parsing a hypothesis for multiple words you are interested in is an implementation issue for your app. If you receive this hypothesis, you can check it against your word list and see if there is more than one word from it in there. Rejecto may help with your false negatives.
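    As a rough illustration of that check, here is a minimal sketch in plain C (the function and variable names are illustrative, not part of the OpenEars API): it splits a space-separated hypothesis into tokens and counts how many of them appear in the word list, so a combined hypothesis like “DARN BUMMER” still produces two separate hits.

    ```c
    #include <stdio.h>
    #include <string.h>

    /* Count occurrences of any word-list entry in a space-separated
       hypothesis string. Illustrative helper, not an OpenEars call. */
    static int countBadWords(const char *hypothesis,
                             const char *badWords[], int badWordCount) {
        char buffer[256];
        strncpy(buffer, hypothesis, sizeof(buffer) - 1);
        buffer[sizeof(buffer) - 1] = '\0';  /* strtok modifies its input */

        int hits = 0;
        for (char *token = strtok(buffer, " "); token != NULL;
             token = strtok(NULL, " ")) {
            for (int i = 0; i < badWordCount; i++) {
                if (strcmp(token, badWords[i]) == 0) {
                    hits++;
                    break;
                }
            }
        }
        return hits;
    }

    int main(void) {
        const char *badWords[] = { "DARN", "DRAT", "BUMMER", "DAMN" };
        /* A combined hypothesis yields one hit per matched word. */
        printf("%d\n", countBadWords("DARN BUMMER", badWords, 4));
        return 0;
    }
    ```

    The same token-and-compare loop translates directly to Objective-C using `componentsSeparatedByString:` on the hypothesis `NSString`.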

    if RapidEars is the right choice.

    RapidEars will give you hypotheses sooner, but they will contain similar content to the regular hypotheses. Would that be helpful to you?

    #1025448
    rikk
    Participant

    Ok, got it. All makes sense. Thanks for your thoughtful answers!
