tyeh

Forum Replies Created

Viewing 17 posts - 1 through 17 (of 17 total)

    in reply to: How to combine wave files generated by SaveThatWave? #1020312
    tyeh
    Participant

    Halle,
    After a lot of bisecting tests, I settled on weight=1.2. This value seems rather stable in terms of CPU utilization. If I go above 1.2, my app locks into a long search quite easily.
    In terms of OOV rejection, I think I will need to do some app-level optimization and not rely 100% on the Rejecto framework.
    Thanks for all your help.
    -Thomas
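For anyone repeating the bisecting tests described above, the search can be sketched as follows. This is only an illustration under assumptions: `is_stable` is a hypothetical callback standing in for a manual test run (build with that weight, speak a long out-of-vocabulary sentence, watch the CPU), and it assumes stability is monotonic in weight, which the thread does not confirm. The lambda in the usage line is a stand-in, not measured data.

```python
def bisect_max_stable(is_stable, lo, hi, tol=0.05):
    """Return the largest weight found to be stable, within tol.

    Assumes lo is known to be stable and hi is known to be unstable,
    and that every weight below a stable weight is also stable.
    """
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if is_stable(mid):
            lo = mid   # mid behaved, so the limit is at or above mid
        else:
            hi = mid   # mid caused the long search, so the limit is below mid
    return lo


# Stand-in predicate for illustration only: pretend anything up to 1.2 is stable.
best = bisect_max_stable(lambda w: w <= 1.2, lo=0.0, hi=1.5)
print(round(best, 3))
```

Each call to `is_stable` corresponds to one full test run on the device, so a coarser `tol` keeps the number of runs small.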

    in reply to: How to combine wave files generated by SaveThatWave? #1020290
    tyeh
    Participant

    Halle,
    I tested on an iPhone 5s and a first-generation iPad Mini (I believe this is a 32-bit device). They both show the symptom if weight is not nil. Both run iOS 7.0.4.
    -Thomas

    in reply to: How to combine wave files generated by SaveThatWave? #1020288
    tyeh
    Participant

    Hi Halle,
    I have mostly good news to report. I modified my app to dynamically generate both sets of grammars with Rejecto, as you suggested, and set the weight parameter of both to nil, using the latest beta dist. So far, the CPU utilization has been under control. Sometimes it goes up high but drops very quickly. I am happy to report there is no noticeable performance impact with this solution, and the hypothesis does not contain rejected words either. I am curious as to why rejected words are delivered when using a pre-generated grammar. I do not need to pre-create grammars at this time, but if their number increases, we may need to reconsider it in the future.

    Another observation is about weight. Rejecto still delivers too many out-of-vocabulary words. I tried setting weight to 1.3, 1, 0.8, 0.5 and 0. In every scenario, the CPU spike reappears. It seems only nil allows the framework to function properly.

    If I can’t use weight, is there another method for tuning Rejecto's behavior? Thanks

    -Thomas

    in reply to: How to combine wave files generated by SaveThatWave? #1020286
    tyeh
    Participant

    Halle,
    Correct me if I am wrong:
    1. I generate the first LM dynamically following the sample app. I just use the Rejecto class instead. Then I modify the program to generate the second LM using a different “withFileNamed” parameter.
    2. I then go to the device (or simulator) Library/Caches directory to copy the resulting dic and DMP files (4 of them in total), then add them to the Vocabulary folder of the sample app project.
    3. Lastly, I modify the sample app to use the newly added LM from the Vocabulary folder (it is in the main bundle), and comment out the code that does the dynamic LM generation.
    Is this the right procedure?
    The project download link was emailed to you.
    Thanks
    -Thomas

    in reply to: How to combine wave files generated by SaveThatWave? #1020284
    tyeh
    Participant

    Halle,
    Got it. Will do it now.
    -Thomas

    in reply to: How to combine wave files generated by SaveThatWave? #1020282
    tyeh
    Participant

    Halle,
    The __REJ problem did not happen in the first test I sent you. It started to happen after I modified the test case, based on your post, to use the DMP from the main bundle. If this is proper behavior for the default debug output, how do I turn it off?
    I am sending you another email with a link to download the Xcode project containing the sample app with this symptom. The changes are very limited and confined to the viewDidLoad method.
    -Thomas

    in reply to: How to combine wave files generated by SaveThatWave? #1020269
    tyeh
    Participant

    Halle,

    I think I found the source of the problem. I passed weight as an NSNumber with a value of 1.5 in the test. After reading your post/doc again, I changed it back to nil. Since then the CPU utilization has become a lot better. I also regenerated both language models, but I suspect this was not the problem. Both corpus files are in the VocabularyFiles folder with a .txt extension.

    I was using weight=1.5 because the framework picks up too many out-of-vocabulary words with weight=nil. I was hoping that increasing the weight would make Rejecto more effective, but it seems to cause the high CPU problem.

    I am still struggling with __REJ (rejected words) being delivered in hypothesis. The function:
    [languageModelGenerator deliverRejectedSpeechInHypotheses: (BOOL)FALSE];
    seems to have no effect in my test setup (I tried both TRUE and FALSE).
    Thanks
    -Thomas

    in reply to: How to combine wave files generated by SaveThatWave? #1020268
    tyeh
    Participant

    Halle,
    I downloaded and installed the beta dist and I no longer see the VAD recalibration problem, so thank you for fixing this one.
    On the other hand, the major problem in Rejecto is still there. I modified the test to use only one language model from the main bundle at a time. The two models were created separately in two simulator runs, using the generateRejectingLanguageModelFromArray method as described in the documentation. I then built them into the main bundle and tested them one at a time on a real iPhone 5s. For both language models, I still see high CPU utilization when I say a medium or long sentence (5+ words) made up of unrelated (out-of-dictionary) words. I noticed that the longer the sentence is, the longer the CPU stays at 100%.

    Another question: I tried using
    LanguageModelGenerator *languageModelGenerator = [[LanguageModelGenerator alloc] init];
    [languageModelGenerator deliverRejectedSpeechInHypotheses: (BOOL)FALSE];
    to eliminate the __REJ words in the hypothesis, but I still get them whether or not I call this method. The documentation indicates I should not need to call it, because by default rejected words are not delivered. However, the hypothesis always contains them, whether or not the method is called and whether the parameter is TRUE or FALSE.

    -Thomas
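Until the cause is found, one app-level workaround is to drop the rejected tokens from the hypothesis string before using it. The sketch below shows only the filtering logic, not OpenEars code. It assumes rejected tokens are whitespace-separated words that start with at least two underscores followed by `REJ`, based on the `__REJ` prefix mentioned above; `strip_rejected` and the sample string are hypothetical.

```python
def strip_rejected(hypothesis):
    """Remove Rejecto-style rejection tokens from a hypothesis string.

    Assumption: a rejected token starts with two or more underscores
    followed by "REJ". Everything else is kept in its original order.
    """
    kept = []
    for token in hypothesis.split():
        is_rejected = token.startswith("__") and token.lstrip("_").startswith("REJ")
        if not is_rejected:
            kept.append(token)
    return " ".join(kept)


# Made-up hypothesis for illustration; not taken from a real log.
print(strip_rejected("__REJ_AH INTAKE __REJ_K OF"))
```

The same check is a few lines in Objective-C using `componentsSeparatedByString:` and `hasPrefix:` inside the hypothesis callback.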

    in reply to: How to combine wave files generated by SaveThatWave? #1020198
    tyeh
    Participant

    I will try to reproduce it using the sample app, and send it to you.
    -Thomas

    in reply to: How to combine wave files generated by SaveThatWave? #1020193
    tyeh
    Participant

    Halle,
    I retested our app in a controlled environment: a single speaker in a quiet room with no noise, speaking a normal sentence (fewer than 15 words) in 5 seconds. That still caused the framework to stay in “Processing speech…” for almost 50 seconds with high CPU utilization. During this period, the framework does not generate any output, nor does it handle any input (the iPhone UI is still functioning, though). This is reproducible.
    =========================
    2014-02-18 09:19:05.443 hear4me[3294:600f] Speech detected…
    2014-02-18 09:19:05.446 hear4me[3294:60b] Pocketsphinx has detected speech.
    2014-02-18 09:19:08.989 hear4me[3294:600f] Stopping audio unit.
    2014-02-18 09:19:08.989 hear4me[3294:60b] Pocketsphinx has detected a second of silence, concluding an utterance.
    2014-02-18 09:19:09.119 hear4me[3294:600f] Audio Output Unit stopped, cleaning up variable states.
    2014-02-18 09:19:09.120 hear4me[3294:600f] Processing speech, please wait…
    2014-02-18 09:19:58.316 hear4me[3294:600f] Pocketsphinx heard “WEIGHT VASCULAR” with a score of (-242625) and an utterance ID of 000000005.
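To put a number on the delay, the gap between the "Processing speech" line and the following "Pocketsphinx heard" line can be computed from the timestamps. This is a small sketch for use on a saved console log, assuming every line starts with a 23-character timestamp as in the excerpt above; the function names are made up for illustration.

```python
from datetime import datetime

# Three lines copied from the console log above.
LOG = """\
2014-02-18 09:19:08.989 hear4me[3294:600f] Stopping audio unit.
2014-02-18 09:19:09.120 hear4me[3294:600f] Processing speech, please wait…
2014-02-18 09:19:58.316 hear4me[3294:600f] Pocketsphinx heard “WEIGHT VASCULAR” with a score of (-242625) and an utterance ID of 000000005.
"""


def timestamp(line):
    # The first 23 characters look like "2014-02-18 09:19:09.120".
    return datetime.strptime(line[:23], "%Y-%m-%d %H:%M:%S.%f")


def search_durations(log_text):
    """Seconds between each 'Processing speech' line and the next hypothesis."""
    durations = []
    started = None
    for line in log_text.splitlines():
        if "Processing speech" in line:
            started = timestamp(line)
        elif "Pocketsphinx heard" in line and started is not None:
            durations.append((timestamp(line) - started).total_seconds())
            started = None
    return durations


print(search_durations(LOG))  # prints [49.196]
```

For this utterance the search took 49.196 seconds, which matches the "almost 50 seconds" observed.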

    in reply to: How to combine wave files generated by SaveThatWave? #1020191
    tyeh
    Participant

    I see. It is not a normal use case (continuous sound without an utterance). I will test more and report my findings.

    However, in my usage scenario, it is possible that more than one speaker (up to 3) could speak at the same time. The distance from the mic to each speaker will be different. How does OpenEars decide whether audio is “noise” or a “sentence”? Suppose a remote speaker, say 3 feet away, is talking normally, and then all of a sudden a close-up speaker also speaks. The device will detect a sudden rise in noise/sentence volume. How will this impact recognition behavior after this point?

    Thanks

    in reply to: How to combine wave files generated by SaveThatWave? #1020189
    tyeh
    Participant

    Detecting conversation (3-5 feet away from the device) and close-up recording at the mic are both valid use cases. We don’t anticipate recognizing faint, faraway conversation… but I do want to be able to recognize speech when the conversation is directed toward the listener (i.e., the talker and the device user are different people). In such a use case, the background noise level may vary more in a quick multi-party conversation than in a single-speaker scenario.
    Is this doable without modifying the framework? Thanks
    -Thomas

    in reply to: How to combine wave files generated by SaveThatWave? #1020171
    tyeh
    Participant

    Halle,
    We use all of the 1.65 frameworks (OpenEars, Rejecto and SaveThatWave) in our app. I notice that when I speak more than 10 words in a sentence, it is typically followed by a long period (30 seconds) of high CPU utilization (99%) on the device while it displays “Processing speech, please wait…”.
    This does not happen when I speak just one or two words from the dictionary in a sentence.
    Is this normal? Is the RapidEars framework supposed to reduce the processing delay?
    Thanks
    -Thomas

    ———— console log attached —————-

    2014-02-17 09:53:07.048 hear4me[2715:640f] Speech detected…
    2014-02-17 09:53:07.050 hear4me[2715:60b] Pocketsphinx has detected speech.
    2014-02-17 09:53:09.586 hear4me[2715:640f] There is reason to suspect the VAD of being out of sync with the current background noise levels in the environment so we will recalibrate.
    2014-02-17 09:53:09.587 hear4me[2715:640f] Stopping audio unit.
    2014-02-17 09:53:09.587 hear4me[2715:60b] Pocketsphinx has detected a second of silence, concluding an utterance.
    2014-02-17 09:53:09.684 hear4me[2715:640f] Audio Output Unit stopped, cleaning up variable states.
    2014-02-17 09:53:09.684 hear4me[2715:640f] Processing speech, please wait…
    2014-02-17 09:53:42.047 hear4me[2715:640f] Pocketsphinx heard “INTAKE OF” with a score of (-149766) and an utterance ID of 000000001.
    2014-02-17 09:53:42.048 hear4me[2715:60b] The received hypothesis is INTAKE OF with a score of -149766 and an ID of 000000001

    in reply to: How to combine wave files generated by SaveThatWave? #1020080
    tyeh
    Participant

    Hi Halle,
    Is the upcoming SaveThatWave available yet? Thanks
    -Thomas

    in reply to: How to combine wave files generated by SaveThatWave? #1019980
    tyeh
    Participant

    Thank you Halle. We are looking forward to incorporating its new capability.
    -Thomas

    in reply to: How to combine wave files generated by SaveThatWave? #1019966
    tyeh
    Participant

    Will this be a free upgrade for SaveThatWave 1.64 licensees, and when will this update be available? Thanks
    -Thomas

    in reply to: Generate Rejecto LM from Text File #1019964
    tyeh
    Participant

    Thanks for the suggestion. It looks easy enough, so never mind about adding the new interface then.
    -Thomas
