Forum Replies Created
February 25, 2014 at 2:15 am in reply to: How to combine wave files generated by SaveThatWave? #1020312
After lots of bisecting tests, I get weight = 1.2. This value seems rather stable in terms of CPU utilization. If I go over 1.2, my app locks into a long search quite easily.
In terms of OOV rejection, I think I will need to do some app-level optimization and not rely on the Rejecto framework 100%.
Thanks for all your help.
-Thomas

February 21, 2014 at 11:52 pm in reply to: How to combine wave files generated by SaveThatWave? #1020290
I tested on an iPhone 5s and a first-generation iPad Mini (I believe this is a 32-bit device); they both show the symptom if weight is not nil. They both run iOS 7.0.4.
-Thomas

February 21, 2014 at 11:03 pm in reply to: How to combine wave files generated by SaveThatWave? #1020288
I have mostly good news to report. I modified my app to dynamically generate both sets of grammars with Rejecto, as you suggested, and set the weight parameter of both to nil, using the latest beta dist. So far, the CPU utilization has been under control. Sometimes it goes up high, but it drops very quickly. I am happy to report there is no noticeable performance impact with this solution, and the hypothesis does not contain rejected words either. I am curious why rejected words are delivered when using a pre-generated grammar. I do not need to pre-create grammars at this time, but if the number of grammars grows, we may need to reconsider this in the future.
Another observation concerns weight. Rejecto still delivers too many out-of-vocabulary words. I tried setting weight to 1.3, 1, 0.8, 0.5, and 0. In every scenario, the CPU spike reappears. It seems only nil allows the framework to function properly.
If I can’t use weight, is there another method that allows tuning Rejecto’s behavior? Thanks
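For reference, the weight setting in question is being passed roughly as sketched below. This is a hedged sketch, not the canonical Rejecto API: the selector is assembled from the method and parameter names mentioned in this thread (generateRejectingLanguageModelFromArray, withFileNamed, weight) and may not match the exact signature in a given Rejecto version; the vocabulary array is a hypothetical example.

```objc
// Hedged sketch assembled from names mentioned in this thread; the exact
// Rejecto selector may differ by version.
LanguageModelGenerator *generator = [[LanguageModelGenerator alloc] init];
NSArray *words = @[@"INTAKE", @"WEIGHT", @"VASCULAR"]; // hypothetical vocabulary

// Passing nil lets Rejecto use its default weighting; per the tests above,
// this was the only setting that avoided the CPU spikes.
NSError *error = [generator generateRejectingLanguageModelFromArray:words
                                                      withFileNamed:@"FirstModel"
                                                         withWeight:nil];
```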
-Thomas

February 21, 2014 at 5:51 pm in reply to: How to combine wave files generated by SaveThatWave? #1020286
Correct me if I am wrong:
1. I generate the first LM dynamically following the sample app, except I use the Rejecto class. Then I modify the program to generate the second LM using a different “withFileNamed” parameter.
2. I then go to the device’s (or simulator’s) Library/Caches directory to copy the resulting .dic and .DMP files (four in total), then add them to the Vocabulary folder of the sample app project.
3. Lastly, I modify the sample app to use the newly added LMs from the Vocabulary folder (i.e., the main bundle), and comment out the code that does dynamic LM generation.
Is this the right procedure?
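As a sanity check on step 3, loading the pre-generated files from the main bundle would look roughly like this. The model file name is a placeholder, and the startListening… selector and AcousticModel helper are assumptions based on the OpenEars 1.x documentation, so verify them against the installed version:

```objc
// Hedged sketch of step 3: "FirstModel" is a placeholder name, and the
// selectors below are assumptions based on OpenEars 1.x docs.
NSString *lmPath  = [[NSBundle mainBundle] pathForResource:@"FirstModel" ofType:@"DMP"];
NSString *dicPath = [[NSBundle mainBundle] pathForResource:@"FirstModel" ofType:@"dic"];

[self.pocketsphinxController startListeningWithLanguageModelAtPath:lmPath
                                                   dictionaryAtPath:dicPath
                                                acousticModelAtPath:[AcousticModel pathToModel:@"AcousticModelEnglish"]
                                                languageModelIsJSGF:NO];
```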
The project download link was emailed to you.
-Thomas

February 21, 2014 at 5:34 pm in reply to: How to combine wave files generated by SaveThatWave? #1020284
Got it. Will do it now.
-Thomas

February 21, 2014 at 5:06 pm in reply to: How to combine wave files generated by SaveThatWave? #1020282
The __REJ problem did not happen in the first test I sent you. After I modified the test case, based on your post, to use the DMP from the main bundle, it started to happen. If this is the proper behavior for the default debug output, how do I turn it off?
I am sending you another email with a link to download the Xcode project containing the sample app with this symptom. The changes are very limited and confined to the viewDidLoad method.
-Thomas

February 21, 2014 at 4:03 am in reply to: How to combine wave files generated by SaveThatWave? #1020269
I think I found the source of the problem. I used weight as an NSNumber = 1.5 in the test. After reading your post/doc again, I changed it back to nil. Since then, the CPU utilization has become a lot better. I also regenerated both language models, but I suspect this was not the problem. Both corpus files are in the VocabularyFiles folder with a .txt extension.
I was using weight = 1.5 because the framework picks up too many out-of-vocabulary words with weight = nil. I was hoping that increasing the weight would make Rejecto more effective, but it seems to cause the high-CPU problem.
I am still struggling with __REJ (rejected) words being delivered in the hypothesis. The call:
[languageModelGenerator deliverRejectedSpeechInHypotheses:(BOOL)FALSE];
seems to have no effect in my test setup (I tried both TRUE and FALSE).
-Thomas

February 21, 2014 at 2:49 am in reply to: How to combine wave files generated by SaveThatWave? #1020268
I downloaded and installed the beta dist, and I no longer see the VAD recalibration problem, so thank you for fixing that one.
On the other hand, the major problem with Rejecto is still there. I modified the test to use only one language model from the main bundle at a time. The two models were created separately, in two simulator runs, using the generateRejectingLanguageModelFromArray method as described in the documentation. I then built them into the main bundle and tested them one at a time on a real iPhone 5s. For both language models, I still see high CPU utilization when I say a medium or long sentence (5+ words) using unrelated (out-of-dictionary) words. I noticed that the longer the sentence, the longer the CPU stays at 100%.
Another question: I tried using:
LanguageModelGenerator *languageModelGenerator = [[LanguageModelGenerator alloc] init];
[languageModelGenerator deliverRejectedSpeechInHypotheses: (BOOL)FALSE];
to eliminate the __REJ words in the hypothesis, but I still get them whether or not I use this call. The documentation indicates that I should not need to use it, because by default the rejected words are not delivered. However, the hypothesis always contains them, whether or not this call is used, and, when it is used, whether the parameter is set to TRUE or FALSE.
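If the framework flag keeps having no effect, an app-level workaround is to strip the rejected tokens from the hypothesis string before using it. This is a sketch using plain Foundation; the “__REJ” prefix spelling and the sample hypothesis are assumptions based on the output described in this thread:

```objc
// App-level workaround sketch: drop any token whose prefix marks it as a
// Rejecto rejection. The "__REJ" spelling is an assumption based on the
// tokens observed in this thread's hypotheses.
NSString *hypothesis = @"WEIGHT __REJ_AH VASCULAR"; // hypothetical raw hypothesis
NSArray *tokens = [hypothesis componentsSeparatedByString:@" "];
NSMutableArray *kept = [NSMutableArray array];
for (NSString *token in tokens) {
    if (![token hasPrefix:@"__REJ"]) {
        [kept addObject:token];
    }
}
NSString *cleaned = [kept componentsJoinedByString:@" "]; // "WEIGHT VASCULAR"
```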
-Thomas

February 18, 2014 at 7:02 pm in reply to: How to combine wave files generated by SaveThatWave? #1020198
I will try to reproduce it using the sample app, and send it to you.
-Thomas

February 18, 2014 at 6:26 pm in reply to: How to combine wave files generated by SaveThatWave? #1020193
I retested our app in a controlled environment: a single speaker in a quiet room with no noise, speaking a normal sentence (fewer than 15 words) in 5 seconds. That still caused the framework to stay in “processing speech…” for almost 50 seconds with high CPU utilization. During this period, the framework does not generate any output, nor does it handle any input (the iPhone UI still functions, though). This is reproducible.
2014-02-18 09:19:05.443 hear4me[3294:600f] Speech detected…
2014-02-18 09:19:05.446 hear4me[3294:60b] Pocketsphinx has detected speech.
2014-02-18 09:19:08.989 hear4me[3294:600f] Stopping audio unit.
2014-02-18 09:19:08.989 hear4me[3294:60b] Pocketsphinx has detected a second of silence, concluding an utterance.
2014-02-18 09:19:09.119 hear4me[3294:600f] Audio Output Unit stopped, cleaning up variable states.
2014-02-18 09:19:09.120 hear4me[3294:600f] Processing speech, please wait…
2014-02-18 09:19:58.316 hear4me[3294:600f] Pocketsphinx heard “WEIGHT VASCULAR” with a score of (-242625) and an utterance ID of 000000005.

February 18, 2014 at 5:40 pm in reply to: How to combine wave files generated by SaveThatWave? #1020191
I see. It is not a normal use case (continuous sound without an utterance). I will test more and report my findings.
However, in my usage scenario, it is possible that more than one speaker (up to 3) could speak at the same time, and the distance from the mic to each speaker will be different. How does OpenEars judge whether audio is “noise” or a “sentence”? Suppose a remote speaker, say 3 feet away, is talking normally, and then a close-up speaker suddenly also speaks. The device will detect a sudden rise in volume; how will this impact recognition behavior from that point on?
Thanks

February 18, 2014 at 4:47 pm in reply to: How to combine wave files generated by SaveThatWave? #1020189
Detecting conversation (3–5 feet away from the device) and close-up recording into the mic are both valid use cases. We don’t anticipate recognizing faint, far-away conversation, but I do want to be able to recognize speech when the conversation is directed toward the listener (i.e., the talker and the device user are different people). In such a use case, the background noise level may vary more in a quick multi-party conversation than in a single-speaker scenario.
Is this doable without modifying the framework? Thanks
-Thomas

February 17, 2014 at 7:27 pm in reply to: How to combine wave files generated by SaveThatWave? #1020171
We use all the 1.65 frameworks (OpenEars, Rejecto, and SaveThatWave) in our app. I notice that when I speak a sentence of more than 10 words, it is typically followed by a long period (30 seconds) of high device CPU utilization (99%) while displaying “Processing speech, please wait…”.
This does not happen when I speak a sentence of just one or two words from the dictionary.
Is this normal? Is the RapidEars framework supposed to reduce the processing delay?
———— console log attached ————
2014-02-17 09:53:07.048 hear4me[2715:640f] Speech detected…
2014-02-17 09:53:07.050 hear4me[2715:60b] Pocketsphinx has detected speech.
2014-02-17 09:53:09.586 hear4me[2715:640f] There is reason to suspect the VAD of being out of sync with the current background noise levels in the environment so we will recalibrate.
2014-02-17 09:53:09.587 hear4me[2715:640f] Stopping audio unit.
2014-02-17 09:53:09.587 hear4me[2715:60b] Pocketsphinx has detected a second of silence, concluding an utterance.
2014-02-17 09:53:09.684 hear4me[2715:640f] Audio Output Unit stopped, cleaning up variable states.
2014-02-17 09:53:09.684 hear4me[2715:640f] Processing speech, please wait…
2014-02-17 09:53:42.047 hear4me[2715:640f] Pocketsphinx heard “INTAKE OF” with a score of (-149766) and an utterance ID of 000000001.
2014-02-17 09:53:42.048 hear4me[2715:60b] The received hypothesis is INTAKE OF with a score of -149766 and an ID of 000000001

February 7, 2014 at 8:43 pm in reply to: How to combine wave files generated by SaveThatWave? #1020080
Is the upcoming SaveThatWave update available yet? Thanks
-Thomas

January 29, 2014 at 4:42 pm in reply to: How to combine wave files generated by SaveThatWave? #1019980
Thank you Halle. We are looking forward to incorporating its new capability.
-Thomas

January 28, 2014 at 11:59 pm in reply to: How to combine wave files generated by SaveThatWave? #1019966
Will this be a free upgrade for SaveThatWave 1.64 licensees, and when will this update be available? Thanks
Thanks for the suggestion. It looks easy enough; never mind about adding the new interface then.