OpenEars integration with camera control



  • #11361
    jay_stepin
    Participant

    Hello,
    I have made a demo of voice recognition with OpenEars and it's working fine. Now I want to integrate it with camera control: for example, when the user shouts START, the camera starts video recording, and it stops recording on STOP.
    My first question: is this possible with OpenEars?
    And if yes, can you help me get started?
    Does anyone know of an application like this?
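    What I have in mind is roughly this kind of mapping from a recognized word to the camera. This is just a sketch of the behavior I want, using the hypothesis delegate callback from the OpenEars docs; startRecording and stopRecording would be my own camera methods, which I haven't written yet:

    ```objectivec
    // Sketch only: dispatch recognized words to hypothetical camera methods.
    - (void) pocketsphinxDidReceiveHypothesis:(NSString *)hypothesis
                             recognitionScore:(NSString *)recognitionScore
                                  utteranceID:(NSString *)utteranceID {
        if ([hypothesis isEqualToString:@"START"]) {
            [self startRecording]; // my own (hypothetical) camera method
        } else if ([hypothesis isEqualToString:@"STOP"]) {
            [self stopRecording]; // my own (hypothetical) camera method
        }
    }
    ```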

    #11420
    Halle Winkler
    Politepix

    Hi Jay,

    This question is a little too broad for the forum, sorry. Feel free to ask specific questions about your code.

    #11422
    jay_stepin
    Participant

    Okay.
    So I tried to get it working with UIImagePickerController, but when UIImagePickerController comes into play:

    LanguageModelGenerator *lmGenerator = [[LanguageModelGenerator alloc] init];

    NSArray *words = [NSArray arrayWithObjects:@"START", @"STOP", @"HAVARD", nil];
    NSString *name = @"Mahadev";
    NSError *err = [lmGenerator generateLanguageModelFromArray:words withFilesNamed:name];
    NSLog(@"%@", err);
    NSDictionary *languageGeneratorResults = nil;

    lmPath = nil;
    dicPath = nil;

    if ([err code] == noErr) {

        languageGeneratorResults = [err userInfo];

        lmPath = [languageGeneratorResults objectForKey:@"LMPath"];
        dicPath = [languageGeneratorResults objectForKey:@"DictionaryPath"];

    } else {
        NSLog(@"Error: %@", [err localizedDescription]);
    }

    Up to this point the code works fine, but

    [self.pocketsphinxController startListeningWithLanguageModelAtPath:lmPath dictionaryAtPath:dicPath languageModelIsJSGF:NO];

    this method doesn't seem to trigger, so I never see

    - (void) pocketsphinxDidStartListening {
        NSLog(@"Pocketsphinx is now listening.");
    }

    fire.

    If you can help me with this it would be awesome.

    I know this is a messy kind of question, so feel free to ask if you need more details.

    #11423
    Halle Winkler
    Politepix

    What is the relationship between the picker code and the code above? It's probably not the case that startListeningWithLanguageModelAtPath: doesn't trigger, but rather that it gets to a certain point in the loop and has trouble. If you turn on verbosePocketSphinx and OpenEarsLogging, the output will probably tell you a lot about why startListeningWithLanguageModelAtPath: isn't getting good results. You can search the log output for the words "error" or "warning" specifically, or you can post it here (but please make sure both forms of logging have been turned on first so I can really see everything that is happening).

    #11448
    jay_stepin
    Participant

    Hello Halle,
    As per your suggestion I turned on OpenEarsLogging and got this console output:

    2012-10-01 18:44:25.956 videoDemo[1085:307] Starting OpenEars logging for OpenEars version {{{{1.2.2}}}} on device: iPhone running iOS version: 4.100000
    2012-10-01 18:44:25.965 videoDemo[1085:307] Normalized array contains the following entries:
    (
    START,
    STOP,
    HAVARD
    )
    2012-10-01 18:44:25.977 videoDemo[1085:307] Starting dynamic language model generation
    2012-10-01 18:44:25.988 videoDemo[1085:307] Able to open /var/mobile/Applications/E7438B2B-52F3-40DD-BE93-F444AD537A32/Documents/Mahadev.corpus for reading
    2012-10-01 18:44:25.991 videoDemo[1085:307] Able to open /var/mobile/Applications/E7438B2B-52F3-40DD-BE93-F444AD537A32/Documents/Mahadev_pipe.txt for writing
    2012-10-01 18:44:25.994 videoDemo[1085:307] Starting text2wfreq_impl
    2012-10-01 18:44:26.032 videoDemo[1085:307] Done with text2wfreq_impl
    2012-10-01 18:44:26.037 videoDemo[1085:307] Able to open /var/mobile/Applications/E7438B2B-52F3-40DD-BE93-F444AD537A32/Documents/Mahadev_pipe.txt for reading.
    2012-10-01 18:44:26.041 videoDemo[1085:307] Able to open /var/mobile/Applications/E7438B2B-52F3-40DD-BE93-F444AD537A32/Documents/Mahadev.vocab for reading.
    2012-10-01 18:44:26.044 videoDemo[1085:307] Starting wfreq2vocab
    2012-10-01 18:44:26.049 videoDemo[1085:307] Done with wfreq2vocab
    2012-10-01 18:44:26.057 videoDemo[1085:307] Starting text2idngram
    2012-10-01 18:44:26.094 videoDemo[1085:307] Done with text2idngram
    2012-10-01 18:44:26.103 videoDemo[1085:307] Starting idngram2lm

    2012-10-01 18:44:26.145 videoDemo[1085:307] Done with idngram2lm
    2012-10-01 18:44:26.149 videoDemo[1085:307] Starting sphinx_lm_convert
    2012-10-01 18:44:26.165 videoDemo[1085:307] Finishing sphinx_lm_convert
    2012-10-01 18:44:26.178 videoDemo[1085:307] Done creating language model with CMUCLMTK in 0.198320 seconds.
    2012-10-01 18:44:26.550 videoDemo[1085:307] I’m done running performDictionaryLookup and it took 0.309528 seconds
    2012-10-01 18:44:26.563 videoDemo[1085:307] I’m done running dynamic language model generation and it took 370790066.563663 seconds

    Just this, nothing further.

    #11449
    Halle Winkler
    Politepix

    Hi,

    I don't think that logging has verbosePocketSphinx enabled. If it does, that means your app has an issue that blocks before [self.pocketsphinxController startListeningWithLanguageModelAtPath:lmPath dictionaryAtPath:dicPath languageModelIsJSGF:NO];, not during it, since the logging would show it starting and then stopping somewhere, but this logging shows nothing after the language model is generated. I recommended earlier that you show the relationship between your picker code and the OpenEars code; without that, or the output with verbosePocketsphinx, there's no way to know what is happening, since the code above is the code that works in the sample app.

    #11452
    jay_stepin
    Participant

    Hello,

    I'm really sorry, my explanation was really bad.

    So I have created a view controller, put all of the OpenEars methods in viewDidLoad, and added this controller as the camera overlay.

    Pardon me, but I don't understand your point about "I don't think that logging has verbosePocketSphinx enabled". Can you explain it more, so I can help myself by giving you more info?

    #11454
    Halle Winkler
    Politepix

    Hi Jay,

    No problem, just go to https://www.politepix.com/openears and search the page for “verbosePocketSphinx” and you’ll see the property definition. There is also an example of using it in the sample app view controller if you search for that string.
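    For reference, turning on both forms of logging looks roughly like this (a sketch assuming the 1.2.x API; check the docs linked above for the exact usage in your OpenEars version):

    ```objectivec
    // Turn on OpenEars-level logging before any other OpenEars calls run.
    [OpenEarsLogging startOpenEarsLogging];
    // Turn on Pocketsphinx-level logging before calling startListening...
    self.pocketsphinxController.verbosePocketSphinx = TRUE;
    ```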

    #11488
    jay_stepin
    Participant

    Hello Halle,

    As per your suggestion I have turned on verbosePocketSphinx and got the following log in the console:

    2012-10-03 12:25:30.609 videoDemo[1871:307] Starting OpenEars logging for OpenEars version {{{{1.2.2}}}} on device: iPhone running iOS version: 4.100000
    2012-10-03 12:25:30.619 videoDemo[1871:307] Normalized array contains the following entries:
    (
    START,
    STOP,
    HAVARD
    )
    2012-10-03 12:25:30.631 videoDemo[1871:307] Starting dynamic language model generation
    2012-10-03 12:25:30.641 videoDemo[1871:307] Able to open /var/mobile/Applications/E7438B2B-52F3-40DD-BE93-F444AD537A32/Documents/Mahadev.corpus for reading
    2012-10-03 12:25:30.645 videoDemo[1871:307] Able to open /var/mobile/Applications/E7438B2B-52F3-40DD-BE93-F444AD537A32/Documents/Mahadev_pipe.txt for writing
    2012-10-03 12:25:30.650 videoDemo[1871:307] Starting text2wfreq_impl
    2012-10-03 12:25:30.692 videoDemo[1871:307] Done with text2wfreq_impl
    2012-10-03 12:25:30.697 videoDemo[1871:307] Able to open /var/mobile/Applications/E7438B2B-52F3-40DD-BE93-F444AD537A32/Documents/Mahadev_pipe.txt for reading.
    2012-10-03 12:25:30.702 videoDemo[1871:307] Able to open /var/mobile/Applications/E7438B2B-52F3-40DD-BE93-F444AD537A32/Documents/Mahadev.vocab for reading.
    2012-10-03 12:25:30.707 videoDemo[1871:307] Starting wfreq2vocab
    2012-10-03 12:25:30.714 videoDemo[1871:307] Done with wfreq2vocab
    2012-10-03 12:25:30.720 videoDemo[1871:307] Starting text2idngram
    2012-10-03 12:25:30.758 videoDemo[1871:307] Done with text2idngram
    2012-10-03 12:25:30.767 videoDemo[1871:307] Starting idngram2lm

    2012-10-03 12:25:30.808 videoDemo[1871:307] Done with idngram2lm
    2012-10-03 12:25:30.811 videoDemo[1871:307] Starting sphinx_lm_convert
    2012-10-03 12:25:30.824 videoDemo[1871:307] Finishing sphinx_lm_convert
    2012-10-03 12:25:30.836 videoDemo[1871:307] Done creating language model with CMUCLMTK in 0.202123 seconds.
    2012-10-03 12:25:31.201 videoDemo[1871:307] I’m done running performDictionaryLookup and it took 0.300287 seconds
    2012-10-03 12:25:31.213 videoDemo[1871:307] I’m done running dynamic language model generation and it took 370940131.213007 seconds
    2012-10-03 12:25:31.219 videoDemo[1871:307] JAY TEsting
    2012-10-03 12:25:31.226 videoDemo[1871:307] A sample rate was requested that isn’t one of the two supported values of 16000 or 8000 so we will use the default of 16000.
    2012-10-03 12:25:31.241 videoDemo[1871:307] The audio session has never been initialized so we will do that now.
    2012-10-03 12:25:31.247 videoDemo[1871:307] Checking and resetting all audio session settings.
    2012-10-03 12:25:31.250 videoDemo[1871:307] audioCategory is incorrect, we will change it.
    2012-10-03 12:25:31.253 videoDemo[1871:307] audioCategory is now on the correct setting of kAudioSessionCategory_PlayAndRecord.
    2012-10-03 12:25:31.256 videoDemo[1871:307] bluetoothInput is incorrect, we will change it.
    2012-10-03 12:25:31.258 videoDemo[1871:307] bluetooth input is now on the correct setting of 1.
    2012-10-03 12:25:31.261 videoDemo[1871:307] categoryDefaultToSpeaker is incorrect, we will change it.
    2012-10-03 12:25:31.269 videoDemo[1871:307] CategoryDefaultToSpeaker is now on the correct setting of 1.
    2012-10-03 12:25:31.272 videoDemo[1871:307] preferredBufferSize is incorrect, we will change it.
    2012-10-03 12:25:31.275 videoDemo[1871:307] PreferredBufferSize is now on the correct setting of 0.128000.
    2012-10-03 12:25:31.278 videoDemo[1871:307] preferredSampleRateCheck is incorrect, we will change it.
    2012-10-03 12:25:31.280 videoDemo[1871:307] preferred hardware sample rate is now on the correct setting of 16000.000000.
    2012-10-03 12:25:31.422 videoDemo[1871:307] AudioSessionManager startAudioSession has reached the end of the initialization.
    2012-10-03 12:25:31.427 videoDemo[1871:307] Exiting startAudioSession.
    2012-10-03 12:25:31.434 videoDemo[1871:560f] Recognition loop has started
    2012-10-03 12:25:31.568 videoDemo[1871:307] Using two-stage rotation animation. To use the smoother single-stage animation, this application must remove two-stage method implementations.
    2012-10-03 12:25:31.696 videoDemo[1871:307] Using two-stage rotation animation is not supported when rotating more than one view controller or view controllers not the window delegate
    2012-10-03 12:25:32.457 videoDemo[1871:560f] Starting openAudioDevice on the device.
    2012-10-03 12:25:32.472 videoDemo[1871:560f] Audio unit wrapper successfully created.
    2012-10-03 12:25:32.493 videoDemo[1871:560f] Set audio route to SpeakerAndMicrophone
    2012-10-03 12:25:32.500 videoDemo[1871:560f] Checking and resetting all audio session settings.
    2012-10-03 12:25:32.505 videoDemo[1871:560f] audioCategory is correct, we will leave it as it is.
    2012-10-03 12:25:32.508 videoDemo[1871:560f] bluetoothInput is correct, we will leave it as it is.
    2012-10-03 12:25:32.510 videoDemo[1871:560f] categoryDefaultToSpeaker is correct, we will leave it as it is.
    2012-10-03 12:25:32.517 videoDemo[1871:560f] preferredBufferSize is correct, we will leave it as it is.
    2012-10-03 12:25:32.521 videoDemo[1871:560f] preferredSampleRateCheck is correct, we will leave it as it is.
    2012-10-03 12:25:32.524 videoDemo[1871:560f] Setting the variables for the device and starting it.
    2012-10-03 12:25:32.537 videoDemo[1871:560f] Looping through ringbuffer sections and pre-allocating them.
    2012-10-03 12:25:33.356 videoDemo[1871:560f] Started audio output unit.
    2012-10-03 12:25:33.360 videoDemo[1871:560f] Calibration has started
    2012-10-03 12:25:34.764 videoDemo[1871:307] The Audio Session was interrupted.
    2012-10-03 12:25:35.569 videoDemo[1871:560f] Calibration has completed
    2012-10-03 12:25:35.573 videoDemo[1871:560f] Project has these words in its dictionary:
    HAVARD
    START
    STOP
    2012-10-03 12:25:35.576 videoDemo[1871:560f] Listening.

    #11489
    Halle Winkler
    Politepix

    OK, I think the issue is simply that the audio stream is not provided to PocketsphinxController since it is used by the video picker.

    #11490
    jay_stepin
    Participant

    OK, so is there anything we can do about it?

    #11491
    Halle Winkler
    Politepix

    Solving this would be an advanced undertaking that would require you to thoroughly research the iOS audio session and do a lot of self-guided experimentation in order to learn what is needed. Maybe it's possible, but it isn't something I can walk you through, unfortunately.
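    A starting point for that experimentation might be purely diagnostic: log what the audio session looks like before and after the picker appears, so you can see exactly what the picker changes out from under PocketsphinxController. An untested sketch using Apple's AVAudioSession, not something I can promise leads anywhere:

    ```objectivec
    // Untested diagnostic sketch: log the current audio session category.
    // Call this before presenting the picker and again afterwards to
    // compare what the picker has changed.
    NSLog(@"Audio session category: %@", [[AVAudioSession sharedInstance] category]);
    ```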

    #11492
    jay_stepin
    Participant

    Ahh, that's heartbreaking. Thanks, mate, for all the help. I really appreciate your work and help. Many, many thanks.

    #11493
    Halle Winkler
    Politepix

    You’re welcome, good luck with your app.
