[Resolved] Volume control with text-to-speech

This topic has 2 voices, contains 34 replies, and was last updated by  ksamurai 214 days ago.

Viewing 35 posts - 1 through 35 (of 35 total)
Author Posts
October 16, 2011 at 10:11 pm #7700

ksamurai

First of all, I have to say how impressed I am with OpenEars so far. It was easy to get it set up and start using it right away.

I just have one question about controlling the volume of the OpenEars voices. I am using the text-to-speech functions provided by the FliteController, and am successfully generating speech from some strings. In headphones, the voices are great–they have a full range of volume from really quiet to blow-your-ears-out. However, when I play the voices through the speakers, the voices are very quiet. Even at maximum volume, the voices are quite hard to hear.

Is there a way that I can turn up their maximum volume? It would be great if I could adjust the speaker volumes separately from the headphone volumes.

Thanks,
ksamurai

October 17, 2011 at 7:18 am #7701

Halle

Hi,

This isn’t expected behavior — which device are you seeing this with, which OS version, and have you made any changes to AudioSessionManager (and have you used AudioSessionManager as described)? Is it happening in the sample app?

October 17, 2011 at 3:42 pm #7705

ksamurai

Hi Halle,

Thanks for the response!
I am on the first iTouch, on OS 4.2.1.
I have not made any changes to the AudioSessionManager, but am making sure now that I have followed the directions correctly.
The sample app crashes, so I cannot test it. I believe this is because I do not have a microphone on the device, so it is unable to listen for sounds.

Also, some strange behavior I am getting when I compile that I wasn’t seeing yesterday:
"_iconv_open", referenced from:
    _ngram_model_recode in libOpenEarsLibrary.a(ngram_model.o)
"_iconv", referenced from:
    _ngram_model_recode in libOpenEarsLibrary.a(ngram_model.o)
    _ngram_model_recode in libOpenEarsLibrary.a(ngram_model.o)
    _ngram_model_recode in libOpenEarsLibrary.a(ngram_model.o)
    _ngram_model_recode in libOpenEarsLibrary.a(ngram_model.o)
"_iconv_close", referenced from:
    _ngram_model_recode in libOpenEarsLibrary.a(ngram_model.o)
ld: symbol(s) not found

I don’t know what happened here, since things were compiling just fine without error last night.

October 17, 2011 at 3:51 pm #7706

ksamurai

Oops, please disregard the last list of errors. I was mistakenly trying to compile a simulator version. Now I am looking into the AudioSessionManager process to see if I missed something.

October 17, 2011 at 4:03 pm #7707

ksamurai

So I did indeed miss the step with the AudioSessionManager. However, now that I have added it in, I have no sound at all! I still must be missing something basic here….

October 17, 2011 at 4:37 pm #7708

Halle

Hiya,

Hmm, every issue you’re mentioning sounds like you’ve missed some steps in the instructions. There should be no errors when compiling a simulator version; the only difference is that the simulator driver is less accurate at recognition.

Also, if the sample app is crashing for you, you can turn on the logging and see why. I’d start fresh and give some extra time to the instructions.

October 17, 2011 at 4:43 pm #7709

ksamurai

Okay, thanks. It is also a little tricky because I am using Xcode 3, not 4, so the instructions do not always line up 100%. :)
I will go through it again though and see if I missed something.

October 17, 2011 at 4:49 pm #7710

ksamurai

The sample program crashes at:
audioDevice->recordData = 1; (line 763) in ContinuousModel.mm with an EXC_BAD_ACCESS signal.

October 17, 2011 at 4:52 pm #7711

ksamurai

I am downloading the old version of OpenEars to see if the old set of instructions will help. I will keep you posted on whether I get it to work.

October 17, 2011 at 4:53 pm #7712

Halle

Did you download the .902 version that has the old Xcode 3 instructions to use as a helper for that process? I don’t really support Xcode 3 anymore and it’s been a while since it was current, so it might be a good idea to consider upgrading, since 4 is free and has some nice new stuff.

[EDIT: I see you are downloading the old version.]

I actually preferred the UI of 3 myself, but at this point it causes me more trouble than it’s worth to be out of step with the current version.

  • This reply was modified 214 days ago by  Halle. Reason: New info slipped
October 17, 2011 at 4:54 pm #7713

Halle

No, I mean turn on OPENEARSLOGGING and VERBOSEPOCKETSPHINX to use the library’s built-in logging, which you know from reading the instructions :)

October 17, 2011 at 5:13 pm #7715

ksamurai

Ah, I didn’t realize this is what you were referring to. That is not mentioned until the “Using OpenEars In Your App” section, and so I did not make the immediate connection to use it in the testing of the sample program. Silly, I know. :)

So, first things first, the Sample Project runs fine on the Simulator. On my iTouch, I get the bad access, and I believe this would be the relevant part of the log:
2011-10-17 10:12:23.062 OpenEarsSampleProject[2543:5d03] OPENEARSLOGGING: Starting openAudioDevice on the device.
2011-10-17 10:12:23.071 OpenEarsSampleProject[2543:5d03] OPENEARSLOGGING: Audio unit wrapper successfully created.
2011-10-17 10:12:23.086 OpenEarsSampleProject[2543:5d03] OPENEARSLOGGING: Couldn’t initialize audio unit: -12986
2011-10-17 10:12:23.095 OpenEarsSampleProject[2543:5d03] OPENEARSLOGGING: openAudioDevice failed

Also, I have gotten this far (up to getting the sample program running) by looking at both the Xcode 3 and 4 instructions, and have done everything mentioned. I have considered upgrading to Xcode 4, but I do prefer the Xcode 3 interface, and since nothing requires me to upgrade at this point, I am hesitant to do so.

October 17, 2011 at 5:31 pm #7716

Halle

It’s also mentioned in the post above this one that says “Please read before you post”, which might still be worth a read since it would have directed you to the AudioSessionManager and to the FAQ entries related to your issue as well.

So, yes, you can’t do speech recognition on a device that has no audio input. If you were writing a speech recognition app you could require a mic in supported hardware in your Info.plist, but since you’re just trying it out, all you have to do is comment out the calls to PocketsphinxController in the sample app.

October 17, 2011 at 5:35 pm #7717

ksamurai

:S I sincerely apologize for having overlooked that post!

October 17, 2011 at 5:36 pm #7718

Halle

No prob!

October 17, 2011 at 6:27 pm #7719

ksamurai

So here is my current state of affairs:
I added the AudioSessionManager to my code in the following way:
In MyAppDelegate.h

@class RootViewController;
@class AudioSessionManager;

@interface MyLanguagesAppDelegate : NSObject <UIApplicationDelegate> {
    UIWindow *window;
    RootViewController *viewController;
    AudioSessionManager *myAudioSessionManager;
}

@property (nonatomic, retain) UIWindow *window;
@property (nonatomic, retain) AudioSessionManager *myAudioSessionManager;
@end

In MyAppDelegate.m

...
#import "AudioSessionManager.h"
...
@synthesize myAudioSessionManager;
...
- (void)applicationDidFinishLaunching:(UIApplication *)application
{
    ...
    [self.myAudioSessionManager startAudioSession];
}

// Lazily create the AudioSessionManager the first time it is requested.
- (AudioSessionManager *)myAudioSessionManager {
    if (myAudioSessionManager == nil) {
        myAudioSessionManager = [[AudioSessionManager alloc] init];
    }
    return myAudioSessionManager;
}

- (void)dealloc {
    [myAudioSessionManager release];
    [[CCDirector sharedDirector] release];
    [window release];
    [super dealloc];
}

As soon as I add the line [self.myAudioSessionManager startAudioSession];
all of my sound goes away.

If I comment out all of the AudioSessionManager code, I hear the Flite voices just fine, and oddly enough they seem much louder than yesterday, though I did not change the code or settings in that particular project….

October 17, 2011 at 6:33 pm #7720

Halle

BTW, how did you hear any speech out the speaker of an iPod Touch 1G? Isn’t it a piezo?

October 17, 2011 at 6:36 pm #7721

Halle

Do you have other audio session code in your app (check by searching case-insensitively for the string audiosession)? Can you please verify that this issue is occurring or not occurring with the sample app?

October 17, 2011 at 6:37 pm #7722

Halle

And again, please post all logging output from OPENEARSLOGGING.

October 17, 2011 at 6:44 pm #7723

ksamurai

My mistake, it is a gen2 iTouch.
I have no audio session in my code, but I am also using Cocos2d, so I don’t know if it is overriding something.
I ran the sample app (after commenting out all PocketsphinxController code). I do not hear anything when I load up the app.

Let me post the logging now.

October 17, 2011 at 6:46 pm #7724

ksamurai

This is the log without the AudioSessionManager:
2011-10-17 11:46:15.365 MyLanguages[2756:307] OPENEARSLOGGING: I’m running flite
2011-10-17 11:46:15.519 MyLanguages[2756:307] OPENEARSLOGGING: I’m done running flite and it took 0.147494 seconds
2011-10-17 11:46:15.528 MyLanguages[2756:307] OPENEARSLOGGING: Flite audio player was nil when referenced so attempting to allocate a new audio player.
2011-10-17 11:46:15.542 MyLanguages[2756:307] OPENEARSLOGGING: Loading speech data for Flite concluded successfully.
2011-10-17 11:46:15.806 MyLanguages[2756:307] OPENEARSLOGGING: Flite sending suspend recognition notification.
2011-10-17 11:46:16.837 MyLanguages[2756:307] OPENEARSLOGGING: AVAudioPlayer did finish playing with success flag of 1
2011-10-17 11:46:16.994 MyLanguages[2756:307] OPENEARSLOGGING: Flite sending resume recognition notification.

October 17, 2011 at 6:51 pm #7725

ksamurai

And here it is with AudioSessionManager:
2011-10-17 11:51:52.238 MyLanguages[2770:307] OPENEARSLOGGING: The audio session has never been initialized so we will do that now.
2011-10-17 11:51:52.263 MyLanguages[2770:307] OPENEARSLOGGING: Error 1852794999: Unable to set the audio session active.
2011-10-17 11:51:52.275 MyLanguages[2770:307] OPENEARSLOGGING: There is no audio input available.
2011-10-17 11:51:52.287 MyLanguages[2770:307] OPENEARSLOGGING: AudioSessionManager startAudioSession has reached the end of the initialization.
2011-10-17 11:51:52.300 MyLanguages[2770:307] OPENEARSLOGGING: Exiting startAudioSession.
2011-10-17 11:51:52.542 MyLanguages[2770:307] cocos2d: Frame interval: 1
2011-10-17 11:51:52.576 MyLanguages[2770:307] cocos2d: surface size: 320x480
2011-10-17 11:52:04.668 MyLanguages[2770:307] click
2011-10-17 11:52:04.755 MyLanguages[2770:307] OPENEARSLOGGING: I’m running flite
2011-10-17 11:52:04.906 MyLanguages[2770:307] OPENEARSLOGGING: I’m done running flite and it took 0.145568 seconds
2011-10-17 11:52:04.913 MyLanguages[2770:307] OPENEARSLOGGING: Flite audio player was nil when referenced so attempting to allocate a new audio player.
2011-10-17 11:52:04.922 MyLanguages[2770:307] OPENEARSLOGGING: Loading speech data for Flite concluded successfully.
2011-10-17 11:52:04.976 MyLanguages[2770:307] OPENEARSLOGGING: Flite sending suspend recognition notification.

October 17, 2011 at 7:16 pm #7726

Halle

OK, the audio session error is “kAudioServicesNoHardwareError”. Actually, I had another report of audio sessions failing on the iPod Touch 2G for this exact reason and I’ve also seen a complaint about it in an app review.

I will enter this as a bug for this hardware. What you could do to help (since I don’t have an iPod Touch 2G) is to go through startAudioSession in AudioSessionManager and see if the audio session can start up if any of the chunks are commented out, then let me know which chunk was responsible. Thanks!
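For readers following along: Core Audio status values such as the 1852794999 in the log above are often four-character codes packed into a 32-bit integer, which is how that number maps to 'nohw' (kAudioServicesNoHardwareError). The following is a minimal sketch in plain C (it also compiles as Objective-C) of how such a value can be unpacked. The function name status_to_fourcc is an illustrative assumption and is not part of OpenEars or of Apple's headers.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of OpenEars or Apple's headers.
 * Unpacks a 32-bit status value into its four-character code.
 * `out` must have room for 5 bytes. Returns 1 when all four bytes are
 * printable ASCII; returns 0 and writes an empty string otherwise
 * (for example for a plain negative status such as -12986). */
int status_to_fourcc(int32_t status, char out[5]) {
    assert(out != NULL);
    uint32_t bits = (uint32_t)status;
    for (int i = 0; i < 4; i++) {
        /* Most significant byte first: that is the first character. */
        unsigned char c = (unsigned char)((bits >> (24 - 8 * i)) & 0xFFu);
        if (!isprint(c)) {
            out[0] = '\0';
            return 0;
        }
        out[i] = (char)c;
    }
    out[4] = '\0';
    return 1;
}
```

Called with 1852794999 this fills the buffer with "nohw". Called with 2003329396, the value that appears later in this thread, it yields "what", which as far as I know corresponds to kAudioSessionUnspecifiedError. A value like the -12986 from the audio unit log is not a four-character code, so the helper returns 0 for it.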

October 17, 2011 at 7:24 pm #7729

ksamurai

Okay, I can most certainly do that. I will post the results here as soon as I get a moment to go through it.

October 17, 2011 at 7:34 pm #7730

Halle

Awesome, danke.

October 17, 2011 at 8:02 pm #7731

ksamurai

Okay, so it looks like I found the main issue area:
If I comment out
UInt32 audioCategory = kAudioSessionCategory_PlayAndRecord; // Set the Audio Session category to kAudioSessionCategory_PlayAndRecord.
OSStatus audioCategoryStatus = AudioSessionSetProperty(kAudioSessionProperty_AudioCategory, sizeof(audioCategory), &audioCategory);
if (audioCategoryStatus != 0) {
    OpenEarsLog(@"Error %d: Unable to set audio category.", (int)audioCategoryStatus);
}

Then I get the error messages from the next two “if’s”, so I also have to comment them out:
UInt32 bluetoothInput = 1;
OSStatus bluetoothInputStatus = AudioSessionSetProperty(kAudioSessionProperty_OverrideCategoryEnableBluetoothInput, sizeof(bluetoothInput), &bluetoothInput);
if (bluetoothInputStatus != 0) {
    OpenEarsLog(@"Error %d: Unable to set bluetooth input.", (int)bluetoothInputStatus);
}

UInt32 overrideCategoryDefaultToSpeaker = 1; // Re-route sound output to the main speaker.
OSStatus overrideCategoryDefaultToSpeakerError = AudioSessionSetProperty(kAudioSessionProperty_OverrideCategoryDefaultToSpeaker, sizeof(overrideCategoryDefaultToSpeaker), &overrideCategoryDefaultToSpeaker);
if (overrideCategoryDefaultToSpeakerError != 0) {
    OpenEarsLog(@"Error %d: Unable to override the default speaker.", (int)overrideCategoryDefaultToSpeakerError);
}

After commenting all this out the following log is the result (see my next post) and sound plays.

October 17, 2011 at 8:02 pm #7732

ksamurai

2011-10-17 12:59:46.264 MyLanguages[2890:307] OPENEARSLOGGING: The audio session has never been initialized so we will do that now.
2011-10-17 12:59:46.301 MyLanguages[2890:307] OPENEARSLOGGING: There is no audio input available.
2011-10-17 12:59:46.313 MyLanguages[2890:307] OPENEARSLOGGING: AudioSessionManager startAudioSession has reached the end of the initialization.
2011-10-17 12:59:46.324 MyLanguages[2890:307] OPENEARSLOGGING: Exiting startAudioSession.

October 17, 2011 at 8:05 pm #7733

ksamurai

Here is the log when I play a sound:
2011-10-17 12:59:46.264 MyLanguages[2890:307] OPENEARSLOGGING: The audio session has never been initialized so we will do that now.
2011-10-17 12:59:46.301 MyLanguages[2890:307] OPENEARSLOGGING: There is no audio input available.
2011-10-17 12:59:46.313 MyLanguages[2890:307] OPENEARSLOGGING: AudioSessionManager startAudioSession has reached the end of the initialization.
2011-10-17 12:59:46.324 MyLanguages[2890:307] OPENEARSLOGGING: Exiting startAudioSession.
2011-10-17 13:00:02.964 MyLanguages[2890:307] OPENEARSLOGGING: I’m running flite
2011-10-17 13:00:03.127 MyLanguages[2890:307] OPENEARSLOGGING: I’m done running flite and it took 0.156835 seconds
2011-10-17 13:00:03.134 MyLanguages[2890:307] OPENEARSLOGGING: Flite audio player was nil when referenced so attempting to allocate a new audio player.
2011-10-17 13:00:03.141 MyLanguages[2890:307] OPENEARSLOGGING: Loading speech data for Flite concluded successfully.
2011-10-17 13:00:03.972 MyLanguages[2890:307] OPENEARSLOGGING: Flite sending suspend recognition notification.
2011-10-17 13:00:05.264 MyLanguages[2890:307] OPENEARSLOGGING: AVAudioPlayer did finish playing with success flag of 1
2011-10-17 13:00:05.424 MyLanguages[2890:307] OPENEARSLOGGING: Flite sending resume recognition notification.

October 17, 2011 at 8:19 pm #7734

Halle

OK, maybe it hates kAudioSessionCategory_PlayAndRecord because it can’t record. Can you uncomment everything and try replacing kAudioSessionCategory_PlayAndRecord with either of the following to see if you get the desired results (keeping in mind that this would prevent speech recognition on any device):

AVAudioSessionCategoryPlayback

or its equivalent

kAudioSessionCategory_MediaPlayback

October 17, 2011 at 8:29 pm #7735

ksamurai

Okay, I replaced it with kAudioSessionCategory_MediaPlayback. There are a couple of log errors, but I get sound:
2011-10-17 13:27:57.851 MyLanguages[2907:307] OPENEARSLOGGING: The audio session has never been initialized so we will do that now.
2011-10-17 13:27:57.873 MyLanguages[2907:307] OPENEARSLOGGING: Error 2003329396: Unable to set bluetooth input.
2011-10-17 13:27:57.893 MyLanguages[2907:307] OPENEARSLOGGING: Error 2003329396: Unable to override the default speaker.
2011-10-17 13:27:58.021 MyLanguages[2907:307] OPENEARSLOGGING: There is no audio input available.
2011-10-17 13:27:58.034 MyLanguages[2907:307] OPENEARSLOGGING: AudioSessionManager startAudioSession has reached the end of the initialization.
2011-10-17 13:27:58.046 MyLanguages[2907:307] OPENEARSLOGGING: Exiting startAudioSession.
2011-10-17 13:28:19.235 MyLanguages[2907:307] OPENEARSLOGGING: I’m running flite
2011-10-17 13:28:19.398 MyLanguages[2907:307] OPENEARSLOGGING: I’m done running flite and it took 0.155552 seconds
2011-10-17 13:28:19.404 MyLanguages[2907:307] OPENEARSLOGGING: Flite audio player was nil when referenced so attempting to allocate a new audio player.
2011-10-17 13:28:19.413 MyLanguages[2907:307] OPENEARSLOGGING: Loading speech data for Flite concluded successfully.
2011-10-17 13:28:20.248 MyLanguages[2907:307] OPENEARSLOGGING: Flite sending suspend recognition notification.
2011-10-17 13:28:21.665 MyLanguages[2907:307] OPENEARSLOGGING: AVAudioPlayer did finish playing with success flag of 1
2011-10-17 13:28:21.825 MyLanguages[2907:307] OPENEARSLOGGING: Flite sending resume recognition notification.
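
One reading of this log is that the two remaining errors come from applying the Bluetooth-input and default-to-speaker overrides on top of a playback-only category; those overrides are written for kAudioSessionCategory_PlayAndRecord in the code quoted earlier in the thread. The following is a minimal sketch in plain C of the decision this suggests: pick the category from whether audio input is available, and only request the overrides alongside the record-capable category. It is not the fix that shipped in OpenEars. SessionCategory, SessionPlan and plan_audio_session are hypothetical names for illustration, and the input_available flag is assumed to come from something like querying kAudioSessionProperty_AudioInputAvailable.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for illustration. In the real API they would map to
 * kAudioSessionCategory_PlayAndRecord and kAudioSessionCategory_MediaPlayback. */
typedef enum {
    CATEGORY_PLAY_AND_RECORD,
    CATEGORY_MEDIA_PLAYBACK
} SessionCategory;

/* What startAudioSession would be asked to set up (hypothetical struct). */
typedef struct {
    SessionCategory category;
    bool enable_bluetooth_input; /* ..._OverrideCategoryEnableBluetoothInput */
    bool default_to_speaker;     /* ..._OverrideCategoryDefaultToSpeaker */
} SessionPlan;

/* Choose a session setup based on whether the device has audio input.
 * With input: record-capable category plus both overrides, as in the
 * code quoted in the thread. Without input (e.g. an iPod Touch 2G with
 * no headset mic): playback-only category and no overrides, since both
 * overrides returned errors under MediaPlayback in the log above. */
SessionPlan plan_audio_session(bool input_available) {
    SessionPlan plan;
    if (input_available) {
        plan.category = CATEGORY_PLAY_AND_RECORD;
        plan.enable_bluetooth_input = true;
        plan.default_to_speaker = true;
    } else {
        plan.category = CATEGORY_MEDIA_PLAYBACK;
        plan.enable_bluetooth_input = false;
        plan.default_to_speaker = false;
    }
    return plan;
}
```

A caller would then make one AudioSessionSetProperty call per field that is set, so a device without a microphone never attempts the two overrides. As Halle notes above, the playback-only branch rules out speech recognition, so this only makes sense as a fallback for text-to-speech.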

October 17, 2011 at 8:30 pm #7736

ksamurai

If I try to use AVAudioSessionCategoryPlayback, the app does not even compile.

October 17, 2011 at 9:01 pm #7737

Halle

And how are your volume levels with speech when using kAudioSessionCategory_MediaPlayback?

October 17, 2011 at 9:06 pm #7738

ksamurai

It seems to be louder now. Thanks very much for all of your prompt replies and for all the help!

October 17, 2011 at 9:22 pm #7739

Halle

All righty, I will add that fix when I have time to test it more extensively. Thanks for the bug report and testing.

October 17, 2011 at 9:23 pm #7740

ksamurai

No problem!
