OE and Bluetooth HFP 8kHz Audio signal

This topic has 13 replies, 4 voices, and was last updated 7 years, 5 months ago by Halle Winkler.

Viewing 14 posts - 1 through 14 (of 14 total)

Author

Posts
July 31, 2016 at 12:18 am #1030758

tbrandt78
Participant

Hello Halle –

Thanks again for the great framework that you have introduced and continue to support for the iOS community. We built a hands-free iOS app using it for basic command recognition, and I have now been trying to hook it up to Bluetooth.

My experience with Bluetooth in my car is that the Bluetooth HFP profile uses an 8kHz connection for the microphone. I have found that OpenEars struggles to work with this low-fidelity audio signal. Further investigation of the code seems to suggest that OE wants at least a 16kHz audio input.
Is that true? Is there any way to make OE work well with the 8kHz connection? Or are there any tricks on iOS to get a wider-bandwidth connection that I may not be aware of? Thank you for any advice or additional information that you may be able to offer.

Thanks,
Tim

July 31, 2016 at 9:18 am #1030760

Halle Winkler
Politepix

Hi Tim,

Yes, I added a few audio session override properties in the last version for experimenting with improving bluetooth under these kinds of conditions. In the OEPocketsphinxController docs check out the new properties disablePreferredSampleRate, disablePreferredBufferSize, and disablePreferredChannelNumber to see if any one of them or a combination helps with your situation. I would recommend confirming that everything is working fine in a non-bluetooth implementation, then trying them one at a time, then trying them in groups of two, then all together, all with several repetitions to avoid fluke results, and document your results and choose the least-intrusive combination that helps (if any do). It is better to override these settings as little as possible.

August 1, 2016 at 6:45 pm #1030768

tbrandt78
Participant

Hi Halle –

Okay, I will be looking into this today. Thank you for the advice.
I am guessing that disablePreferredSampleRate might do the trick, as it removes the requirement for 16kHz audio input. I will see how it goes.

Thank You,
Tim

August 2, 2016 at 8:18 pm #1030771

tbrandt78
Participant

Hey Halle –

So, it seems that none of these settings will help my case, as far as I can tell. These settings all affect how OEContinuousAudioUnit::setAllAudioSessionSettings sets the AVAudioSession.

From what I can tell, disablePreferredSampleRate just stops OE from attempting to override the sample rate using this AVAudioSession property:
[AVAudioSession sharedInstance].preferredSampleRate

The issue that I have observed with automotive Bluetooth HFP systems is that the audio input all seems to be 8kHz. I have already tried to set preferredSampleRate to 16000, but it always results in the sample rate staying at 8kHz.

So… I really believe that the answer here is to ‘up-sample’ the input audio signal from 8kHz to 16kHz when needed, so that OpenEars can take it cleanly. Yes, the captured audio will still only have 8kHz worth of detail, but the sample rate will be 16kHz, which seems more compatible with OpenEars.

Thoughts?

August 2, 2016 at 9:24 pm #1030773

Halle Winkler
Politepix

What should happen, if everything on the device and Apple side is working according to its documented behavior, is that it will be upsampled automatically in the render callback buffer even if the preferred rate is unsettable or overridden. However, this will not add any information to the audio data (nothing will, and it might also be compressed when it first comes into the callback for a further information reduction), and IIRC it may also result in the overall volume of the buffered data being reduced. So, to a certain extent, I would say that the results may not be the preferred outcome but may be as expected – I’m afraid that ultimately the issue probably doesn’t originate in the engine but is just manifesting there. It is also not necessarily the case that the device and/or the audio API are working according to their documented behavior, which is why bluetooth is only supported experimentally. Sorry I can’t help out more with this.

Edit: however, since there could be a change in volume due to sampling rate change, please make sure that your not-great results aren’t due to a need to adjust the vadThreshold and/or Rejecto weighting (if used, and only after clarifying vadThreshold setting) in one direction or the other. Otherwise, definitely no other thoughts.
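
As a rough illustration of the point that upsampling adds no information, here is a minimal sketch in Python. It is not how iOS or OpenEars resamples (real resamplers use filtering); `upsample_2x_linear` is a hypothetical helper that simply inserts the midpoint between neighbouring samples. The output has twice as many samples, but every original sample is still there unchanged and the new ones are only averages of what was already captured.

```python
def upsample_2x_linear(samples):
    """Double the sample rate by inserting the midpoint between neighbours.

    Hypothetical helper for illustration only; real resamplers apply
    low-pass filtering rather than plain linear interpolation.
    """
    if not samples:
        return []
    out = []
    for current, following in zip(samples, samples[1:]):
        out.append(current)
        out.append((current + following) / 2)  # derived value, no new detail
    out.append(samples[-1])
    out.append(samples[-1])  # hold the last sample so the length is exactly 2x
    return out


narrowband = [0, 100, -50, 25]               # stand-in for an 8 kHz capture
wideband = upsample_2x_linear(narrowband)    # same audio at a 16 kHz rate

print(len(narrowband), len(wideband))
print(wideband[::2] == narrowband)           # original samples are untouched
```

The even-indexed samples of the result are exactly the input, which is why the recognizer sees a 16kHz stream that still only carries narrowband speech.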

August 3, 2016 at 7:21 pm #1030779

tbrandt78
Participant

Okay, thanks for the feedback.
At the moment, I am switching between the internal phone microphone input and Bluetooth speaker output. That seems to work reasonably well, but it is not ideal.

Basically, I stay on category AVAudioSessionCategoryPlayAndRecord and switch the input port back and forth between HFP and the internal microphone.

It seems to be the best way to get it to work.
Otherwise, OE struggles to recognize commands using the BT microphone.

November 9, 2016 at 8:19 am #1031272

dannychen
Participant

Hi, I have the same issue.
Everything works fine on builtInMic, but the accuracy drops sharply when I connect a Bluetooth headset.
I read the source code of pocketsphinx and OpenEars and did some tests.
My question is: if we are connected to a Bluetooth device with an 8k sample rate, should we add “-samprate 8000” when we init the decoder?

Here is the result of my test:
Input: 63 wav files with 9 grammar (8k sample rate).

case 1: without “-samprate 8000”
TOTAL Words: 63 Correct: 9 Errors: 54
TOTAL Percent correct = 14.29% Error = 85.71% Accuracy = 14.29%

case 2: with “-samprate 8000”
TOTAL Words: 63 Correct: 58 Errors: 5
TOTAL Percent correct = 92.06% Error = 7.94% Accuracy = 92.06%

Thank you.
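
The percentages in the report above follow directly from the word counts. A small Python check, where `percent` is a hypothetical helper name and the only inputs are the counts quoted in the post:

```python
def percent(part, total):
    """Return part/total as a percentage rounded to two decimals."""
    return round(100 * part / total, 2)


TOTAL_WORDS = 63

# case 1: without "-samprate 8000"
case1_correct = percent(9, TOTAL_WORDS)
case1_error = percent(54, TOTAL_WORDS)

# case 2: with "-samprate 8000"
case2_correct = percent(58, TOTAL_WORDS)
case2_error = percent(5, TOTAL_WORDS)

print(case1_correct, case1_error)
print(case2_correct, case2_error)
```

This reproduces 14.29% / 85.71% for case 1 and 92.06% / 7.94% for case 2.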

November 10, 2016 at 1:20 pm #1031282

Halle Winkler
Politepix

Interesting, I was under the impression that current Pocketsphinx didn’t use that mode at all any more. I’ll add an API hook for it in the meantime and consider longer term approaches to this problem. API support for 8kHz mode will be along in the next couple of days along with the build command fix.

November 15, 2016 at 6:08 pm #1031289

Halle Winkler
Politepix

OK, I’ve exposed an API for running in 8k mode in today’s update 2.504; you can set use8kMode in OEPocketsphinxController. In the longer term I’ll think about whether this can be handled automatically.

November 23, 2016 at 5:42 am #1031305

dannychen
Participant

It works for me with 8k sample rate bluetooth headset.
Thanks!

November 23, 2016 at 11:39 am #1031306

BergeracMatt
Participant

Hi Halle, Would you be able to extend the use8kMode to SaveThatWave? As it is, with use8kMode set, wav files are still being created at 16kHz.
Thanks
Matt
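
For anyone wanting to confirm the rate that a saved recording actually declares, the WAV header can be read off-device. A minimal sketch using Python's standard `wave` module; the helper names are hypothetical, and the in-memory file below is a stand-in for a real recording, not SaveThatWave output:

```python
import io
import wave


def wav_sample_rate(file_obj):
    """Return the sample rate declared in a WAV file's header."""
    with wave.open(file_obj, "rb") as wav:
        return wav.getframerate()


def make_silent_wav(sample_rate, frame_count=80):
    """Build a small mono 16-bit WAV in memory (stand-in for a recording)."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)            # 2 bytes per sample = 16-bit PCM
        wav.setframerate(sample_rate)
        wav.writeframes(b"\x00\x00" * frame_count)
    buffer.seek(0)                     # rewind so the file can be read back
    return buffer


print(wav_sample_rate(make_silent_wav(16000)))
print(wav_sample_rate(make_silent_wav(8000)))
```

With a real file, passing its path to `wave.open` in place of the in-memory buffer works the same way.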

November 23, 2016 at 11:54 am #1031308

Halle Winkler
Politepix

Hi Matt,

Sorry, it is a bluetooth compatibility mode accommodation only (and probably will not even stick around very long as an OpenEars API at all if I get a sense of how to automate the bluetooth compatibility accommodations without side-effects). Format conversion of SaveThatWave output isn’t an area I’m likely to get into since the number of potential output variations developers could make use of is large and SaveThatWave isn’t trying to be a general-purpose audio tool. However, this is not difficult for you to take on in the way that meets your needs the best as part of your own developer scope – the best high-level API to investigate starting out with is probably the AVAsset family.

November 23, 2016 at 1:12 pm #1031309

BergeracMatt
Participant

Thanks Halle for the quick reply, as always. I’ve already written the AVAsset routines to downsample the SaveThatWave output to 8k, but I was hoping to not need them.
I didn’t think it would be that big a deal for you to retain the OpenEars sampling rate through SaveThatWave – I didn’t think you would need to, or should, cater for any variations – simply prevent SaveThatWave from upsampling.
I don’t at all like the loss of fidelity with OpenEars sampling at 8kHz, SaveThatWave upsampling to 16kHz, then AVAsset downsampling to 8kHz.

My cloud recognition accuracy was extremely low when I was sending 16kHz wav files that contained 8kHz bluetooth audio. Recognition rates were better after downsampling the files to 8kHz, but nowhere near as good as when I record the audio myself at 8kHz with AVAudioSession.

Also – just in case you didn’t know, this 8kHz mode should not be applied to all bluetooth connections. Bluetooth HFP 1.6 supports “wideband audio” aka “HD audio”, which is sampled at 16kHz. HFP 1.6 was specced around 2012. If both sides support HFP 1.6 or 1.7, then a 16kHz recording should be used.
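
The rule described above can be sketched as a small decision function. This is only an illustration in Python, not OpenEars code: `should_use_8k_mode` and the constants are hypothetical names, and the assumption is that the app can read the sample rate the audio hardware actually delivers once the session is active (on iOS that would be the audio session's reported sample rate).

```python
NARROWBAND_HZ = 8000    # classic HFP microphone link
WIDEBAND_HZ = 16000     # HFP 1.6+ "wideband audio"


def should_use_8k_mode(hardware_sample_rate_hz):
    """Decide whether the 8k decoder mode is appropriate.

    Hypothetical rule: only fall back to 8k when the route really
    delivers less than 16 kHz, so wideband headsets keep the default.
    """
    return hardware_sample_rate_hz < WIDEBAND_HZ


for rate in (NARROWBAND_HZ, WIDEBAND_HZ, 44100):
    mode = "8k mode" if should_use_8k_mode(rate) else "default 16k mode"
    print(rate, "->", mode)
```

Keying the decision on the delivered rate rather than on "is this a Bluetooth route" is what keeps wideband HFP connections out of the 8k path.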

November 23, 2016 at 2:18 pm #1031310

Halle Winkler
Politepix

Hi Matt,

This sounds a bit like it might be a misdiagnosis of the drop in accuracy rate with your cloud service – it wouldn’t be obvious to me how a 16-bit/16k PCM format is removing that much needed data from 8k speech audio, so it might be a case for trying different approaches.

Be that as it may, Pocketsphinx’s 8k sample rate mode doesn’t make any changes to how OpenEars manages its audio buffers (or it wouldn’t have been possible to add right now) and SaveThatWave is ignorant of devices and Pocketsphinx by design, so trying to make SaveThatWave aware of Pocketsphinx runtime settings or the audio driver and branching its behavior isn’t a probable direction for development, sorry.

The topic ‘OE and Bluetooth HFP 8kHz Audio signal’ is closed to new replies.