Tutorials for the OpenEars Platform

This tutorial shows you how to use each feature of OpenEars and its plugins in your app. Work through the sections for the features you want and paste the code shown into your app. Remember, never test recognition on the Simulator, since it doesn't use the real OpenEars audio driver.


Preparing to use OpenEars

1

• Create your own app. Download the OpenEars distribution and unpack it.
• Inside your downloaded OpenEars distribution there is a folder called "Framework". Drag that folder into your app project in Xcode. Make absolutely sure that in the add dialog, "Create groups for any added folders" is selected and NOT "Create folder references for any added folders", and that "Copy items into destination group's folder (if needed)" is selected. The wrong settings here will prevent your app from working. Xcode has some quirks when it comes to adding framework search paths, so if you receive errors when building that header files can't be found, take a look at your build settings "Framework Search Path" and make sure that Xcode did not add any impossible entries there.
• Optional: to save space in your app binary, go to the build settings for your app target and search for the setting "Deployment Postprocessing" and set it to "Yes".
• Add the iOS frameworks AudioToolbox and AVFoundation to your app.

Preparing to use Plugins

1

First, download the demo versions of your plugin or plugins.
Then open the Build Settings tab of your app target, find the entry "Other Linker Flags", and add the linker flag "-ObjC".
Finally, drag your downloaded demo framework into your app project.

Using OELanguageModelGenerator

1

In offline speech recognition, you define the vocabulary that you want your app to be able to recognize. This is called a language model or grammar (you can read more about these options in the OELanguageModelGenerator documentation). A good vocabulary size for an offline speech recognition app on the iPhone, iPod or iPad is between 10 and 500 words. Add the following to your implementation (the .m file), with the other imports at the very top, above the @implementation line:
#import <OpenEars/OELanguageModelGenerator.h>
#import <OpenEars/OEAcousticModel.h>

2

In the method where you want to create your language model (for instance your viewDidLoad method), add the following method call (replacing the placeholders like "WORD" and "A PHRASE" with actual words and phrases you want to be able to recognize):

OELanguageModelGenerator *lmGenerator = [[OELanguageModelGenerator alloc] init];

NSArray *words = [NSArray arrayWithObjects:@"WORD", @"STATEMENT", @"OTHER WORD", @"A PHRASE", nil];
NSString *name = @"NameIWantForMyLanguageModelFiles";
NSError *err = [lmGenerator generateLanguageModelFromArray:words withFilesNamed:name forAcousticModelAtPath:[OEAcousticModel pathToModel:@"AcousticModelEnglish"]]; // Change "AcousticModelEnglish" to "AcousticModelSpanish" to create a Spanish language model instead of an English one.

NSString *lmPath = nil;
NSString *dicPath = nil;
	
if(err == nil) {
		
	lmPath = [lmGenerator pathToSuccessfullyGeneratedLanguageModelWithRequestedName:@"NameIWantForMyLanguageModelFiles"];
	dicPath = [lmGenerator pathToSuccessfullyGeneratedDictionaryWithRequestedName:@"NameIWantForMyLanguageModelFiles"];
		
} else {
	NSLog(@"Error: %@",[err localizedDescription]);
}
It is a requirement to enter your words and phrases in all capital letters, since the model is generated against a dictionary in which the entries are capitalized (meaning that if the words in the array aren't capitalized, they will not match the dictionary and you will not have the widest variety of pronunciations understood for the word you are using).
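Because the generator expects capitalized entries, it can help to normalize the vocabulary before passing it in. The following is a minimal sketch using only Foundation; the helper name UppercasedVocabulary is hypothetical and is not part of OpenEars.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (not part of OpenEars): returns a copy of the
// vocabulary with every entry trimmed and converted to capital letters.
// Entries that are empty after trimming are dropped.
static NSArray<NSString *> *UppercasedVocabulary(NSArray<NSString *> *words) {
    NSMutableArray<NSString *> *result = [NSMutableArray arrayWithCapacity:words.count];
    NSCharacterSet *whitespace = [NSCharacterSet whitespaceAndNewlineCharacterSet];
    for (NSString *word in words) {
        NSString *trimmed = [word stringByTrimmingCharactersInSet:whitespace];
        if (trimmed.length > 0) {
            [result addObject:[trimmed uppercaseString]];
        }
    }
    return [result copy];
}
```

With this in place, the words array from the step above could be built as UppercasedVocabulary(@[@"word", @"a phrase"]) and then handed to generateLanguageModelFromArray: unchanged.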

Using OEPocketsphinxController

1

To use OEPocketsphinxController, the class which performs speech recognition, you need a language model and a phonetic dictionary for it. These files define which words OEPocketsphinxController is capable of recognizing. We just created them above by using OELanguageModelGenerator. You also need an acoustic model. OpenEars ships with an English and a Spanish acoustic model.

First, add the following to your implementation (the .m file), with the other imports at the very top, above the @implementation line:
#import <OpenEars/OEPocketsphinxController.h>
#import <OpenEars/OEAcousticModel.h>

2

In the method where you want to recognize speech (to test this out, add it to your viewDidLoad method), add the following method call:
[[OEPocketsphinxController sharedInstance] setActive:TRUE error:nil];
[[OEPocketsphinxController sharedInstance] startListeningWithLanguageModelAtPath:lmPath dictionaryAtPath:dicPath acousticModelAtPath:[OEAcousticModel pathToModel:@"AcousticModelEnglish"] languageModelIsJSGF:NO]; // Change "AcousticModelEnglish" to "AcousticModelSpanish" to perform Spanish recognition instead of English.

Using OEFliteController

1

To use OEFliteController, you need to have at least one Flite voice added to your project. When you added the "Framework" folder of OpenEars to your app, you already imported a voice called Slt, so these instructions will use the Slt voice. You can get eight more free voices in OpenEarsExtras, available at https://bitbucket.org/Politepix/openearsextras

2

Add the following lines to your header (the .h file). Under the imports at the very top:
#import <Slt/Slt.h>
#import <OpenEars/OEFliteController.h>
Add these class properties to the other properties of your view controller or object:
@property (strong, nonatomic) OEFliteController *fliteController;
@property (strong, nonatomic) Slt *slt;

3

Add the following to your implementation (the .m file): Before you want to use TTS speech in your app, instantiate an OEFliteController and a voice as follows (perhaps in your view controller's viewDidLoad method):
		self.fliteController = [[OEFliteController alloc] init];
		self.slt = [[Slt alloc] init];

4

After having initialized your OEFliteController, add the following message in a method where you want to call speech:
[self.fliteController say:@"A short statement" withVoice:self.slt];

Using OEEventsObserver

1

OEEventsObserver is the class which keeps you continuously updated about the status of your listening session, among other things, via delegate callbacks. Add the following lines to your header (the .h file). Under the imports at the very top:
#import <OpenEars/OEEventsObserver.h>
At the @interface declaration, declare conformance to the OEEventsObserverDelegate protocol. An example of this for a view controller called ViewController would look like this:
@interface ViewController : UIViewController <OEEventsObserverDelegate> {
And add this property to your other class properties (OEEventsObserver must be a property of your class or it will not work):
@property (strong, nonatomic) OEEventsObserver *openEarsEventsObserver;

2

Add the following to your implementation (the .m file): Before you call a method of either OEFliteController or OEPocketsphinxController (perhaps in viewDidLoad), instantiate OEEventsObserver and set its delegate as follows:
self.openEarsEventsObserver = [[OEEventsObserver alloc] init];
[self.openEarsEventsObserver setDelegate:self];

3

Add these delegate methods of OEEventsObserver to your class, which is where you will receive information about received speech hypotheses and other speech UI events:
- (void) pocketsphinxDidReceiveHypothesis:(NSString *)hypothesis recognitionScore:(NSString *)recognitionScore utteranceID:(NSString *)utteranceID {
	NSLog(@"The received hypothesis is %@ with a score of %@ and an ID of %@", hypothesis, recognitionScore, utteranceID);
}

- (void) pocketsphinxDidStartListening {
	NSLog(@"Pocketsphinx is now listening.");
}

- (void) pocketsphinxDidDetectSpeech {
	NSLog(@"Pocketsphinx has detected speech.");
}

- (void) pocketsphinxDidDetectFinishedSpeech {
	NSLog(@"Pocketsphinx has detected a period of silence, concluding an utterance.");
}

- (void) pocketsphinxDidStopListening {
	NSLog(@"Pocketsphinx has stopped listening.");
}

- (void) pocketsphinxDidSuspendRecognition {
	NSLog(@"Pocketsphinx has suspended recognition.");
}

- (void) pocketsphinxDidResumeRecognition {
	NSLog(@"Pocketsphinx has resumed recognition."); 
}

- (void) pocketsphinxDidChangeLanguageModelToFile:(NSString *)newLanguageModelPathAsString andDictionary:(NSString *)newDictionaryPathAsString {
	NSLog(@"Pocketsphinx is now using the following language model: \n%@ and the following dictionary: %@",newLanguageModelPathAsString,newDictionaryPathAsString);
}

- (void) pocketSphinxContinuousSetupDidFailWithReason:(NSString *)reasonForFailure {
	NSLog(@"Listening setup wasn't successful and returned the failure reason: %@", reasonForFailure);
}

- (void) pocketSphinxContinuousTeardownDidFailWithReason:(NSString *)reasonForFailure {
	NSLog(@"Listening teardown wasn't successful and returned the failure reason: %@", reasonForFailure);
}

- (void) testRecognitionCompleted {
	NSLog(@"A test file that was submitted for recognition is now complete.");
}

Using OELanguageModelGenerator+RuleORama

1

First, find the line
#import <OpenEars/OELanguageModelGenerator.h>
in your app and add the following line right underneath it:
#import <RuleORamaDemo/OELanguageModelGenerator+RuleORama.h>
Next, change this line where you create a language model:
NSError *err = [lmGenerator generateLanguageModelFromArray:words withFilesNamed:name forAcousticModelAtPath:[OEAcousticModel pathToModel:@"AcousticModelEnglish"]];
to use this grammar and method instead:

NSDictionary *grammar = @{
     ThisWillBeSaidOnce : @[
         @{ OneOfTheseCanBeSaidOnce : @[@"HELLO COMPUTER", @"GREETINGS ROBOT"]},
         @{ OneOfTheseWillBeSaidOnce : @[@"DO THE FOLLOWING", @"INSTRUCTION"]},
         @{ OneOfTheseWillBeSaidOnce : @[@"GO", @"MOVE"]},
         @{ThisWillBeSaidOnce : @[
             @{ OneOfTheseWillBeSaidOnce : @[@"10", @"20",@"30"]}, 
             @{ OneOfTheseWillBeSaidOnce : @[@"LEFT", @"RIGHT", @"FORWARD"]}
         ]},
         @{ ThisCanBeSaidOnce : @[@"THANK YOU"]}
     ]
 };
 
    NSError *err = [lmGenerator generateFastGrammarFromDictionary:grammar withFilesNamed:name forAcousticModelAtPath:[OEAcousticModel pathToModel:@"AcousticModelEnglish"]];


This will allow you to recognize statements in accordance with this grammar, such as "HELLO COMPUTER DO THE FOLLOWING MOVE 10 LEFT THANK YOU" or "GREETINGS ROBOT INSTRUCTION MOVE 20 RIGHT", but it will not recognize individual words or words in orders outside of the grammar. Please note that unlike the JSGF output type in stock OpenEars, RuleORama doesn't support the rule types with optional repetitions. Rules defined with repetitions will be composed into a rule with a single repetition. You can learn much more about how grammars work in the OpenEars and RuleORama documentation.
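Since every hypothesis returned under this grammar follows a known word order, an app can pick out the parts it cares about with simple token matching. The following is a rough sketch using only Foundation; the helper name MoveCommandFromHypothesis and the dictionary keys "distance" and "direction" are assumptions made for illustration and are not part of OpenEars or RuleORama.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (not part of OpenEars or RuleORama): pulls the distance
// and direction out of a hypothesis that follows the example grammar.
// Returns nil when no "<distance> <direction>" pair is found.
static NSDictionary<NSString *, id> *MoveCommandFromHypothesis(NSString *hypothesis) {
    NSSet<NSString *> *distances = [NSSet setWithArray:@[@"10", @"20", @"30"]];
    NSSet<NSString *> *directions = [NSSet setWithArray:@[@"LEFT", @"RIGHT", @"FORWARD"]];
    NSArray<NSString *> *tokens = [hypothesis componentsSeparatedByString:@" "];
    // Look for a distance word immediately followed by a direction word.
    for (NSUInteger i = 0; i + 1 < tokens.count; i++) {
        if ([distances containsObject:tokens[i]] && [directions containsObject:tokens[i + 1]]) {
            return @{ @"distance" : @([tokens[i] integerValue]),
                      @"direction" : tokens[i + 1] };
        }
    }
    return nil;
}
```

A function like this could be called from the hypothesis delegate method of OEEventsObserver, with a nil result treated as "no command found".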

Using OELanguageModelGenerator+Rejecto

1

First, find the line
#import <OpenEars/OELanguageModelGenerator.h>
in your app and add the following line right underneath it:
#import <RejectoDemo/OELanguageModelGenerator+Rejecto.h>
Next, change this line where you create a language model:
NSError *err = [lmGenerator generateLanguageModelFromArray:words withFilesNamed:name forAcousticModelAtPath:[OEAcousticModel pathToModel:@"AcousticModelEnglish"]];
to use this method instead:

NSError *err = [lmGenerator generateRejectingLanguageModelFromArray:words
 withFilesNamed:name 
 withOptionalExclusions:nil
 usingVowelsOnly:FALSE 
 withWeight:nil 
 forAcousticModelAtPath:[OEAcousticModel pathToModel:@"AcousticModelEnglish"]]; // Change "AcousticModelEnglish" to "AcousticModelSpanish" to create a Spanish Rejecto model.

You will use the same array for languageModelArray (again, the words and phrases in your array must be in all-capital letters) and the same files name for fileName as you did with the old generateLanguageModelFromArray method. To get started you can use the value "nil" for optionalExclusions and weight and FALSE for vowelsOnly, since these parameters are there to help you refine your results and might not be needed. You can learn more about fine-tuning your results with those optional parameters in the Rejecto documentation.

Using OEPocketsphinxController+RapidEars

1

Like OEPocketsphinxController, which it extends, OEPocketsphinxController+RapidEars needs a language model created with OELanguageModelGenerator before it can be used. We have already completed that step above.

2

Add the following to your implementation (the .m file), with the other imports at the top, directly after the line #import <OpenEars/OEPocketsphinxController.h>:
#import <RapidEarsDemo/OEPocketsphinxController+RapidEars.h>
Next, comment out all calls in your app to the method
startListeningWithLanguageModelAtPath:dictionaryAtPath:acousticModelAtPath:languageModelIsJSGF:
and in the same part of your app where you were formerly using this method, place the following:

[[OEPocketsphinxController sharedInstance] startRealtimeListeningWithLanguageModelAtPath:lmPath dictionaryAtPath:dicPath acousticModelAtPath:[OEAcousticModel pathToModel:@"AcousticModelEnglish"]]; // Starts the rapid recognition loop. Change "AcousticModelEnglish" to "AcousticModelSpanish" in order to perform Spanish language recognition.

If you find that sometimes you are getting live recognition and other times not, make sure that you have definitely replaced all instances of startListeningWithLanguageModelAtPath: with startRealtimeListeningWithLanguageModelAtPath:.

Using OEEventsObserver+RapidEars

1

At the top of your header after the line
#import <OpenEars/OEEventsObserver.h>
Add the line
#import <RapidEarsDemo/OEEventsObserver+RapidEars.h>
And after this OEEventsObserver delegate method you added to your implementation when setting up your OpenEars app:
- (void) testRecognitionCompleted {
	NSLog(@"A test file that was submitted for recognition is now complete.");
}
Just add the following extended delegate methods:
- (void) rapidEarsDidReceiveLiveSpeechHypothesis:(NSString *)hypothesis recognitionScore:(NSString *)recognitionScore {
    NSLog(@"rapidEarsDidReceiveLiveSpeechHypothesis: %@",hypothesis);
}

- (void) rapidEarsDidReceiveFinishedSpeechHypothesis:(NSString *)hypothesis recognitionScore:(NSString *)recognitionScore {
    NSLog(@"rapidEarsDidReceiveFinishedSpeechHypothesis: %@",hypothesis);
}

Using OEFliteController+NeatSpeech

1

OEFliteController+NeatSpeech preconditions

In order to use NeatSpeech, as well as importing the framework into your OpenEars-enabled project, it is also necessary to import the voices and voice data files by dragging the "Voice" folder in the disk image into your app project (once your app is working you can remove the elements you don't need in order to have a small app binary size).

Very important: when you drag in the voices and framework folders, make sure that in Xcode's "Add" dialog, "Create groups for any added folders" is selected. Make sure that "Create folder references for any added folders" is not selected or your app will not work.

For the last step, change the name of the implementation source file in which you are going to call NeatSpeech methods from .m to .mm (for instance, if the implementation is named ViewController.m, change its name to ViewController.mm and verify in the Finder that the name of the file has changed) and then make sure that in your target Build Settings, under the section "C++ Standard Library", the setting "libstdc++ (Gnu C++ standard library)" is selected.

If you receive errors like "Undefined symbols for architecture i386: std::basic_ios<char, std::char_traits<char> >::widen(char) const", that means that this step needs special attention.

2

OEFliteController+NeatSpeech implementation

OEFliteController+NeatSpeech simply replaces OEFliteController's voice type with the advanced NeatSpeech voice type, and it replaces OEFliteController's say:withVoice: method with NeatSpeech's sayWithNeatSpeech:withVoice: method.

In your header replace this:

#import <Slt/Slt.h>
#import <OpenEars/OEFliteController.h>
with this:
#import <Emma/Emma.h>
#import <OpenEars/OEFliteController.h>
#import <NeatSpeechDemo/OEFliteController+NeatSpeech.h>
and replace this:
Slt *slt;
with this:
Emma *emma;
and replace this:
@property (strong, nonatomic) Slt *slt;
with this:
@property (strong, nonatomic) Emma *emma;
In your implementation, replace this:
	self.slt = [[Slt alloc] init];
with this:
self.emma = [[Emma alloc]initWithPitch:0.0 speed:0.0 transform:0.0];
and replace this:
[self.fliteController say:@"A short statement" withVoice:self.slt];
with this:
[self.fliteController sayWithNeatSpeech:@"Alice was getting very tired of sitting beside her sister on the bank, and having nothing to do: once or twice she had peeped into the book her sister was reading, but it had no pictures or conversations in it, and what is the use of a book, thought Alice, without pictures or conversations?" withVoice:self.emma];
And replace any other calls to say:withVoice: with sayWithNeatSpeech:withVoice:. Once this is definitely working you can remove the Slt or other Flite voice frameworks from your app to reduce app size. You can replace references to the Emma framework and object with any of the other voices to try them out.


The available voice frameworks you'll find in the Voices folder in the distribution are as follows:

Emma (US English, female)
EmmaAlternate (US English, female)
William (US English, male)
WilliamAlternate (US English, male)
Beatrice (UK English, female)
Elliott (UK English, male)
Daniel (Castilian Spanish, male)
Martina (Castilian Spanish, female)
Mateo (Latin American Spanish, male)
Valeria (Latin American Spanish, female)
You can also change the speaking speed, pitch of the voice, and inflection of each voice using the voice's initializer arguments speed, pitch and transform respectively. As an example, to initialize the Emma voice with a higher pitch you could use the following initialization:
Emma *emma = [[Emma alloc]initWithPitch:0.2 speed:0.0 transform:0.0];

Once you know how your project is to be configured you can remove the unused voices in order to make your app binary size as small as possible.

You can pass the sayWithNeatSpeech:withVoice: method as much data as you want at a time. It will process the speech in phases in the background and return it for playback once it is ready. This means that you should rarely experience long pauses while waiting for synthesis, even for very long paragraphs. Very long statements need to include pause indicators such as periods, exclamation points, question marks, commas, colons, semicolons, etc.

To interrupt ongoing speech while it is in progress, send the message [self.fliteController stopSpeaking];. This will not interrupt speech instantaneously but halt it at the next available opportunity.

Using OEEventsObserver+SaveThatWave

1

At the top of your header after the line
#import <OpenEars/OEEventsObserver.h>
Add the line
#import <SaveThatWaveDemo/OEEventsObserver+SaveThatWave.h>
And after this OEEventsObserver delegate method you added to your implementation when setting up your OpenEars app:
- (void) testRecognitionCompleted {
	NSLog(@"A test file that was submitted for recognition is now complete.");
}
Just add the following extended delegate method:
- (void) wavWasSavedAtLocation:(NSString *)location {
    NSLog(@"WAV was saved at the path %@", location);
    
}
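
Inside wavWasSavedAtLocation: it can be useful to confirm that the reported file really exists and has content before using it. The following is a minimal sketch using only Foundation; the helper name SizeOfFileAtPath is hypothetical and is not part of SaveThatWave.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (not part of SaveThatWave): returns the size in bytes of
// the file at the given path, or -1 if the path is empty, missing or unreadable.
static long long SizeOfFileAtPath(NSString *path) {
    if (path.length == 0) {
        return -1;
    }
    NSError *error = nil;
    NSDictionary<NSFileAttributeKey, id> *attributes =
        [[NSFileManager defaultManager] attributesOfItemAtPath:path error:&error];
    if (attributes == nil) {
        NSLog(@"Could not read %@: %@", path, [error localizedDescription]);
        return -1;
    }
    return (long long)[attributes fileSize];
}
```

In the delegate method you could then log SizeOfFileAtPath(location) alongside the path and skip further processing when the result is -1.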

Using SaveThatWaveController

1

Add the following lines to your header (the .h file). Under the imports at the very top:
#import <SaveThatWaveDemo/SaveThatWaveController.h>
Add this property to the other properties of your class (SaveThatWaveController has to be a property or it will not work):
@property (strong, nonatomic) SaveThatWaveController *saveThatWaveController;

2

Add the following to your implementation (the .m file): Under the @implementation keyword at the top, add this line:
@synthesize saveThatWaveController;
Before you start performing speech recognition, perhaps in your viewDidLoad method, instantiate the SaveThatWaveController property:
self.saveThatWaveController = [[SaveThatWaveController alloc] init];
Then, after this line for OpenEars:
[[OEPocketsphinxController sharedInstance] startListeningWithLanguageModelAtPath:lmPath dictionaryAtPath:dicPath acousticModelAtPath:[OEAcousticModel pathToModel:@"AcousticModelEnglish"] languageModelIsJSGF:NO];
or this line for RapidEars:
[[OEPocketsphinxController sharedInstance] startRealtimeListeningWithLanguageModelAtPath:lmPath dictionaryAtPath:dicPath acousticModelAtPath:[OEAcousticModel pathToModel:@"AcousticModelEnglish"]];
You can add the line:
[self.saveThatWaveController start]; // For saving WAVs from OpenEars or RapidEars