Swift Tutorials for the OpenEars® Platform

These tutorials show you how to use the features of OpenEars and its plugins in your app. Remember, never test recognition on the Simulator, since it doesn't use the real OpenEars audio driver. If you encounter any issues, first turn on OELogging to view your full logging output.

Preparing to use OpenEars

1

• This tutorial requires Xcode 8.0 or later. Previous versions have bugs when adding Frameworks and aren't supported.

• Create your own app. Download the OpenEars distribution and unpack it.

• Inside your downloaded OpenEars distribution there is a folder called "Framework". Drag that folder into your app project in Xcode. Make absolutely sure that in the add dialog, "Create groups for any added folders" is selected and NOT "Create folder references for any added folders", and that "Copy items into destination group's folder (if needed)" is selected. The wrong settings here will prevent your app from working.

• Next, navigate to Xcode's Build Settings for your target and find the setting "Framework Search Paths". Double check that the Framework folder was added to it and that the path doesn't look peculiar (for instance, extraneous quotes or backslashes). It ought to be fine, but if it isn't, correct any observable peculiarities here, or add your path to your added Framework folder (non-recursive) if for some reason it is missing altogether. Repeat this check for any added Flite or Neatspeech voices (only applies if you are using those tutorial sections).

• Optional: to save space in your app binary, go to the build settings for your app target and search for the setting "Deployment Postprocessing" and set it to "Yes".

• Add the iOS frameworks AudioToolbox and AVFoundation to your app.


• Create your bridging header. With your project open, add a new file (File->New) of type "Header File". Name it OpenEarsHeader.h and make sure it is added to your app target. Add the following to it:

#ifndef OpenEarsHeader_h
#define OpenEarsHeader_h

#import <OpenEars/OEPocketsphinxController.h>
#import <OpenEars/OELanguageModelGenerator.h>
#import <OpenEars/OEFliteController.h>
#import <Slt/Slt.h> // Only needed if using Flite speech
#import <OpenEars/OEEventsObserver.h>
#import <OpenEars/OELogging.h>
#import <OpenEars/OEAcousticModel.h>

#endif /* OpenEarsHeader_h */

Then, in your build settings, find the setting “Objective-C Bridging Header” and enter the Finder path to this new header file. Your project will now be able to import OpenEars classes into your Swift source.
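
As a quick sanity check (a sketch, not one of the original steps), once the bridging header is configured you should be able to reference an OpenEars class from any Swift file in the app target; the constant name below is arbitrary:

let testGenerator = OELanguageModelGenerator() // If this line compiles, the bridging header and search paths are set up correctly.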

Preparing to use Plugins

1

• This tutorial requires Xcode 8.0 or later. Previous versions have bugs when adding Frameworks and aren't supported.

• First, download the demo versions of your plugin or plugins.
• Then open up the Build Settings tab of the target app project, find the entry "Other Linker Flags", and add the linker flag "-ObjC".

• Inside your downloaded demo distribution there is a folder called "framework". Drag that folder into your app project in Xcode. Make absolutely sure that in the add dialog, "Create groups for any added folders" is selected and NOT "Create folder references for any added folders", and that "Copy items into destination group's folder (if needed)" is selected. The wrong settings here will prevent your app from working.

• Next, navigate to Xcode's Build Settings for your target and find the setting "Framework Search Paths". Double check that the framework folder for your downloaded demo was added to it and that the path doesn't look peculiar (for instance, extraneous quotes or backslashes). It ought to be fine, but if it isn't, correct any observable peculiarities here, or add your path to your added framework folder (non-recursive) if for some reason it is missing altogether. Repeat this check for any added Neatspeech voices (only applies if you are using those tutorial sections).

Using OELanguageModelGenerator

1

In offline speech recognition, you define the vocabulary that you want your app to be able to recognize. This is called a language model or grammar (you can read more about these options in the OELanguageModelGenerator documentation). A good vocabulary size for an offline speech recognition app on the iPhone, iPod or iPad is between 10 and 1000 words. This is an example of a language model; examples of a grammar can be found in the documentation.

2

In the method where you want to create your language model (for instance your viewDidLoad method), add the following method call (replacing the placeholders like "word" and "A PHRASE" with actual words and phrases you want to be able to recognize):


let lmGenerator = OELanguageModelGenerator()
let words = ["word", "Statement", "other word", "A PHRASE"] // These can be lowercase, uppercase, or mixed-case.
let name = "NameIWantForMyLanguageModelFiles"
let err: Error! = lmGenerator.generateLanguageModel(from: words, withFilesNamed: name, forAcousticModelAtPath: OEAcousticModel.path(toModel: "AcousticModelEnglish"))

if err != nil {
	print("Error while creating initial language model: \(err!)")
} else {
	let lmPath = lmGenerator.pathToSuccessfullyGeneratedLanguageModel(withRequestedName: name) // Convenience method to reference the path of a language model known to have been created successfully.
	let dicPath = lmGenerator.pathToSuccessfullyGeneratedDictionary(withRequestedName: name) // Convenience method to reference the path of a dictionary known to have been created successfully.
}

Using OEPocketsphinxController

1

To use OEPocketsphinxController, the class which performs speech recognition, you need a language model and a phonetic dictionary for it. These files define which words OEPocketsphinxController is capable of recognizing. We just created them above using OELanguageModelGenerator. You also need an acoustic model. OpenEars ships with an English acoustic model, and several other acoustic models for different languages are available for download.

2

In the method where you want to recognize speech (to test this out, add it to your viewDidLoad method, inside of the success case for your language model generation), add the following method call:
// OELogging.startOpenEarsLogging() //Uncomment to receive full OpenEars logging in case of any unexpected results.
do {
	try OEPocketsphinxController.sharedInstance().setActive(true) // Setting the shared OEPocketsphinxController active is necessary before any of its properties are accessed.
} catch {
	print("Error: it wasn't possible to set the shared instance to active: \"\(error)\"")
}

OEPocketsphinxController.sharedInstance().startListeningWithLanguageModel(atPath: lmPath, dictionaryAtPath: dicPath, acousticModelAtPath: OEAcousticModel.path(toModel: "AcousticModelEnglish"), languageModelIsJSGF: false)
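
Once listening has started you can pause and restart recognition, or shut the loop down entirely. Below is a minimal sketch using the suspendRecognition, resumeRecognition and stopListening methods that the OEEventsObserver callbacks further down refer to; this is an illustration rather than a required step, so check the OEPocketsphinxController header for the exact Swift signatures:

OEPocketsphinxController.sharedInstance().suspendRecognition() // Stay in the listening loop but stop reacting to speech.
// ... do something that shouldn't trigger recognition, for example play back audio ...
OEPocketsphinxController.sharedInstance().resumeRecognition() // Start reacting to speech again.
_ = OEPocketsphinxController.sharedInstance().stopListening() // Exit the recognition loop entirely (any return value is discarded here).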


Using OEFliteController

1

To use OEFliteController, you need to have at least one Flite voice added to your project. When you added the "Framework" folder of OpenEars to your app, you already imported a voice called Slt, so these instructions will use the Slt voice.

2

Add to your view controller class:
var fliteController = OEFliteController()
var slt = Slt()

3

This is how we will perform text-to-speech within your app.

4

After having initialized your OEFliteController, add the following call to your viewDidLoad function:
self.fliteController.say("A short statement.", with: self.slt)

Using OEEventsObserver

1

OEEventsObserver is the class which keeps you continuously updated about the status of your listening session, among other things, via delegate callbacks. Add the OEEventsObserverDelegate protocol to your class declaration at the top of your file. For instance, if your class is called ViewController and its class declaration looks like this:
class ViewController: UIViewController {
change it to look like this:
class ViewController: UIViewController, OEEventsObserverDelegate {
At the beginning of your (for instance, ViewController) class, add the following var:
var openEarsEventsObserver = OEEventsObserver()

2

Before you call a method of either OEFliteController or OEPocketsphinxController (perhaps in viewDidLoad), set the delegate of your OEEventsObserver instance as follows:
self.openEarsEventsObserver.delegate = self

3

Add these delegate methods of OEEventsObserver to your class; this is where you will receive information about received speech hypotheses and other speech UI events (there are additional callbacks of this type you can add – check the docs). Once these are added, you should be able to build and run and see local speech recognition results logged in the console (keeping in mind you will need to run in a session in which you've given the app mic permission on the device):
    func pocketsphinxDidReceiveHypothesis(_ hypothesis: String!, recognitionScore: String!, utteranceID: String!) { // Something was heard
        print("Local callback: The received hypothesis is \(hypothesis!) with a score of \(recognitionScore!) and an ID of \(utteranceID!)") 
    }
       
    // An optional delegate method of OEEventsObserver which informs that the Pocketsphinx recognition loop has entered its actual loop.
    // This might be useful in debugging a conflict between another sound class and Pocketsphinx.
    func pocketsphinxRecognitionLoopDidStart() {
        print("Local callback: Pocketsphinx started.") // Log it.
    }
    
    // An optional delegate method of OEEventsObserver which informs that Pocketsphinx is now listening for speech.
    func pocketsphinxDidStartListening() {
        print("Local callback: Pocketsphinx is now listening.") // Log it.
    }
    
    // An optional delegate method of OEEventsObserver which informs that Pocketsphinx detected speech and is starting to process it.
    func pocketsphinxDidDetectSpeech() {
        print("Local callback: Pocketsphinx has detected speech.") // Log it.
    }
    
    // An optional delegate method of OEEventsObserver which informs that Pocketsphinx detected a second of silence, indicating the end of an utterance. 
    func pocketsphinxDidDetectFinishedSpeech() {
        print("Local callback: Pocketsphinx has detected a second of silence, concluding an utterance.") // Log it.
    }
    
    // An optional delegate method of OEEventsObserver which informs that Pocketsphinx has exited its recognition loop, most 
    // likely in response to the OEPocketsphinxController being told to stop listening via the stopListening method.
    func pocketsphinxDidStopListening() {
        print("Local callback: Pocketsphinx has stopped listening.") // Log it.
    }
    
    // An optional delegate method of OEEventsObserver which informs that Pocketsphinx is still in its listening loop but it is not
    // going to react to speech until listening is resumed.  This can happen as a result of Flite speech being
    // in progress on an audio route that doesn't support simultaneous Flite speech and Pocketsphinx recognition,
    // or as a result of the OEPocketsphinxController being told to suspend recognition via the suspendRecognition method.
    func pocketsphinxDidSuspendRecognition() {
        print("Local callback: Pocketsphinx has suspended recognition.") // Log it.
    }
    
    // An optional delegate method of OEEventsObserver which informs that Pocketsphinx is still in its listening loop and after recognition
    // having been suspended it is now resuming.  This can happen as a result of Flite speech completing
    // on an audio route that doesn't support simultaneous Flite speech and Pocketsphinx recognition,
    // or as a result of the OEPocketsphinxController being told to resume recognition via the resumeRecognition method.
    func pocketsphinxDidResumeRecognition() {
        print("Local callback: Pocketsphinx has resumed recognition.") // Log it.
    }
    
    // An optional delegate method which informs that Pocketsphinx switched over to a new language model at the given URL in the course of
    // recognition. This does not imply that it is a valid file or that recognition will be successful using the file.
    func pocketsphinxDidChangeLanguageModel(toFile newLanguageModelPathAsString: String!, andDictionary newDictionaryPathAsString: String!) {
        
        print("Local callback: Pocketsphinx is now using the following language model: \n\(newLanguageModelPathAsString!) and the following dictionary: \(newDictionaryPathAsString!)")
    }
    
    // An optional delegate method of OEEventsObserver which informs that Flite is speaking, most likely to be useful if debugging a
    // complex interaction between sound classes. You don't have to do anything yourself in order to prevent Pocketsphinx from listening to Flite talk and trying to recognize the speech.
    func fliteDidStartSpeaking() {
        print("Local callback: Flite has started speaking") // Log it.
    }
    
    // An optional delegate method of OEEventsObserver which informs that Flite is finished speaking, most likely to be useful if debugging a
    // complex interaction between sound classes.
    func fliteDidFinishSpeaking() {
        print("Local callback: Flite has finished speaking") // Log it.
    }
    
    func pocketSphinxContinuousSetupDidFail(withReason reasonForFailure: String!) { // This can let you know that something went wrong with the recognition loop startup. Turn on OELogging.startOpenEarsLogging() to learn why.
        print("Local callback: Setting up the continuous recognition loop has failed for the reason \(reasonForFailure), please turn on OELogging.startOpenEarsLogging() to learn more.") // Log it.
    }
    
    func pocketSphinxContinuousTeardownDidFail(withReason reasonForFailure: String!) { // This can let you know that something went wrong with the recognition loop teardown. Turn on OELogging.startOpenEarsLogging() to learn why.
        print("Local callback: Tearing down the continuous recognition loop has failed for the reason \(reasonForFailure)") // Log it.
    }
    
    /** Pocketsphinx couldn't start because it has no mic permissions (will only be returned on iOS7 or later).*/
    func pocketsphinxFailedNoMicPermissions() {
        print("Local callback: The user has never set mic permissions or denied permission to this app's mic, so listening will not start.")
    }
    
    /** The user prompt to get mic permissions, or a check of the mic permissions, has completed with a true or a false result (will only be returned on iOS7 or later).*/
    func micPermissionCheckCompleted(withResult: Bool) {
        print("Local callback: mic check completed.")
    }
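
As an optional usage sketch (not one of the original steps), you could extend the pocketsphinxDidReceiveHypothesis callback you just added so that it speaks each hypothesis back using the fliteController and slt you declared in the OEFliteController section, replacing the version above:

    func pocketsphinxDidReceiveHypothesis(_ hypothesis: String!, recognitionScore: String!, utteranceID: String!) { // Something was heard
        print("Local callback: The received hypothesis is \(hypothesis!) with a score of \(recognitionScore!) and an ID of \(utteranceID!)")
        self.fliteController.say("You said \(hypothesis!)", with: self.slt) // Speak the recognized phrase back to the user.
    }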

Using OELanguageModelGenerator+RuleORama

1

First, find the line
#import <OpenEars/OELanguageModelGenerator.h>
in your bridging header OpenEarsHeader.h and add the following line right underneath it:
#import <RuleORamaDemo/OELanguageModelGenerator+RuleORama.h>
Next, change this line where you create a language model:
let err: Error! = lmGenerator.generateLanguageModel(from: words, withFilesNamed: name, forAcousticModelAtPath: OEAcousticModel.path(toModel: "AcousticModelEnglish"))
to use this grammar and method instead:

let grammar = [
	ThisWillBeSaidOnce : [
		[ OneOfTheseCanBeSaidOnce : ["HELLO COMPUTER", "GREETINGS ROBOT"]],
		[ OneOfTheseWillBeSaidOnce : ["DO THE FOLLOWING", "INSTRUCTION"]],
		[ OneOfTheseWillBeSaidOnce : ["GO", "MOVE"]],
		[ThisWillBeSaidOnce : [
			[ OneOfTheseWillBeSaidOnce : ["10", "20","30"]], 
			[ OneOfTheseWillBeSaidOnce : ["LEFT", "RIGHT", "FORWARD"]]
			]],
		[ ThisCanBeSaidOnce : ["THANK YOU"]]
	]
]
 
 let err: Error! = lmGenerator.generateFastGrammar(from: grammar, withFilesNamed: name, forAcousticModelAtPath: OEAcousticModel.path(toModel: "AcousticModelEnglish"))
 
and change this line:
	let lmPath = lmGenerator.pathToSuccessfullyGeneratedLanguageModel(withRequestedName: name)
to this:
let lmPath = lmGenerator.pathToSuccessfullyGeneratedRuleORamaRuleset(withRequestedName: name)
This will allow you to recognize statements in accordance with this grammar, such as: HELLO COMPUTER DO THE FOLLOWING MOVE 10 LEFT THANK YOU or GREETINGS ROBOT INSTRUCTION MOVE 20 RIGHT but it will not recognize individual words or words in orders outside of the grammar. Please note that unlike the JSGF output type in stock OpenEars, RuleORama doesn't support the rule types with optional repetitions. Rules defined with repetitions will be composed into a rule with a single repetition. You can learn much more about how grammars work in OpenEars and RuleORama here.
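
To make the grammar structure easier to see, here is a second, smaller grammar sketched with the same keys; the phrases are placeholders to adapt to your own app:

let lightGrammar = [
	ThisWillBeSaidOnce : [
		[ OneOfTheseWillBeSaidOnce : ["TURN ON", "TURN OFF"]],
		[ OneOfTheseWillBeSaidOnce : ["THE LIGHT", "THE FAN"]],
		[ ThisCanBeSaidOnce : ["PLEASE"]]
	]
]

Passed to generateFastGrammar(from:withFilesNamed:forAcousticModelAtPath:) in the same way as above, this would match statements such as TURN ON THE LIGHT PLEASE or TURN OFF THE FAN.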

Using OELanguageModelGenerator+Rejecto

1

First, find the line
#import <OpenEars/OELanguageModelGenerator.h>
in your bridging header and add the following line right underneath it:
#import <RejectoDemo/OELanguageModelGenerator+Rejecto.h>
Next, change this line where you create a language model:
let err: Error! = lmGenerator.generateLanguageModel(from: words, withFilesNamed: name, forAcousticModelAtPath: OEAcousticModel.path(toModel: "AcousticModelEnglish"))
to use this method instead:

 let err: Error! = lmGenerator.generateRejectingLanguageModel(from: words, withFilesNamed: name, withOptionalExclusions: nil, usingVowelsOnly: false, withWeight: 0.0, forAcousticModelAtPath: OEAcousticModel.path(toModel: "AcousticModelEnglish"))

You will use the same array for languageModelArray and the same file name for fileName as you did with the generateLanguageModel(from:) method above, and to get started you can use the neutral values shown here (nil for optionalExclusions, false for vowelsOnly and 0.0 for weight), since those parameters are there to help you refine your results and might not be needed. You can learn more about fine-tuning your results with those optional parameters in the Rejecto documentation.

Using OEPocketsphinxController+RapidEars

1

Like OEPocketsphinxController, which it extends, OEPocketsphinxController+RapidEars needs a language model created with OELanguageModelGenerator before it can be used. We have already completed that step above.

2

Add the following to your bridging header, after the line #import <OpenEars/OEPocketsphinxController.h>:
#import <RapidEarsDemo/OEPocketsphinxController+RapidEars.h>
Next, comment out all calls in your app to the method
OEPocketsphinxController.sharedInstance().startListeningWithLanguageModel(atPath: lmPath, dictionaryAtPath: dicPath, acousticModelAtPath: OEAcousticModel.path(toModel: "AcousticModelEnglish"), languageModelIsJSGF: false)
and in the same part of your app where you were formerly using this method, place the following:

OEPocketsphinxController.sharedInstance().startRealtimeListeningWithLanguageModel(atPath: lmPath, dictionaryAtPath: dicPath, acousticModelAtPath: OEAcousticModel.path(toModel: "AcousticModelEnglish"))

If you find that sometimes you are getting live recognition and other times not, make sure that you have definitely replaced all instances of startListeningWithLanguageModelAtPath: with startRealtimeListeningWithLanguageModelAtPath:.

Using OEEventsObserver+RapidEars

1

At the top of your bridging header after the line
#import <OpenEars/OEEventsObserver.h>
Add the line
#import <RapidEarsDemo/OEEventsObserver+RapidEars.h>
And where you have adopted the OEEventsObserverDelegate protocol like this:
class ViewController: UIViewController, OEEventsObserverDelegate {
add the OEEventsObserverRapidEarsDelegate protocol like this:
class ViewController: UIViewController, OEEventsObserverDelegate, OEEventsObserverRapidEarsDelegate {
And after this OEEventsObserver delegate function you added to your implementation when setting up your OpenEars app:
    func pocketsphinxDidReceiveHypothesis(_ hypothesis: String!, recognitionScore: String!, utteranceID: String!) { // Something was heard
        print("Local callback: The received hypothesis is \(hypothesis!) with a score of \(recognitionScore!) and an ID of \(utteranceID!)") 
    }
Just add the following extended delegate methods:
    func rapidEarsDidReceiveLiveSpeechHypothesis(_ hypothesis: String!, recognitionScore:String!) {
        print("rapidEarsDidReceiveLiveSpeechHypothesis: \(hypothesis!)")
    }
    
    func rapidEarsDidReceiveFinishedSpeechHypothesis(_ hypothesis: String!, recognitionScore:String!) {
        print("rapidEarsDidReceiveFinishedSpeechHypothesis: \(hypothesis!)")
    }
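
Since live hypotheses can arrive many times per second, a common pattern is to surface them in your UI. The following variation on the live-hypothesis callback is only a sketch and assumes a hypothetical UILabel outlet named liveHypothesisLabel; UI updates are dispatched to the main thread:

    func rapidEarsDidReceiveLiveSpeechHypothesis(_ hypothesis: String!, recognitionScore: String!) {
        print("rapidEarsDidReceiveLiveSpeechHypothesis: \(hypothesis!)")
        DispatchQueue.main.async {
            self.liveHypothesisLabel.text = hypothesis // Show the in-progress hypothesis in the hypothetical label.
        }
    }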

Using OEFliteController+NeatSpeech

1

OEFliteController+NeatSpeech preconditions

In order to use NeatSpeech, as well as importing the framework into your OpenEars-enabled project, it is also necessary to import the voices and voice data files by dragging the "Voices" folder in the disk image into your app project (once your app is working you can read more here about how to remove the elements you don't need in order to have a small app binary size).

Very important: when you drag in the voices and framework folders, make sure that in Xcode's "Add" dialog, "Create groups for any added folders" is selected. Make sure that "Create folder references for any added folders" is not selected or your app will not work.

For the last step, go to the Build Phases pane and under "Link Binary With Libraries", select the library "libc++.tbd". If you receive errors like "Undefined symbols for architecture i386: std::basic_ios<char, std::char_traits<char> >::widen(char) const", that means that this step needs special attention.

2

OEFliteController+NeatSpeech implementation

OEFliteController+NeatSpeech simply replaces OEFliteController's voice type with the advanced NeatSpeech voice type.

In your bridging header replace this:

#import <Slt/Slt.h>
#import <OpenEars/OEFliteController.h>
with this:
#import <Beatrice/Beatrice.h>
#import <OpenEars/OEFliteController.h>
#import <NeatSpeechDemo/OEFliteController+NeatSpeech.h>
in your view controller replace this:
var slt = Slt()
with this:
var beatrice = Beatrice.init(pitch: 0.0, speed: 0.0, transform: 0.0)

and replace this:
fliteController.say("a statement", with: slt)
with this:
fliteController.say(withNeatSpeech: "Alice was getting very tired of sitting beside her sister on the bank, and having nothing to do: once or twice she had peeped into the book her sister was reading, but it had no pictures or conversations in it, and what is the use of a book, thought Alice, without pictures or conversations?", with: beatrice)
And replace any other calls to say(_:with:) with say(withNeatSpeech:with:). Once this is definitely working you can remove the Slt or other Flite voice frameworks from your app to reduce app size. You can replace references to the Beatrice framework and object with any of the other voices to try them out.


The available voice frameworks you'll find in the Voices folder in the distribution are as follows:

Emma (US English, female)
EmmaAlternate (US English, female)
William (US English, male)
WilliamAlternate (US English, male)
Beatrice (UK English, female)
Elliott (UK English, male)
Daniel (Castilian Spanish, male)
Martina (Castilian Spanish, female)
Mateo (Latin American Spanish, male)
Valeria (Latin American Spanish, female)
You can also change the speaking speed, pitch of the voice, and inflection of each voice using the voice's initializer arguments speed, pitch and transform respectively. As an example, to initialize the Beatrice voice with a higher pitch you could use the following initialization: var beatrice = Beatrice.init(pitch: 0.2, speed: 0.0, transform: 0.0)

Once you know how your project is to be configured you can remove the unused voices following these instructions in order to make your app binary size as small as possible.

You can pass the say(withNeatSpeech:with:) function as much data as you want at a time. It will process the speech in phases in the background and return it for playback once it is ready. This means that you should rarely experience long pauses while waiting for synthesis, even for very long paragraphs. Very long statements need to include pause indicators such as periods, exclamation points, question marks, commas, colons, semicolons, etc.

To interrupt ongoing speech while it is in progress, call fliteController.stopSpeaking(). This will not interrupt speech instantaneously but will halt it at the next available opportunity.
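
For example (a sketch rather than a required step; the action name is hypothetical and assumes a button wired up in your storyboard), you might expose this to the user as a button:

    @IBAction func stopSpeakingButtonPressed(_ sender: Any) {
        self.fliteController.stopSpeaking() // Halts playback at the next available opportunity.
    }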

Using OEEventsObserver+SaveThatWave

1

At the top of your bridging header OpenEarsHeader.h after the line
#import <OpenEars/OEEventsObserver.h>
Add the line
#import <SaveThatWaveDemo/OEEventsObserver+SaveThatWave.h>
After the OEEventsObserver delegate functions you added to your implementation when setting up your OpenEars app, add the following extended delegate method:
    func wavWasSaved(atLocation location: String!) {
     print("wav was saved at location \(location!)")   
    }
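
If you want to do more with the recorded file than log its location, you could extend the wavWasSaved(atLocation:) callback above, for instance by keeping your own copy with FileManager. This is a sketch rather than one of the original steps, and the destination filename is hypothetical:

    func wavWasSaved(atLocation location: String!) {
        print("wav was saved at location \(location!)")
        let destinationPath = NSTemporaryDirectory() + "latestUtterance.wav" // Hypothetical destination for the copy.
        try? FileManager.default.removeItem(atPath: destinationPath) // Remove any previous copy first.
        try? FileManager.default.copyItem(atPath: location, toPath: destinationPath) // Keep our own copy of the saved WAV.
    }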

Using SaveThatWaveController

1

Add the following line to your bridging header, under the imports at the very top:
#import <SaveThatWaveDemo/SaveThatWaveController.h>

2

Add the following to your Swift file. In your view controller class declaration:
var saveThatWaveController = SaveThatWaveController()
Then, after whichever line you start listening on, you can add the line:
self.saveThatWaveController.start() // For saving WAVs from OpenEars or RapidEars