Halle Winkler

Forum Replies Created

Viewing 100 posts - 1,801 through 1,900 (of 2,166 total)

    in reply to: Won't Recognize Q, CUE, or QUEUE #1015716
    Halle Winkler
    Politepix

    Also turn on OpenEarsLogging and verboseCMUCLMTK so you get any relevant output from the process of generating the language models.

    in reply to: Won't Recognize Q, CUE, or QUEUE #1015715
    Halle Winkler
    Politepix

    Yes, step one is definitely making sure that these new words are present in your language model and phonetic dictionary. Also, turn on verbosePocketsphinx so you receive any complaints from pocketsphinx about your language model or dictionary.
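    As a sketch of turning on these logging switches (OpenEarsLogging and verboseCMUCLMTK from the previous reply, plus verbosePocketsphinx; the receiver objects and exact casing are my assumption, so verify them against the current OpenEars documentation):

    [OpenEarsLogging startOpenEarsLogging]; // general OpenEars logging
    self.languageModelGenerator.verboseCMUCLMTK = TRUE; // language model generation output
    self.pocketsphinxController.verbosePocketsphinx = TRUE; // recognition engine output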

    in reply to: Won't Recognize Q, CUE, or QUEUE #1015712
    Halle Winkler
    Politepix

    Recognition of several individual words that are only a syllable long and all rhyme with each other is not a satisfactorily solved problem in speech recognition. It is another variation of the general problem of recognizing the English alphabet, which, unfortunately, people can be seen trying to find fixes for in every speech-recognition-related resource. In the case you’re describing there is no contextual cue for which one is the “real one”, so as soon as there is any distance from the mic, the sounds are going to get mixed up.

    The strategy for dealing with it is going to be some combination of removing confusing words from the model and fusing multiple words together that you know will be spoken together.

    An example is that you don’t need the loose letter “U” if its presence there is just in order to let “U.S.” be recognized. In that case, make the word “U.S.”:

    U.S. Y UW AH S

    This will also improve the accuracy of words that are spoken near utterances of “U.S.”.

    The next issue I see is that the “Q1” etc segment has a couple of obscure words before and after it, which suggests to me that this is a big language model. Do you have the opportunity to switch between smaller, more contextually-specific language models?

    Can you do counting in either its own language model that you switch to, or with some kind of prefix? e.g. “Category 2” instead of just “2”.

    The last thing is that you haven’t shown the entry in the language model or the pocketsphinx logging output, so I don’t know for sure whether your alteration is actually in your language model as far as pocketsphinx is concerned. If you remove “U” and “2”, are you able to recognize “Q1”? If not, there might be an issue in the language model in general.

    In case you have confirmed that the language model is OK, and none of these approaches are options for you (although they are almost always options for an app that you can make design decisions about), the last possibility is to do it as a JSGF ruleset rather than a statistical ARPA model. Searching this forum for JSGF should help you get started.

    in reply to: Won't Recognize Q, CUE, or QUEUE #1015708
    Halle Winkler
    Politepix
    in reply to: Won't Recognize Q, CUE, or QUEUE #1015707
    Halle Winkler
    Politepix

    Hmm, single letters that rhyme with other single letters are very challenging for recognition.

    Since you already know the number of required quarters, something sneaky you can try is to have “Q1” etc. be the entire word. That is, instead of trying to recognize the combination of Q and 1, you will have a word “Q1” in your language model, and you’ll edit the dictionary used so that the entry for Q1 reads as follows:

    Q1   K Y UW W AH N
    

    You’d do this for each quarter. Having the multiple syllable/sound combinations available for distinguishing between the quarters should make them recognizable.
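    Extending that idea to all four quarters, the hand-edited dictionary entries might look like this (a sketch: the Q1 entry is from above, and the phonemes for “two”, “three”, and “four” are my assumption based on the standard CMU English phone set, so verify them against the entries in your phonetic dictionary):

    Q1   K Y UW W AH N
    Q2   K Y UW T UW
    Q3   K Y UW TH R IY
    Q4   K Y UW F AO R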

    Halle Winkler
    Politepix

    Hi Dalee,

    Nope, that won’t work for an .epub. Ebook reading is an implementation question that is part of how you design and construct your app. OpenEars is a speech API so it doesn’t try to address app implementation questions like the best way to process ebooks into readable NSStrings. I’m sure that searching Stack Overflow for the keywords that are part of your implementation question will result in lots of useful information about where to get started.

    Halle Winkler
    Politepix

    Welcome Dalee,

    You can’t pass in a document, but the code to convert a text file into an NSString is just 1-2 lines; here’s an example:

    http://stackoverflow.com/questions/2171341/how-to-get-the-contents-of-a-text-file-stored-locally-in-the-documents-director
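    For reference, a minimal sketch of that conversion, assuming a UTF-8 text file already in the app’s Documents directory (the file name is just an illustration):

    NSString *documentsDirectory = [NSSearchPathForDirectoriesInDomains(NSDocumentDirectory, NSUserDomainMask, YES) objectAtIndex:0];
    NSString *filePath = [documentsDirectory stringByAppendingPathComponent:@"book.txt"]; // hypothetical file name
    NSError *error = nil;
    NSString *contents = [NSString stringWithContentsOfFile:filePath encoding:NSUTF8StringEncoding error:&error];
    if (!contents) NSLog(@"Couldn't read file: %@", error);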

    Halle Winkler
    Politepix

    I wanted to mention that in order to look into this issue, it is still important to receive an example that occurs naturally in your app, since that is needed in order to design an appropriate fix and to assign the fix a priority. So far there has never been a bug report of this occurring “in the wild”, because it was designed to be an improbable event, so it would be good to get a real example of how it occurred in your app in a form that prevents the pause token from being used, in order to understand the first case of it appearing as an issue.

    If your app reads different sources, just let me know about a source which leads to this issue — you can also inform me by email in order to keep it private.

    Halle Winkler
    Politepix

    Hi Hitoshi,

    Your English is great and I know how it is to have to speak with subtlety in a second language, since I have to use a second language frequently and also worry that I sound too brusque when I am making requests.

    I will definitely take your suggestion on board and consider the best way to integrate it in a future version, thank you for your suggestions. The most likely solution to the long unpunctuated speech question will just be to force a split on long unpunctuated text streams, since the lack of punctuation means there will be no contextual cues and the location of the split can be arbitrary (meaning: without any punctuation, we don’t know where the writer of the sentence meant to have clauses or emphasis, so we can split it anywhere because there is no better option). I’m not sure if I want to change the API to accomplish this goal but I will consider what you said.

    To give you a preview of how I would be likely to do the long sentence splitting: I will probably implement an NSScanner to count incidents of whitespace between words, pick an arbitrary value that counts as “too many” spaces without any intervening punctuation or pauses, and insert a pause token before sending the text to synthesis. So you could also use this as your workaround right now if you need this immediately.
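    A sketch of that workaround as you could write it today (using componentsSeparatedByCharactersInSet: rather than NSScanner for brevity; kPauseToken is a placeholder that you should replace with the actual pause token string from the NeatSpeech documentation, and the 20-word threshold is arbitrary):

    static NSString * const kPauseToken = @"YOUR_PAUSE_TOKEN"; // placeholder, see the NeatSpeech docs

    static NSString *insertPauses(NSString *text) {
        NSCharacterSet *punctuation = [NSCharacterSet characterSetWithCharactersInString:@".,;:!?"];
        NSArray *words = [text componentsSeparatedByCharactersInSet:[NSCharacterSet whitespaceAndNewlineCharacterSet]];
        NSMutableArray *output = [NSMutableArray array];
        NSInteger wordsSincePause = 0;
        for (NSString *word in words) {
            if ([word length] == 0) continue; // skip runs of whitespace
            [output addObject:word];
            if ([word rangeOfCharacterFromSet:punctuation].location != NSNotFound) {
                wordsSincePause = 0; // existing punctuation already causes a split
            } else if (++wordsSincePause >= 20) {
                [output addObject:kPauseToken]; // force a split at an arbitrary point
                wordsSincePause = 0;
            }
        }
        return [output componentsJoinedByString:@" "];
    }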

    Halle Winkler
    Politepix

    Understood, but I’ve also recommended a couple of other approaches — you could keep using partials but use the higher-quality algorithm which will return fewer times (but be more accurate) by setting:

    [self.pocketsphinxController setFasterPartials:FALSE];
    [self.pocketsphinxController setFasterFinals:FALSE];

    Those will still return partials, so that isn’t the OpenEars-style approach.

    Or you could just ignore partials that don’t match the utterance you are trying to detect, and short-circuit your listening when you receive a partial that does match it. There’s no requirement to display every partial to the user or run a method based on every one; you can also just check in the callback to see if it is the matching hypothesis and only react to the first one that is.

    My other suggestion was to compare an incoming hypothesis to the previous one and only display/invoke new methods when it no longer matches the hypothesis that came before, i.e. throwing out any repeated hypotheses for logical purposes.

    So there are a few ways that you can get the results you are looking for — it’s really a question of what is the best approach for your application goals.

    Halle Winkler
    Politepix

    That’s what I would expect – first the first word in the utterance is spoken, then the second, and the hypothesis grows along with the number of words spoken. Your utterance starts with a single word and then a second is added, so the hypothesis matches that.

    Something you could try if you want fewer callbacks is to experiment with the following settings in your initialization code for the PocketsphinxController+RapidEars object:

    To use a slightly-less “live” method you can use these settings:

    [self.pocketsphinxController setFasterPartials:FALSE];
    [self.pocketsphinxController setFasterFinals:FALSE];

    Alternately, you can ignore partial hypotheses (like your initial ones that just say “WORD” once) and wait for the final hypotheses in the rapidEarsDidDetectFinishedSpeechAsWordArray callback, but request them faster using these settings:

    [self.pocketsphinxController setFinalizeHypothesis:TRUE];
    [self.pocketsphinxController setFasterFinals:TRUE];

    Halle Winkler
    Politepix

    A hypothesis should appear as “word word” if they say “word word” and just as “word” if they just say “word”. So in that case, I think you’d be fine just suspending listening when you catch and react to the hypothesis you are waiting for the first time. Would that fit your requirement?

    Halle Winkler
    Politepix

    RapidEars works a bit differently than stock OpenEars and needs a slightly different approach to programming. Stock OpenEars waits until speech is complete and returns a single hypothesis, so you can assume a one-to-one relationship between receiving a hypothesis and your programmatic reaction to it. RapidEars processes continuously once it has started to detect speech, meaning that it will keep reasserting the present hypothesis until it changes or the utterance ends, because it is continuously re-scoring the hypothesis (seeing if it becomes more confident in a different hypothesis or in the current one, seeing if more words are spoken after the current hypothesis, etc.).

    You can work with this style in a couple of ways. One way is just to react to your keyword the first time you catch it and not worry about the rest of the hypotheses, i.e. short-circuit the listening process when you get a match. Another way (I think this is closer to your request) would be to call a “keywordDetected:” method in the callback, but only call it for a new word (meaning you store the hypothesis and only forward the method if the new hypothesis doesn’t match the stored hypothesis, i.e. it is a different hypothesis). Does that make sense?
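    A sketch of that second approach; the surrounding callback is whichever RapidEars partial-hypothesis delegate method you have implemented, and lastHypothesis and keywordDetected: are hypothetical members you would add to your own class:

    // In your RapidEars partial-hypothesis callback:
    if (![hypothesis isEqualToString:self.lastHypothesis]) { // only react to a changed hypothesis
        self.lastHypothesis = hypothesis;
        if ([hypothesis rangeOfString:@"KEYWORD"].location != NSNotFound) { // KEYWORD is a placeholder
            [self keywordDetected:hypothesis]; // your own method
        }
    }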

    in reply to: Application size is 30mb after implementation #1015655
    Halle Winkler
    Politepix

    Welcome,

    Sure, check out the FAQ: https://www.politepix.com/openears/support

    Halle Winkler
    Politepix

    Hi Hitoshi,

    Can you give an example of a real sentence from your app that you need to say which encounters this limit and in which it is not possible to place any commas, periods, or the pause token? The sentence consisting of >20 repetitions of the letter w in a row doesn’t look like something that occurs in a real app interaction, but if I’m not correct about that, you could perform it in your app without a problem by programmatically placing a comma or a pause token between the middle two “w”s, or between all of them. Unfortunately it isn’t possible to return anything from say:withNeatSpeech:usingVoice: because it is an asynchronous method, and the duration of the utterance is only known after synthesizing it.

    The size of the maximum unpunctuated utterance was intentionally chosen based on the fact that it is several times larger than real sentence clauses in English. You can see this in the case of your choice of “w”, which contains the syllables of two complete words (‘double’ and ‘you’). In order for a sentence clause to occur which needed to render as many syllables as your test sentence, it would need to be an unpunctuated clause containing ~52 words. I’m not aware of a clause like this. But the pause token was added to the API specifically so that you would never need to have your users hear speech that is cut off, since you can programmatically insert it into long text that lacks punctuation.

    I’m not opposed to examining this in the long term, but I’d want to start with a real usage case that is creating an issue for someone in their app.

    in reply to: T2S breaks video/audio capturing in progress #1015638
    Halle Winkler
    Politepix

    Super, happy to hear it was so easy!

    in reply to: T2S breaks video/audio capturing in progress #1015636
    Halle Winkler
    Politepix

    Got it, OK. TTS is the conventional term (wanted to let you know so that if you need to ask questions elsewhere, folks will immediately know what it’s about). You’re correct, FliteController is reasserting a PlayAndRecord audio session, and you’re right that it only needs playback if you have no use for speech recognition. What you can try is to change the audio session that AudioSessionManager asserts, then recompile the framework and see if it helps. You’ll make the change in AudioSessionManager.m. If you look around line 271 of that file you’ll see a commented-out list of audio session types; you’ll basically want to do a search and replace for kAudioSessionCategory_PlayAndRecord, replacing it with the other audio session types you want to try as the session that FliteController reasserts before its playback. You could also comment out invocations of AudioSessionManager in the framework and see if it just does the right thing without any interaction with the audio session.
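    As an illustration of the kind of replacement meant here (kAudioSessionCategory_MediaPlayback is a real Audio Session Services category; the variable name is a guess at what you’ll find in AudioSessionManager.m, and whether this category coexists with your capture session is exactly what you’d be testing):

    // Before: the category OpenEars asserts for simultaneous recognition and playback:
    // UInt32 sessionCategory = kAudioSessionCategory_PlayAndRecord;
    // After: a playback-only category to try when speech recognition isn't needed:
    UInt32 sessionCategory = kAudioSessionCategory_MediaPlayback;
    AudioSessionSetProperty(kAudioSessionProperty_AudioCategory, sizeof(sessionCategory), &sessionCategory);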

    in reply to: T2S breaks video/audio capturing in progress #1015634
    Halle Winkler
    Politepix

    Welcome,

    Is T2S referring to OpenEars’ text to speech, or is that another framework you are also using?

    in reply to: (NeatSpeech) the question about the speaking speed #1015625
    Halle Winkler
    Politepix

    Hello Hitoshi,

    It is dependent on encapsulated details of the voice (it is specific to each voice) and it is also subject to change, so I can’t easily answer this question with a formula. If you’re curious about this for a specific voice, I think the full logging might give you enough information that you could time the playback and find out.

    Halle Winkler
    Politepix

    Hello Hitoshi,

    This isn’t a bug; there is a maximum length possible for a single utterance with no punctuation, in order to prevent unacceptable memory overhead. The maximum length is longer than unpunctuated sentences ever are in English. Just continue to use punctuation, or add the pause token that is described in the documentation.

    in reply to: NeatSpeech Model #1015617
    Halle Winkler
    Politepix

    Yes, it is a required part of the framework.

    in reply to: When will a Wave file start saving? #1015562
    Halle Winkler
    Politepix

    If you are worried that a recognition might sometimes take less time than a file write (I’m not sure I would worry about this in practice), you can keep an int property that is the number of WAV writeouts and another that is the number of returned hypotheses and do something special when a hyp returns and the number of hyps is larger than the number of WAV writeouts. It is also a good way to test whether I’m right.
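    A sketch of that bookkeeping (the two int properties are hypothetical additions to your own class; wire the increments into whichever SaveThatWave and hypothesis callbacks you are using):

    // Hypothetical properties on your class:
    // @property (nonatomic) int wavWriteoutCount;
    // @property (nonatomic) int hypothesisCount;

    // In the SaveThatWave "wav was saved" callback:
    self.wavWriteoutCount++;

    // In the hypothesis callback:
    self.hypothesisCount++;
    if (self.hypothesisCount > self.wavWriteoutCount) {
        // The hypothesis arrived before its WAV finished writing; defer pairing
        // it with a file until the next writeout callback arrives.
        NSLog(@"Hypothesis %d returned before its WAV was written.", self.hypothesisCount);
    }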

    in reply to: When will a Wave file start saving? #1015560
    Halle Winkler
    Politepix

    That would definitely be my expectation. Since this is the first time I’ve heard of this use, it would be great if you could let me know if that expectation is correct, but I’d be pretty surprised if it weren’t.

    in reply to: When will a Wave file start saving? #1015558
    Halle Winkler
    Politepix

    If so, then if that date is earlier than the hypothesis callback timestamp, I can safely suppose that they’re paired (the wave file with a given hypothesis)?

    I would expect this assumption to be correct. It is difficult to imagine a circumstance under which a hypothesis will return before the wav is able to be written out. But you won’t always have a matching hypothesis for a WAV because not every hypothesis has content. You can decide what you’d like to do about those cases since it will vary from project to project.

    Should I stop my saveThatWaveController on every detectFinishedSpeech/stopListening/suspendRecognition, or will it be stopped automatically (so that wavWasSaved will be invoked in every case)?

    You should just leave it running; it will only do something when it has speech to save.

    Sorry for so many questions, but wavWasSaved really holds no information about the content in it.

    The only thing it knows about is the timestamp, since it isn’t bound to recognition. That is a positive thing, because it means you can run it with recognition turned off in order to use OpenEars as a voice activity detector, with n-best recognition on (many hypotheses for a single utterance, some of which are null), or when a null hypothesis is silently returned, as happens for the majority of hypotheses when using Rejecto. Although SaveThatWave works well with hypotheses, it saves all detected speech, not all non-null hypotheses.

    in reply to: PocketSphinx stops listening after I change language file. #1015517
    Halle Winkler
    Politepix

    Which name are you referring to? The file name that you pass to LanguageModelGenerator when you are requesting the generated model or something else?

    in reply to: PocketSphinx stops listening after I change language file. #1015512
    Halle Winkler
    Politepix

    Hi Giebler,

    It’s a requirement that language models have unique names, not really a bug, but I’ll mention it in the docs for clarity.

    in reply to: PocketSphinx stops listening after I change language file. #1015498
    Halle Winkler
    Politepix

    The first step is turning on verbosePocketsphinx so you can see any recognition errors or warnings in the console. Is it an ARPA model or a JSGF grammar? You do need to give your models unique names.

    in reply to: Soft-start Engine #1015476
    Halle Winkler
    Politepix

    Sort of — these are all things that happen when the engine is started (calibration, listening, language model switching), so they aren’t responsible for starting it. Switching language models is something you can do while listening is in progress so the impression that it starts listening comes from the context in which you are preventing entry into the listening loop.

    I think what you’re seeing is that the overall listening method is recursive, so events which return it to the top of the loop will end-run your method of preventing recognition. I think the startup time is just a second or so, are you seeing significantly longer waits to start?

    in reply to: Soft-start Engine #1015474
    Halle Winkler
    Politepix

    Yup, for speech recognition the optimal environment is always as quiet as possible, since background noise will either occlude the speech or cause an attempt to recognize it. So if the users are in the car and they are just using the built-in phone mic, it’s a good suggestion for them to turn off the radio. The important thing about calibration is that it is done on an environment that matches the speech environment, meaning that if the user is going to talk over the radio even if you suggest that they not do that, you want the radio on during calibration because silence in that case means “the user isn’t talking but there is quieter radio noise running in the background”.

    in reply to: Soft-start Engine #1015472
    Halle Winkler
    Politepix

    Welcome,

    This is not actually advisable, because the lag is the voice activity detection checking the noise levels in the room and calibrating itself to distinguish between silence and speech in the current conditions before the user starts speaking. If this is done at some arbitrary time before the user is just about to talk, the calibration isn’t being performed for the environment which exists in the timeframe in which the user is speaking. This will lead to error-prone recognition.

    in reply to: Optimizing open ears for single word recognition #1015467
    Halle Winkler
    Politepix

    Looks that way to me. I haven’t tested that code so you’re a bit on your own with it, but those look like constants so I expect that you’d want to uncomment the one corresponding to the recording type you want to use.

    in reply to: Any callback on sayWithNeatSpeech; completition? #1015465
    Halle Winkler
    Politepix

    OK, I will enter it as a feature request.

    in reply to: Any callback on sayWithNeatSpeech; completition? #1015463
    Halle Winkler
    Politepix

    You should be able to catch the end of a NeatSpeech utterance using the standard OpenEars OpenEarsEventsObserver delegate method – (void) fliteDidFinishSpeaking, is that not working for you? Receiving the actual string is not available.
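    A minimal sketch of catching it (OpenEarsEventsObserver and fliteDidFinishSpeaking are the standard OpenEars names mentioned above; the surrounding class setup is abbreviated):

    // In setup, after creating your OpenEarsEventsObserver instance:
    [self.openEarsEventsObserver setDelegate:self];

    // OpenEarsEventsObserverDelegate method, also fired when a NeatSpeech utterance ends:
    - (void) fliteDidFinishSpeaking {
        NSLog(@"Finished speaking."); // react here; the spoken string itself isn't passed back
    }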

    in reply to: Optimizing open ears for single word recognition #1015457
    Halle Winkler
    Politepix

    OK, yes the steps described in that post should help you with insufficient sensitivity, although please keep the downside in mind — if you increase the sensitivity you will also have more incidental noises triggering recognition (chirping birds, etc). Sometimes sensitivity issues are related to doing testing in a single environment and if you optimize for increasing or decreasing sensitivity based on one environment you will see a decline in performance on the other end of the spectrum (developers usually like to work in quiet environments but users like to do speech recognition in noisy ones). Just wanted to mention that issue in advance.

    When you make a change to the framework, you need to clean and build the framework project (OpenEars.xcodeproj). Then the new framework should be picked up in your app the next time you clean and build it. You can test this by selecting the framework file in your app project, choosing “view in finder”, and when it takes you to the file in the Finder, doing “get info” to see if its last-modified date is the time that you built the framework project.

    in reply to: Optimizing open ears for single word recognition #1015454
    Halle Winkler
    Politepix

    You can’t adjust the sensitivity of the voice activity detection, sorry. It sounds like it is too sensitive, is that correct? If the issue is that the users are speaking too quietly or from too far away for recognition to be accurate when it is triggered, this is generally something that is best addressed with user education: “MyApp works best if you speak clearly from no more than $DISTANCE away”. If the issue is that utterances are triggering recognition that are not user speech that relates to the app, Rejecto was designed to give a null result under that circumstance rather than a wrong hypothesis. If I have it backward and the issue is insufficient mic sensitivity, you could experiment with the suggestions in this thread:

    https://www.politepix.com/forums/topic/add-mode-options-next-version/

    in reply to: Any recommendation on Hypothesis scoring limits? #1015441
    Halle Winkler
    Politepix

    Hello,

    Here are the previous discussions:

    https://www.politepix.com/forums/topic/how-does-recognition-score-works/
    https://www.politepix.com/forums/topic/scores-used-in-openears/

    You can’t use fixed score ranges as limits because you don’t know what the environmental factors are, but something you can experiment with is detecting cases where the 1-best and 2-best scores are close while the overall range is large, i.e. a 1-best of -50000 and a 2-best of -50500 suggests to me that it’s a close call. But you can’t say “below -50000 is wrong”.
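    A sketch of that heuristic (this assumes an n-best output format of an array of dictionaries with “Hypothesis” and “Score” keys, and the 2% gap is an arbitrary starting point to experiment with, not a recommendation):

    BOOL isCloseCall(NSArray *nBestArray) {
        if ([nBestArray count] < 2) return NO;
        float best = [[nBestArray[0] objectForKey:@"Score"] floatValue];   // e.g. -50000
        float second = [[nBestArray[1] objectForKey:@"Score"] floatValue]; // e.g. -50500
        // A small gap relative to the magnitude of the scores suggests the
        // engine couldn't cleanly separate the top two hypotheses.
        return fabsf(best - second) < 0.02f * fabsf(best);
    }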

    Halle Winkler
    Politepix

    Glad it’s working for you.

    Halle Winkler
    Politepix

    Welcome,

    This is happening because you dragged in the voice frameworks, but you didn’t drag in the folder called VoiceData that is also in the demo disk image. So the frameworks are added to your project but the actual voice data is not (those are the models that it is complaining about not having). I can work on returning a more informative error under this circumstance.

    When you drag in the VoiceData folder from the disk image, make sure that in the add dialog “Create groups for any added folders” is selected and NOT “Create folder references for any added folders” so that the VoiceData folder models are added at the right location in your bundle, and clean your project before rebuilding to reduce the likelihood of Xcode being spooky. Let me know if that helps.

    in reply to: Pause playback #1015421
    Halle Winkler
    Politepix

    Cool, glad to hear it.

    in reply to: Pause playback #1015419
    Halle Winkler
    Politepix

    Hello,

    This isn’t a feature of FliteController, but it could be added to stock OpenEars pretty easily by adding a new method to pause/unpause FliteController’s AVAudioPlayer if it is playing. In NeatSpeech your option is to send a request to stop, but this will stop at the earliest opportunity to finish an utterance rather than pausing immediately.

    in reply to: Dynamic Grammar Generation #1015412
    Halle Winkler
    Politepix

    Sorry about the tag removal, I haven’t yet figured out a way to let people paste their whole JSGF grammar without also allowing arbitrary HTML (which is a security issue).

    Now my question is whether it’s better to supply the generateLanguageModelFromArray method with a list of the single words that comprise the sentences and let OpenEars/RapidEars figure out the sentences or should I give it a list of all the possible sentences?

    I would give the entire sentences; this will cause LanguageModelGenerator to give increased probability to the sentence word sequence.

    Is there a way of obtaining something similar to what I’m doing with JSGF using the LanguageModelGenerator?

    Not exactly since JSGF and ARPA models address two very different design issues. Unfortunately RapidEars doesn’t currently support JSGF, but lately there has been a lot more usage of JSGF in OpenEars (I think this is because performance on the device has improved to the extent that the performance hit for using JSGF isn’t as arduous as it used to be) so I will give adding it to RapidEars some thought, even though JSGF will definitely be less rapid.

    One thing that occurs to me that you can try (and I guess this is how I would attempt to solve this) would be to change the probabilities in your language model by hand: raise them for bigrams and trigrams which represent the target sentences, and lower them for unigrams that represent “loose” words. You’ll need to use the LanguageModelGenerator-produced .arpa file as your language model rather than the .DMP, and open it up in a text editor. I would start by only changing the probability of the trigrams (the probability is the value at the start of the line). The probability runs from a negative number to zero, where zero is neutral and less than zero is a lowered probability.
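    For orientation, the relevant parts of an .arpa file look something like the following (the words and values are invented for illustration; the leading number on each line is the log10 probability you would raise for trigrams representing target sentences and lower for loose unigrams):

    \1-grams:
    -2.9 TURN -0.3
    -2.9 VOLUME -0.3

    \3-grams:
    -0.3 TURN VOLUME UP

    \end\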

    in reply to: #1015304
    Halle Winkler
    Politepix

    Awesome! You’re very welcome.

    in reply to: #1015300
    Halle Winkler
    Politepix

    It is possible to turn a DMP back into a text file, but you don’t actually need to do that. The DMP isn’t part of the acoustic model, it is just a plain probability model that says “this or that word is most likely to be combined with the other word when the user speaks, or be by itself, etc”. It doesn’t know anything about what the words sound like, it just helps accuracy by trying to predict tendencies in speech based purely on the input strings that you give it. LanguageModelGenerator should be able to create your DMP just fine so you don’t have to do anything with that. The issue is that the phonetic dictionary lookup method that LanguageModelGenerator uses currently expects that the phonemes will match an acoustic model with English language phonemes, so it’s the .dic that it produces that you have to throw out. If you start Pocketsphinx using a DMP that LanguageModelGenerator creates, but you create your own .dic by hand using selected entries from the acoustic model’s .dic file, I think it should work for you.

    in reply to: #1015297
    Halle Winkler
    Politepix

    Ha, no problem, that’s why everyone edits the logs :) . My biggest objection to folks reporting issues with the simulator is when they post OMG TEH ACCURACY !!!ELEVEN!1! on Stack Overflow and then it turns out that they were testing accuracy on the simulator.

    But LanguageModelGenerator and basic things like finding file paths should be the same between the simulator and the device, because they are deterministic and don’t rely on an audio driver, and if you’ve read up enough to have gotten my repeated suggestions not to test accuracy on the simulator, then it’s fine to test something like this. Keep in mind that one thing you won’t get to see when you use the simulator is what will happen with constrained resources, meaning that if the DMP and dic files with the Russian model contain tens of thousands of words, they will probably work on the simulator but possibly crash, or at least ridiculously underperform, on the device.

    OK, so let’s level-set again, because I’m a little confused about the Russian model/Spanish model situation. It looks to me like you are now just testing the Spanish model, is that correct? Can you first make absolutely sure that you remove all acoustic model files from your app (I would just start a new app from scratch at this point) and then re-add just the model that you are using? It’s really common that when developers are working with multiple acoustic models they get issues due to having files mixed from two different models at the root of their bundle, and these issues are really hard for me to troubleshoot because the files are found by Pocketsphinx, so there is no overt error or warning.

    Next, I think there is going to be a general issue with the fact that LanguageModelGenerator only works with English-language phonemes and the acoustic models you are using do not use the same phoneme set. You can see that right here:

    ERROR: “dict.c”, line 193: Line 1: Phone ‘EY’ is mising in the acoustic model; word ‘A’ ignored
    ERROR: “dict.c”, line 193: Line 2: Phone ‘AH’ is mising in the acoustic model; word ‘ABANDONA’ ignored
    ERROR: “dict.c”, line 193: Line 3: Phone ‘AH’ is mising in the acoustic model; word ‘ABANDONADA’ ignored
    ERROR: “dict.c”, line 193: Line 4: Phone ‘AH’ is mising in the acoustic model; word ‘ABANDONADAS’ ignored
    ERROR: “dict.c”, line 193: Line 5: Phone ‘AH’ is mising in the acoustic model; word ‘ABANDONADO’ ignored
    ERROR: “dict.c”, line 193: Line 6: Phone ‘AH’ is mising in the acoustic model; word ‘ABANDONADOS’ ignored
    ERROR: “dict.c”, line 193: Line 7: Phone ‘AH’ is mising in the acoustic model; word ‘ABANDONAN’ ignored
    ERROR: “dict.c”, line 193: Line 8: Phone ‘AH’ is mising in the acoustic model; word ‘ABANDONAR’ ignored
    ERROR: “dict.c”, line 193: Line 9: Phone ‘AH’ is mising in the acoustic model; word ‘ABANDONARA’ ignored
    ERROR: “dict.c”, line 193: Line 10: Phone ‘AH’ is mising in the acoustic model; word ‘ABANDONARLA’ ignored
    ERROR: “dict.c”, line 193: Line 11: Phone ‘AH’ is mising in the acoustic model; word ‘ABANDONARON’ ignored
    ERROR: “dict.c”, line 193: Line 12: Phone ‘AH’ is mising in the acoustic model; word ‘ABANDONE’ ignored
    ERROR: “dict.c”, line 193: Line 13: Phone ‘EH’ is mising in the acoustic model; word ‘BREGO’ ignored
    ERROR: “dict.c”, line 193: Line 14: Phone ‘EH’ is mising in the acoustic model; word ‘F’ ignored
    ERROR: “dict.c”, line 193: Line 15: Phone ‘IY’ is mising in the acoustic model; word ‘FRICA’ ignored
    ERROR: “dict.c”, line 193: Line 16: Phone ‘AY’ is mising in the acoustic model; word ‘I’ ignored
    ERROR: “dict.c”, line 193: Line 17: Phone ‘EY’ is mising in the acoustic model; word ‘K’ ignored
    ERROR: “dict.c”, line 193: Line 18: Phone ‘EH’ is mising in the acoustic model; word ‘LVAREZ’ ignored
    ERROR: “dict.c”, line 193: Line 19: Phone ‘EH’ is mising in the acoustic model; word ‘LVARO’ ignored
    ERROR: “dict.c”, line 193: Line 20: Phone ‘AA’ is mising in the acoustic model; word ‘R’ ignored

    What is happening is that you have these words transcribed phonetically with English-language phonemes like AY and EH (probably via the fallback method), but those phonemes are not present in your acoustic model so the words have to be excluded by Pocketsphinx. Rather than generating these dictionaries dynamically you will have to create them by hand. The .dic file that comes with the Spanish acoustic model ought to have the correct phonetic transcriptions for its words. It is possible to use the DMP file that LanguageModelGenerator generates, since that is a probability model and doesn’t directly interact with the acoustic model.
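
    To illustrate the format (these transcriptions are hypothetical; the real phoneme symbols depend entirely on the phoneme set of your Spanish acoustic model, so copy them from the .dic file that ships with the model rather than inventing them), a hand-made phonetic dictionary entry is just the word followed by its phonemes, one word per line:

    ABANDONA	a b a n d o n a
    ABANDONAR	a b a n d o n a r
    ABANDONE	a b a n d o n e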

    in reply to: #1015295
    Halle Winkler
    Politepix

    OK, go ahead and post the complete log rather than an excerpt and we’ll see what it says. The error you posted is weird because it seems to think that mainBundle is the folder above the sample app (../OpenEarsSampleApp.app) and I doubt it should be referencing that path using OpenEarsSampleApp as its reference point at all, let alone a directory that is outside of the sandbox. Have you made any changes to any part of the code relating to that path?

    in reply to: #1015293
    Halle Winkler
    Politepix

    Let’s take this one at a time. We can troubleshoot the mdef first since it is going to cause a crash. How did you verify that there is a file called mdef in the bundle? It has to be in the root of the bundle so something to check is whether it is actually inside of a folder in mainBundle.
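
    If it helps, here is a minimal runtime check (a sketch, not part of the OpenEars API) that you can drop into your app to see whether mdef is really at the root of mainBundle; pathForResource:ofType: will return nil if the file was only added inside a subfolder, or was not added to the target at all:

    NSString *mdefPath = [[NSBundle mainBundle] pathForResource:@"mdef" ofType:nil];
    if (mdefPath == nil) {
        NSLog(@"mdef is missing from the root of the app bundle.");
    } else {
        NSLog(@"mdef found at %@", mdefPath);
    }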

    in reply to: #1015291
    Halle Winkler
    Politepix

    OK, my curiosity got the better of me and I checked out the model myself. Do you get the same error if you rename the mixture_weights file to sendump?

    in reply to: #1015290
    Halle Winkler
    Politepix

    Hi Guntis, welcome.

    That error is due to the missing sendump you mentioned. I’m pretty sure that Pocketsphinx requires that sendump file, so I’m a bit confused about why the Russian model doesn’t have one, but there must be a reason, because nsh is the main Pocketsphinx developer so it isn’t a mistake. My suggestions for proceeding are as follows. The first step is to level-set and make sure that this works with Pocketsphinx outside of OpenEars (i.e. to rule out, or discover, an issue with OpenEars that is causing this). Do you have a Linux VM or dedicated box where you can install Pocketsphinx and test the model? Alternatively (depending on how complicated it is for you to test on Linux), ask Nickolay whether the sendump file is required and/or how to use the model with Pocketsphinx without one, which you can do at the CMU Sphinx forums (or he might pop in and answer your SO question, since he also follows the OpenEars tag): http://sourceforge.net/p/cmusphinx/discussion/help/

    Something else to keep in mind is that if the DMP/dic files are large-vocabulary recognition files (i.e. contain a vocabulary that is large enough for general dictation tasks, with tens of thousands of words) they will be too big for offline speech recognition on a handheld device.

    in reply to: Enabling Bluetooth Support #15171
    Halle Winkler
    Politepix

    Thanks for the offer, I’d love to borrow the device for a few days of testing but I think we might be impractically distant — I’ll get in touch though, maybe we can figure something out.

    in reply to: Enabling Bluetooth Support #15169
    Halle Winkler
    Politepix

    Great, thank you for this info. I will see if I can get a hold of one of those devices.

    in reply to: Enabling Bluetooth Support #15166
    Halle Winkler
    Politepix

    Welcome,

    Sorry you’re seeing an issue. The reason that I’ve labeled the bluetooth support experimental is that I can’t test against every device, so it is possible that a particular device has quirks. The good news is that so far, this is the first time a developer has reported an issue with a particular bluetooth device in combination with the sample app since I added bluetooth support a year ago. So please treat it as a bug report: show me the full logging output from the sample app if it manifests the same issue, let me know the device, and if the opportunity to test and fix it comes up I will do so. I’d appreciate getting to see the full OpenEarsLogging output and the verbosePocketSphinx output since that will tell the whole story.

    It does sound like there could be a general configuration issue in your own app if it performs notably differently from the sample app. I would investigate whether your app makes changes to the audio session, either through calls to AVAudioSession or lower-level audio session calls, since that is the most likely way for an app to change OpenEars’ audio handling. Something else that will change the audio session is using certain media objects such as video players or some audio players. The last thing that I think might lead to issues is if you are doing anything that might override OpenEars’ threading behavior.

    You can always access the source code and change it and recompile it for your own app — the framework source is right in the distribution in the OpenEars folder.

    in reply to: Error while integrating Neatspeech #15165
    Halle Winkler
    Politepix

    Welcome Ravi,

    This can happen if you didn’t add the -ObjC other linker flag or if the voices weren’t added to your target when you imported the voices folder, usually due to something going wrong with this step:

    “In order to use NeatSpeech, as well as importing the framework into your OpenEars-enabled project, it is also necessary to import the voices and voice data files by dragging the “Voice” folder in the disk image into your app project. Make sure that in Xcode’s “Add” dialog, “Create groups for any added folders” is selected. Make sure that “Create folder references for any added folders” is not selected or your app will not work.”

    Also make sure that your app target is checked in that “Add file” dialog so the items which are being added are also being added to your target.

    The error just means that the code for the NeatSpeechVoice “Emma” is not available to your project.

    in reply to: Knowing the present string being spoken by TTS #15133
    Halle Winkler
    Politepix

    OK, that makes sense, but there is no callback which identifies the word that is being spoken, because the entire phrase is synthesized at once when you use FliteController. The best workaround would be to send a series of very short statements, highlighting each statement as you send it.

    in reply to: Knowing the present string being spoken by TTS #15131
    Halle Winkler
    Politepix

    Welcome,

    Isn’t it just the string that you have entered into FliteController’s say:withVoice: method? Maybe I’m not understanding your question 100%, can you elaborate on why you can’t use your own string that you entered into say:withVoice:?

    in reply to: Delay pocketsphinxDidDetectSpeech #15085
    Halle Winkler
    Politepix

    Sorry, that is something you’d need to troubleshoot further on your own. It could be that you aren’t instantiating it at the right point in the logical flow of the app, it could be that there is an issue in the logical flow of your app with where you are trying to initiate the vibration effect (similarly to with the original question in this topic) or it could be that soundMixing isn’t a fix for what you are trying to do. It isn’t a supported feature so regretfully there is a time issue for me with getting too deeply into exploring the different potential reasons it might not be working yet.

    in reply to: Delay pocketsphinxDidDetectSpeech #15083
    Halle Winkler
    Politepix

    OK, glad that was helpful. You don’t actually have to modify the framework in order to turn on sound mixing, it is not currently part of the public API and therefore likely to change in future versions but for the time being you can turn on sound mixing simply by including the line you referenced above right before you do startListeningWithLanguageModelAtPath:. It might be necessary for you to import AudioSessionManager.h in the view controller from which you want to do that.

    in reply to: Delay pocketsphinxDidDetectSpeech #15081
    Halle Winkler
    Politepix

    OK, can you explain to me a little more about why you can’t suspend recognition at the time that you start the playback of your own AVAudioPlayer and resume it when you receive the delegate callback that your own AVAudioPlayer has completed playback? It doesn’t yet make sense to me why you’d need to suspend for an arbitrary period of time when you know the moment that you can suspend and the moment that you can resume in order to not have recognition in progress during your sound playback.

    in reply to: Delay pocketsphinxDidDetectSpeech #15078
    Halle Winkler
    Politepix

    Welcome,

    pocketsphinxDidDetectSpeech is a delegate method, so you don’t want to delay it since you don’t call it directly, you just want to address the underlying functionality that you control directly which is whether recognition is engaged or not. Suspending recognition before playing your sound and resuming it afterwards should work perfectly for the goal of halting recognition during other media playback, so if it isn’t working perfectly we should figure out why. What happens when you suspend before you play your sound back and resume after your sound is done playing?
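
    As a sketch of what I mean (assuming you have pocketsphinxController and audioPlayer properties, and that this view controller is the AVAudioPlayer’s delegate):

    // Before starting your own sound playback:
    [self.pocketsphinxController suspendRecognition];
    [self.audioPlayer play];

    // And in the AVAudioPlayerDelegate callback, once playback is done:
    - (void)audioPlayerDidFinishPlaying:(AVAudioPlayer *)player successfully:(BOOL)flag {
        [self.pocketsphinxController resumeRecognition];
    }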

    in reply to: Double commands in hypothesis #14932
    Halle Winkler
    Politepix

    OK, thanks for the logging. I’ve never received a report of this issue before (not that I don’t believe it, just that it’s not a common issue), and the OS and device you’re using are part of the testbed, so I think I’d want to check out the app code in order to learn more about what is happening — would it be possible for you to make a stripped-down sample app that manifests the issue and send it to me so I can see it?

    in reply to: New version Flite issue #14929
    Halle Winkler
    Politepix

    Heh, I was just coming in here to see if I could find the old guide to definitively removing the old version somewhere, when you posted that you sorted it out yourself :) . Nice work and thank you for updating me.

    Halle Winkler
    Politepix

    OK, that shouldn’t cause an issue as long as you are positive that you are linking to the demo framework from the non-licensed apps and not the licensed framework.

    Are you positive that you set the -ObjC “Other Linker Flag” in the target of the app that is having the issue? It looks like the RapidEars demo was somehow just not quite successfully installed by the exact steps in the tutorial. That is usually the step that would cause a method that is in the plugin to not work.

    Halle Winkler
    Politepix

    OK, could you tell me what the versions are that are shown on the front page of the pdf of the OpenEars documentation and the pdf of the RapidEars documentation? The Info.plist isn’t used for frameworks.

    Quick question, didn’t you have a working install of RapidEars previously? Just checking if something has changed in your setup.

    https://www.politepix.com/forums/topic/problem-switching-between-openears-and-rapidears/

    Halle Winkler
    Politepix

    Hi Matt,

    Which version of RapidEars and which version of OpenEars? Do other methods of RapidEars work? You can find version numbers in the included documentation with both downloads.

    in reply to: Optimizing open ears for single word recognition #14908
    Halle Winkler
    Politepix

    This is generally due to speaking too far away from the built-in device mic. Its optimal distance is telephoning distance so if the device is far away you won’t get as good results as with the headset mic.

    What exactly is happening when the recognition is wrong, is it something like the kid said “cat” but it recognized “hat”, where both “cat” and “hat” are words that are in the language model, or more like the kid said something unrelated but it was recognized as either “cat” or “hat”?

    The best advice I can give is to optimize a language model so it doesn’t have a lot of very similar-sounding short words in it, because that is the most challenging circumstance to get right. In that case I might want to use smaller language models, and maybe try Rejecto to see if it handles rejecting out-of-vocabulary speech (that’s only helpful if you are getting recognitions of words which aren’t in the language model at all).

    I will take the request about having an option for putting the language models elsewhere under advisement for the next version of OpenEars, you make a good point.

    in reply to: Error when integrating the NeatSpeech demo #14906
    Halle Winkler
    Politepix

    Fantastic! Glad it’s working for you and enjoy the party.

    in reply to: Error when integrating the NeatSpeech demo #14904
    Halle Winkler
    Politepix

    by the way, the NeatSpeech voices are really good compared to the free ones…

    And thanks for this! Very nice to hear.

    in reply to: Error when integrating the NeatSpeech demo #14903
    Halle Winkler
    Politepix

    This would be the lazy instantiation approach:

    1. Make sure you’ve imported FliteController+NeatSpeech.h in the VC header after the import of FliteController.h,
    2. Create an ivar and property of the voice and of the FliteController in the VC header, synthesize both in the VC implementation, and for each, override their accessor method with the following lazy accessors:

    - (Emma *)emma {
    	if (emma == nil) {
    		emma = [[Emma alloc] initWithPitch:0.0 speed:0.0 transform:0.0];
    	}
    	return emma;
    }

    - (FliteController *)fliteController {
    	if (fliteController == nil) {
    		fliteController = [[FliteController alloc] init];
    	}
    	return fliteController;
    }

    Then, you don’t initialize either ever, or do any checking of whether they are instantiated, and you don’t have to queue, you just reference them like so:

    [self.fliteController sayWithNeatSpeech:@"I have always wished for my computer to be as easy to use as my telephone; my wish has come true because I can no longer figure out how to use my telephone." withVoice:self.emma];

    Also, just for sanity, double-check that you’ve added the -ObjC other linker flag to the target.

    in reply to: Error when integrating the NeatSpeech demo #14902
    Halle Winkler
    Politepix

    Any VC. They can also be instantiated in a model that is controlled in a VC without any multithreading; the only reason I say to put them in a VC is that they should be on mainThread and not in a singleton but instead something which has a particular location in the view hierarchy and is normally memory managed.

    How are you struggling? Are you instantiating the voice and the fliteController in the emma and fliteController lazy instantiation method that is shown in the tutorial and then referencing them with self. as in:

    [self.fliteController sayWithNeatSpeech:@"I have always wished for my computer to be as easy to use as my telephone; my wish has come true because I can no longer figure out how to use my telephone." withVoice:self.emma];

    ?

    I’m here to help, just let me know what the hangup is and I’m sure we can figure it out.

    in reply to: Error when integrating the NeatSpeech demo #14896
    Halle Winkler
    Politepix

    Just to explain a bit more about the internal queueing, you can send text to sayWithNeatSpeech: whenever you want, and if speech is currently in progress the new text will be queued behind the scenes and spoken when previous queued speech is done. Or you can send a single very large piece of text and NeatSpeech will break it down and queue it up on its own. You can also dump the queue. It’s built on the assumption that you will need to queue and manages its whole process of putting synthesis on a secondary thread and keeping the results that are delivered by OpenEarsEventsObserver on mainThread.
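
    In practice that means back-to-back calls are fine (the phrases here are just examples); the second one is queued internally and spoken when the first finishes:

    [self.fliteController sayWithNeatSpeech:@"First phrase." withVoice:self.emma];
    [self.fliteController sayWithNeatSpeech:@"Second phrase, spoken after the first." withVoice:self.emma];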

    in reply to: Error when integrating the NeatSpeech demo #14895
    Halle Winkler
    Politepix

    OK, I see a few issues. The first is that the tutorial gives an example of how to do the memory management for both FliteController and FliteController+NeatSpeech voices, and it’s a good idea to use it since it avoids issues related to memory management. It looks like the initialization occurs inside an instance method of a shared object, which opens up a few ways it could be going wrong. There’s no need to put NeatSpeech inside a singleton or do anything with queueing, since NeatSpeech manages its own queue internally, and it is multithreaded and expects to be instantiated in one view controller, not in a singleton whose thread we don’t know.

    I would just set it up like the tutorial example:

    https://www.politepix.com/openears/tutorial

    in reply to: Error when integrating the NeatSpeech demo #14893
    Halle Winkler
    Politepix

    You can also contact me through the contact form and I’ll give you an address to email your code or project to if you want to use your free support email.

    in reply to: Error when integrating the NeatSpeech demo #14892
    Halle Winkler
    Politepix

    Hi,

    Can you show the code you used? It just sounds a bit like the Emma voice (or whichever voice) is not instantiated at the time you are calling it.

    in reply to: The pocketsphinxDidReceiveHypothesis is never fired #14844
    Halle Winkler
    Politepix

    That’s great! I’m happy I could help.

    in reply to: The pocketsphinxDidReceiveHypothesis is never fired #14842
    Halle Winkler
    Politepix

    Oh, I just noticed this from your question — there is no public method called startVoiceRecognitionThread, so calling a method with this name is probably the issue. If you are doing anything with PocketsphinxController’s threading it will probably cause issues since PocketsphinxController handles its own multithreading. Maybe the best approach is to do a new installation based on the tutorial: https://www.politepix.com/openears/tutorial

    in reply to: The pocketsphinxDidReceiveHypothesis is never fired #14839
    Halle Winkler
    Politepix

    Hmm, this is known working without any issues, so I think it’s just going to turn out to be OpenEarsEventsObserver delegate method connection issue.

    in reply to: Recording OpenEars Audio Input to File #14833
    Halle Winkler
    Politepix

    The underscore is part of the linker’s reporting; it isn’t related to the binary, which definitely works with the current version of OpenEars.

    This is a common issue when installing the plugin — can you make sure that you’ve followed all of the steps in the tutorial at https://www.politepix.com/openears/tutorial including adding the -ObjC linker flag in the right place and making sure that your project isn’t still linked to an old version of OpenEars?

    in reply to: Rejecto – LanguageModel #14824
    Halle Winkler
    Politepix

    Hiya and welcome back,

    I can take this under advisement as a requested feature, but the main thing Rejecto does is create a language model that incorporates the rejection features and has the rejecting elements added to the language model’s probability calculation, so there is no getting around the requirement to recalculate the lm’s probability model even if you start with a completed one. It also has to check and make sure that you aren’t already using one of the rejection phonemes in your real model and if so remove that phoneme from the rejection model, meaning that a premade .dic would also still need processing.

    Not saying there is no way, just that it isn’t trivial and it won’t vastly cut down on processing time.

    If you are seeing unpleasantly slow generation for a dynamic model, maybe you want to look at this tip I wrote up which had a suggestion at the end for avoiding any repeated use of the fallback pronunciation generation technique (i.e. the slow one):

    https://www.politepix.com/2012/12/04/openears-tips-and-tricks-5-customizing-the-master-phonetic-dictionary-or-using-a-new-one/

    in reply to: OpenEars on Mac #14636
    Halle Winkler
    Politepix

    Welcome,

    Unfortunately it can’t because the audio driver is extremely adapted to iOS audio, and I haven’t ported it because I think dealing with all of the possible variations in OS X desktop audio would be a huge support job but for a much smaller userbase, so basically not a good fit for a project such as this one. Sorry I can’t help you with that, pf.

    in reply to: NeatSpeech Problem #14431
    Halle Winkler
    Politepix

    OK, if the files are in there, there shouldn’t be an error that the files can’t be found, so why don’t you send the unhappy project over with any private stuff stripped out and I’ll investigate.

    in reply to: NeatSpeech Problem #14429
    Halle Winkler
    Politepix

    OK, the app bundle should not have folders in it called Voices or VoiceData. If the app bundle has these folders in it, the radio button selection in the “Add” dialog box at the time of importing the voice files is definitely on the wrong setting and the app will not be able to use NeatSpeech voices — you will get the exact error that you received.

    The only thing you should see in the app bundle as a result of adding NeatSpeech are the loose files that can be found within the folder VoiceData (but not the folder itself) at the root level of the app bundle. The folder that says Voices has no purpose inside the app bundle since it just contains frameworks which get compiled directly into the app product binary. You should see groups called Voices and VoiceData in your project file navigator, but never in your app bundle.

    Just remove the added folders from your project and add them again with the correct settings in the dialog box, exactly as they are described in the tutorial. This is the important line from the tutorial:

    Make sure that in Xcode’s “Add” dialog, “Create groups for any added folders” is selected. Make sure that “Create folder references for any added folders” is not selected or your app will not work.

    “Create folder references” is selected in your add dialog currently — Xcode only has a single way of creating subfolders within an app bundle, and that radio button selection is it, so it’s the cause of the issue.

    SLT will work fine because it doesn’t rely on any data files that need to be found in the app bundle.

    If you want, you’re welcome to email me your project (with the classes and resources stripped out) and I can double-check what the issue is. If you want to do that, just send me a note via the contact form and I’ll send you the email address.

    in reply to: NeatSpeech Problem #14426
    Halle Winkler
    Politepix

    What do you see when you look at the inside of your app bundle?

    in reply to: NeatSpeech Problem #14424
    Halle Winkler
    Politepix

    You can definitively verify whether this is the issue by selecting your app product under the Products group in the file navigator, right-clicking on it, and selecting show in finder. When you see the app product in the finder, you can right-click on it and request viewing the package contents. Once you are at the root level of the bundle (where your image files are added, etc), if the files were successfully added they will be visible in the app bundle at root level. If they aren’t in there at all, they weren’t added to the app target. If they are in there but they aren’t in the root level (they are inside of a folder that is at the root level) that means that when they are added, the add dialog box settings are incorrect.

    I suppose one last option is that it is possible that you are dragging the folder containing the voice data files into a group that already represents a folder within your app bundle. See if you get better results from dragging the folder in to the file navigator right into the project file icon rather than a subfolder.

    in reply to: NeatSpeech Problem #14422
    Halle Winkler
    Politepix

    Welcome,

    So far, this kind of error has always been due to the wrong settings on the “Add” import dialog such as the ones referenced in this similar issue with acoustic model resources being added or not added:

    https://www.politepix.com/openears/support/#Q_My_app_crashes_when_listening_starts

    Basically, the voice resources were not successfully added to your more complex project in the expected location in the bundle, but they were successfully added to the brand-new project in the expected location in the bundle. In the more complex project they are probably in a subfolder inside the app bundle where NeatSpeech can’t find them because they are expected to be at root level. This difference is a result of the setting mentioned here:

    Make absolutely sure that in the add dialog “Create groups for any added folders” is selected and NOT “Create folder references for any added folders”

    Halle Winkler
    Politepix

    Yeah, I suppose that somehow or other it has to be a project settings issue if the tutorial method works but it doesn’t work in the existing project. Glad to hear you are getting better results with a new project and I hope it continues to go smoothly.

    Halle Winkler
    Politepix

    Hi Matthew,

    Generally this only happens if the settings in the “Add files” dialog when doing the import are wrong with regard to these instructions:

    Make absolutely sure that in the add dialog “Create groups for any added folders” is selected and NOT “Create folder references for any added folders” because the wrong setting here will prevent your app from working.

    The other possibility is that the version of OpenEars in your existing app is an old version, or the path to the framework in the search path leads to an old version.

    Otherwise, it’s always a good step to clean the project, and to quit and restart Xcode. Let me know if any of this helps.

    in reply to: Playing a pre-recorded sound before synthesized speech. #14189
    Halle Winkler
    Politepix

    Hello,

    The ‘audioPlayerDidFinishPlaying:successfully:’ would only allow audio to be inserted afterwards.

    Only if you start playing the AVAudioPlayer sound after the speech has finished and OpenEarsEventsObserver has returned its callback. If you initiate the speech as a result of the audioPlayerDidFinishPlaying:successfully: method being called for a sound that you play using AVAudioPlayer, the sound will precede the speech.

    in reply to: Playing a pre-recorded sound before synthesized speech. #14144
    Halle Winkler
    Politepix

    Hello,

    You can play a sound whenever you like using standard AVAudioPlayer methods and their delegates. Just initiate speech once the AVAudioPlayer delegate method audioPlayerDidFinishPlaying:successfully: returns.

    in reply to: Double commands in hypothesis #13839
    Halle Winkler
    Politepix

    That’s funny, thanks for letting me know. It’s a big hint about what the underlying issue might be and I’m glad you have a workaround for now. When you get the time to send me the full log output I will see if I can track down the issue.

    in reply to: Double commands in hypothesis #13835
    Halle Winkler
    Politepix

    Also let me know which mic you are using when you’re getting these results and how far you are from the device, thanks.

    in reply to: Double commands in hypothesis #13834
    Halle Winkler
    Politepix

    OK, that sounds a bit buggy. It’s possibly an iPad 3 issue with the cmninit value that we could probably fix right now. Your code looks reasonable to me.

    Can I ask you to turn on verbosePocketsphinx and verboseLanguageModelGenerator and OpenEarsLogging and then print the log here? I’d like to see a log for 5 recognition rounds and it would be great if you would separately tell me what you really said.

    in reply to: Double commands in hypothesis #13832
    Halle Winkler
    Politepix

    Oh, another question — you posted this in the OpenEars plugins section, but from the info in the question I’ve been assuming that the question is actually about OpenEars without a plugin. Is this incorrect and the question is about one of the OpenEars plugins, or is it just about OpenEars itself?

    in reply to: Double commands in hypothesis #13831
    Halle Winkler
    Politepix

    OK, is it happening on the first hypothesis of the session or does it also happen afterwards?

    in reply to: Double commands in hypothesis #13829
    Halle Winkler
    Politepix

    Any chance you’re testing on the Simulator?

    in reply to: ConvertInput error in pocketsphinxDidReceiveHypothesis #13608
    Halle Winkler
    Politepix

    Cool, I’m glad to hear you’ve seen an improvement and I appreciate your updating the thread.

    in reply to: Timing of Open Ears Word Recognition #13547
    Halle Winkler
    Politepix

    Hi Matt,

    OpenEars uses pause-based continuous recognition, so it always has to wait for a half-second (or so) pause before it knows it can perform recognition on the entire utterance. RapidEars is a plugin for OpenEars which does realtime recognition: the speech is analyzed as it enters the microphone, and results are returned with the least latency that the speed of the device CPU allows.

    in reply to: Problem switching between OpenEars and RapidEars #13501
    Halle Winkler
    Politepix

    Hello,

    It shouldn’t be necessary to have two instances, and it is probably harmful since they may both be accessing the driver and the VAD in a way that is unexpected due to ARC.

    Both PocketsphinxController and PocketsphinxController+RapidEars use a PocketsphinxController instance, so you should be able to use a single instance of PocketsphinxController for both, and when you want to listen with RapidEars use RapidEars’ start method of startRealtimeListeningWithLanguageModelAtPath: and when you want to listen without RapidEars use the basic PocketsphinxController startListeningWithLanguageModelAtPath: method. Just be sure that you use the stopListening method for either before you start the other one. I’ve personally used both from the same PocketsphinxController instance in the same session so I would expect it to work.
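
    As a sketch (the full argument lists are elided here since they depend on your OpenEars and RapidEars versions, so check your headers for the exact signatures):

    // Switch from standard listening to RapidEars:
    [self.pocketsphinxController stopListening];
    [self.pocketsphinxController startRealtimeListeningWithLanguageModelAtPath: /* ... */];

    // Switch back to standard pause-based listening:
    [self.pocketsphinxController stopListening];
    [self.pocketsphinxController startListeningWithLanguageModelAtPath: /* ... */];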

    Let me know if this helps.

    in reply to: Need clarification on reducing binary size. #13473
    Halle Winkler
    Politepix

    Now that I’ve had time to double-check, confirming the fact that the “Deployment Postprocessing” build setting hasn’t had a change of name in recent versions of Xcode.

    in reply to: Changing Noise Level for Detecting Speech #13241
    Halle Winkler
    Politepix

    Hi Matt,

    There is no built-in way to do this, but you can investigate this approach with using a different audio unit/audio session type in this thread:

    https://www.politepix.com/forums/topic/add-mode-options-next-version/

    in reply to: Need clarification on reducing binary size. #13236
    Halle Winkler
    Politepix

    Hello,

    1) I believe it should still be called Deployment Postprocessing. I would search for that phrase in the search field.

    2) Correct, that is what it says in the FAQ as well.

    in reply to: Problems using AudioServicesPlaySystemSound with openEars #13128
    Halle Winkler
    Politepix

    This is due to the audio session settings used by the framework. If you want to sidestep the entire issue, just play the sound with AVAudioPlayer. Otherwise you can take a look at the approaches from me and others in these threads:

    https://www.politepix.com/forums/topic/keep-system-sounds-while-listening/
    https://www.politepix.com/forums/topic/conflict-with-audiotoolbox/
    https://www.politepix.com/forums/topic/pocketsphinx-disables-vibrate/
    https://www.politepix.com/forums/topic/simultaneous-mpmovieplayercontroller-video-and-speech-recognition/

Viewing 100 posts - 1,801 through 1,900 (of 2,166 total)