Tagged: flite, Flitecontroller, multithreading, NSThread, pitch, speech, speed, thread, variance
This topic has 4 voices, contains 17 replies, and was last updated by oakdemirci 286 days ago.
| Author | Posts |
|---|---|
| July 24, 2011 at 7:40 am #7371 | |
|
sarinsukumar |
Hi halle, |
| July 24, 2011 at 9:08 pm #7375 | |
|
Halle |
Identical to the stock code but the synthesis occurs on a background thread so it doesn’t block mainThread. |
| July 26, 2011 at 10:27 am #7377 | |
|
marco |
Hi halle, |
| July 26, 2011 at 10:33 am #7378 | |
|
Halle |
Hi Marco, two seconds to speak a single number seems awfully long. Can you tell me which voice, OS and device that is with? If you turn on OPENEARSLOGGING it will time the voice synthesis for you, so you can tell me the exact number. |
| July 26, 2011 at 10:44 am #7380 | |
|
Halle |
Another question: are you creating a new FliteController for every shout, or are you using the recommended memory management from http://www.politepix.com/openears/yourapp in the part which begins “The last convention is that when the instructions say to instantiate an object”? Because if you are initializing the entire controller each time that will really add some setup time. |
| July 26, 2011 at 10:58 am #7382 | |
|
marco |
Hi Halle, I didn’t expect you to reply so fast! Awesome. Anyway, the voice I’m using is “cmu_us_slt” as it is the “clearest” and “sexiest” voice IMO, the OS is iOS 4.3.3, and the device is an iPhone 4. I enabled OPENEARSLOGGING and it takes an average of 0.745 seconds to run Flite. I’m using the same FliteController for every shout. I did, however, add an if statement in the delegate method “fliteDidFinishSpeaking” that checks whether the number is still greater than zero; if it is, it decrements the number and shouts it, repeating until zero. |
| July 26, 2011 at 11:05 am #7383 | |
|
Halle |
OK, if I’m not mistaken, that is the most complex voice on offer, so you are seeing slightly better than realtime performance which may be the best we can hope for. The 8k voices will perform better but they won’t sound as good. If you are always using the voices to count down, you could actually not use voice synthesis but pre-recorded audio clips since you know the entire set of required statements in advance. |
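Because a countdown only ever needs a fixed set of utterances, picking the right pre-recorded clip can be a simple lookup. The C sketch below is one way that might look. The `count_NN.caf` naming scheme, the 0 to 99 range and the function name are assumptions made for illustration and are not part of OpenEars; the returned name would be handed to whatever audio player the app already uses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical upper bound: clips are assumed to exist for 0..COUNTDOWN_MAX. */
#define COUNTDOWN_MAX 99

/*
 * Writes the file name of the pre-recorded clip for `number` into `out`.
 * Returns 0 on success, or -1 if there is no clip for that number or the
 * output buffer is too small. The "count_NN.caf" scheme is an assumption.
 */
int countdown_clip_name(int number, char *out, size_t out_len)
{
    assert(out != NULL);
    if (number < 0 || number > COUNTDOWN_MAX) {
        return -1;
    }
    int written = snprintf(out, out_len, "count_%02d.caf", number);
    if (written < 0 || (size_t)written >= out_len) {
        return -1; /* name was truncated or formatting failed */
    }
    return 0;
}
```

With this scheme, `countdown_clip_name(7, name, sizeof name)` fills `name` with `count_07.caf`, and a number outside the recorded range returns -1 so the caller can fall back to synthesis.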
| July 26, 2011 at 11:14 am #7384 | |
|
marco |
Wow. I never thought of using pre-recorded audio clips. Thank you for the tip! The reason I’m using this is that part of my app also uses the voice recognition, so I figured I would just use FliteController to do TTS because it’s already part of the library. Thanks again for the tip! |
| July 26, 2011 at 11:21 am #7385 | |
|
Halle |
No problem, glad to hear that’s an option for you. |
| July 26, 2011 at 11:37 am #7386 | |
|
sarinsukumar |
Hi halle, |
| July 26, 2011 at 2:39 pm #7388 | |
|
Halle |
You can just add another audio unit driver that uses the same buffer size as the existing audio unit driver, that queues up the streamed data and plays it back through the callback. You’ll need to do some testing to find the different amounts that need to be buffered before playback on the different devices and with the different voices so that they don’t skip. |
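The queue-then-play idea described above can be sketched in plain C as a ring buffer with a pre-buffer threshold: synthesized samples are written as they arrive, and the playback callback outputs silence until enough audio has been queued. This is only an illustration under stated assumptions. The type and function names, the capacity and the threshold are hypothetical, nothing here is OpenEars or Flite API, and the sketch is not thread-safe, so a real driver would need to synchronize access between the synthesis thread and the render callback.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define STREAM_CAPACITY 16384 /* samples; hypothetical, tune per device */

typedef struct {
    int16_t samples[STREAM_CAPACITY];
    size_t  read_pos;  /* next sample to play */
    size_t  write_pos; /* next free slot */
    size_t  count;     /* samples currently queued */
    size_t  prebuffer; /* samples required before playback starts */
    bool    playing;   /* set once the prebuffer threshold is reached */
} StreamBuffer;

void stream_init(StreamBuffer *sb, size_t prebuffer)
{
    assert(sb != NULL && prebuffer <= STREAM_CAPACITY);
    sb->read_pos = sb->write_pos = sb->count = 0;
    sb->prebuffer = prebuffer;
    sb->playing = false;
}

/* Called as the synthesizer produces audio. Returns the samples accepted. */
size_t stream_write(StreamBuffer *sb, const int16_t *in, size_t n)
{
    size_t accepted = 0;
    while (accepted < n && sb->count < STREAM_CAPACITY) {
        sb->samples[sb->write_pos] = in[accepted++];
        sb->write_pos = (sb->write_pos + 1) % STREAM_CAPACITY;
        sb->count++;
    }
    if (!sb->playing && sb->count >= sb->prebuffer) {
        sb->playing = true; /* enough queued: playback may begin */
    }
    return accepted;
}

/*
 * Called from the playback callback. Fills `out` with queued samples and
 * pads with silence before the threshold is reached or on underrun.
 * Returns the number of real (non-silence) samples delivered.
 */
size_t stream_read(StreamBuffer *sb, int16_t *out, size_t n)
{
    size_t delivered = 0;
    for (size_t i = 0; i < n; i++) {
        if (sb->playing && sb->count > 0) {
            out[i] = sb->samples[sb->read_pos];
            sb->read_pos = (sb->read_pos + 1) % STREAM_CAPACITY;
            sb->count--;
            delivered++;
        } else {
            out[i] = 0;
        }
    }
    return delivered;
}
```

The `prebuffer` argument is the value that would need per-device and per-voice testing: too small and playback skips, too large and speech starts later than necessary. The sketch also leaves out end-of-stream handling, where a short final utterance might never reach the threshold.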
| July 26, 2011 at 2:58 pm #7391 | |
|
sarinsukumar |
hi halle, |
| July 26, 2011 at 2:59 pm #7392 | |
|
Halle |
Nope, I mean audio unit but you should code your driver in whatever technology you are already proficient in. There are many different approaches for playback from a buffer (FliteController already uses one, it just isn’t a buffer whose contents change over time so there is no need for a driver that needs to interface with the Flite streaming functionality). |
| August 6, 2011 at 2:19 pm #7466 | |
|
oakdemirci |
Hi Halle, thanks a lot for your effort and for OpenEars. I’m trying to develop a TTS application with OpenEars. Like Sarin and Marco, I have a performance problem with the cmu_us_slt voice: it takes “some” time to synthesize the voice. Thanks in advance. 2011-08-06 16:01:13.546 MyApp[107:707] OPENEARSLOGGING: Flite interrupting existing talking if necessary.
|
| August 6, 2011 at 2:48 pm #7468 | |
|
sarinsukumar |
Hi, |
| August 6, 2011 at 2:51 pm #7469 | |
|
Halle |
Hi oakdemirci, I can only tell you what I told marco and sarinsukumar — you are using the most complex voice and if it performs better than realtime you’re getting good results. Expected behavior for those 16k voices is that they will take more time to synthesize speech than the actual duration of the speech waveform, so they are not recommended for slow devices or very long statements. There isn’t actually anything to document with the multithreading, it’s a drop-in replacement for the files and the replacement files are well commented. |
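"Better than realtime" can be made concrete as a real-time factor: the time spent synthesizing divided by the duration of the audio produced. The C sketch below shows the arithmetic. The function name is hypothetical, and the sample count and sample rate would have to come from the synthesized waveform; only the synthesis time is something the thread says OPENEARSLOGGING reports.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Real-time factor = synthesis time / duration of the synthesized audio.
 * Below 1.0 the synthesizer is faster than realtime; above 1.0 it takes
 * longer to synthesize the speech than the speech takes to play.
 */
double real_time_factor(double synthesis_seconds,
                        size_t sample_count,
                        unsigned sample_rate_hz)
{
    assert(sample_count > 0 && sample_rate_hz > 0);
    double audio_seconds = (double)sample_count / (double)sample_rate_hz;
    return synthesis_seconds / audio_seconds;
}
```

As a made-up example, 0.745 seconds of synthesis for one second of 16 kHz audio (16000 samples) gives a factor of about 0.745, while two seconds of synthesis for half a second of audio gives 4.0, which is the kind of case where pre-recorded clips or a simpler voice become attractive.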
| August 6, 2011 at 2:53 pm #7470 | |
|
Halle |
True, but it will prevent the synthesis from blocking the app, which is an issue for a synthesis time that is longer than a fraction of a second. |
| August 6, 2011 at 3:10 pm #7471 | |
|
oakdemirci |
OK, by multithreading I thought I would have to implement different Flite instances and manage them accordingly, not just replace files. I have already replaced the files but it made no difference, as you said. Anyway, thank you both. =) |