Forum Replies Created
Halle Winkler (Politepix)
This is because there are multiple things about this which are problems for ideal recognition with these tools: it has high uncertainty because it is a different language, and language models aren’t designed to work with a single word. I expect changing the weight to affect this outcome, but if it doesn’t, that is the answer on whether this approach will work.
Halle Winkler (Politepix)
Have we ever seen a fully-working result from your original grammar implementation without a plugin since we fixed the grammar?
Halle Winkler (Politepix)
I’ve recommended everything that can be tuned for Rejecto; there is nothing else. If it isn’t doing well yet, that is most likely just because it is a different language. You can also tune vadThreshold, but I recommended doing that at the start, so I am assuming it is correct now.
Halle Winkler (Politepix)
Yeah, that makes a certain amount of sense, because this use case is very borderline for RuleORama – it isn’t usually great with a rule that has a single entry, and the other elements here that push the limits of what is likely to work are probably making it worse. We can shelve the investigation of RuleORama now that we have seen a result from it.
Halle Winkler (Politepix)
The first thing to fix in this RuleORama implementation is, again, that this:
OEPocketsphinxController.sharedInstance().startListeningWithLanguageModel(atPath: lmPath, dictionaryAtPath: dictPath, acousticModelAtPath: OEAcousticModel.path(toModel: accusticModelName), languageModelIsJSGF: true)
needs to be this:
OEPocketsphinxController.sharedInstance().startListeningWithLanguageModel(atPath: lmPath, dictionaryAtPath: dictPath, acousticModelAtPath: OEAcousticModel.path(toModel: accusticModelName), languageModelIsJSGF: false)
There may be other issues but let’s start there.
Halle Winkler (Politepix)
Regarding your Rejecto results: you can now pick whichever one of them is better and experiment with raising or lowering the withWeight value in this line (the lowest possible value is 0.1 and the largest possible value is 1.9):
let err: Error! = lmGenerator.generateRejectingLanguageModel(from: words, withFilesNamed: fileName, withOptionalExclusions: nil, usingVowelsOnly: false, withWeight: 1.0, forAcousticModelAtPath: OEAcousticModel.path(toModel: accusticModelName))
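For example, this is what the line might look like with a lower weight (a minimal sketch; the 0.5 value is only an illustration of the experiment, not a recommended setting):
let err: Error! = lmGenerator.generateRejectingLanguageModel(from: words, withFilesNamed: fileName, withOptionalExclusions: nil, usingVowelsOnly: false, withWeight: 0.5, forAcousticModelAtPath: OEAcousticModel.path(toModel: accusticModelName))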
What does the symbol “@” represent in the LookupList.text? (The double-ee’s and double-ii’s I can somehow interpret, but what does “@” really mean?)
It represents the sound in Hochdeutsch which is represented by the IPA symbol ə. This is getting outside of the things I support here, but there should be enough info in that explanation for you to find sources outside of these forums to continue your investigation if you have further questions.
Halle Winkler (Politepix)
This:
lmPath = lmGenerator.pathToSuccessfullyGeneratedGrammar(withRequestedName: fileName)
needs to be:
lmPath = lmGenerator.pathToSuccessfullyGeneratedLanguageModel(withRequestedName: fileName)
Halle Winkler (Politepix)
If you want to show me more logs from this implementation, make sure to show me the now-changed code again as well.
Halle Winkler (Politepix)
Hi,
That is happening because this code is a mixed example of an earlier grammar implementation and a later Rejecto implementation. Please change this:
OEPocketsphinxController.sharedInstance().startListeningWithLanguageModel(atPath: lmPath, dictionaryAtPath: dictPath, acousticModelAtPath: OEAcousticModel.path(toModel: accusticModelName), languageModelIsJSGF: true)
to this:
OEPocketsphinxController.sharedInstance().startListeningWithLanguageModel(atPath: lmPath, dictionaryAtPath: dictPath, acousticModelAtPath: OEAcousticModel.path(toModel: accusticModelName), languageModelIsJSGF: false)
and also please change this vowels option, which looks like it was left over from a previous round of experimentation and will harm accuracy:
let err: Error! = lmGenerator.generateRejectingLanguageModel(from: words, withFilesNamed: fileName, withOptionalExclusions: nil, usingVowelsOnly: true, withWeight: 1.0, forAcousticModelAtPath: OEAcousticModel.path(toModel: accusticModelName))
to this:
let err: Error! = lmGenerator.generateRejectingLanguageModel(from: words, withFilesNamed: fileName, withOptionalExclusions: nil, usingVowelsOnly: false, withWeight: 1.0, forAcousticModelAtPath: OEAcousticModel.path(toModel: accusticModelName))
Halle Winkler (Politepix)
If you want to collapse the logs so that they aren’t as visually big, you can put spoiler tags around them:
[spoiler]
[/spoiler]
This will make them possible to open and close so they don’t take up vertical space if it bothers you.
Halle Winkler (Politepix)
Paste the logs and VC contents in this forum, thank you. There are many other discussions here with big logs, and they give searchers a way to get hits for specific errors and problems they are troubleshooting without my having to answer the same questions many times, as well as letting me go back and find either bugs or points of confusion. When all of that is hidden away in a repo, it will eventually disappear as the repo changes or is removed, or cause the support request to occur in that repo, and it won’t help anyone solve their problem in a place where they can follow up with “I got the same log result but your fix isn’t affecting my case”. It’s a very important part of keeping solutions visible.
Halle Winkler (Politepix)
Please put all your documentation of what is going on in this forum, thank you. The Github repo will change or disappear (it has already disappeared and then returned with different content in the course of just this discussion, so there is a previous link to it which is already out of date), and as a consequence this discussion will be of no use to anyone who has an issue similar to any of the many questions you are asking about.
Halle Winkler (Politepix)
They are being marked as spam due to the multiple external links. Please keep all the discussion in here so it is a useful resource to other people with the same issue. I recommend doing this without all the confusion and complexity by returning to the premise of troubleshooting exactly one case at a time. You can choose which to begin with.
Halle Winkler (Politepix)
OK, let’s see what happens when you make the following changes to the three projects.
For your regular grammar project and for your RuleORama project, please adjust this code:
let words = ["esch do no frey"]
// let err: Error! = lmGenerator.generateLanguageModel(from: words, withFilesNamed: name, forAcousticModelAtPath: OEAcousticModel.path(toModel: accusticModelName))
// let err: Error! = lmGenerator.generateGrammar(from: [OneOfTheseWillBeSaidOnce : words], withFilesNamed: fileName, forAcousticModelAtPath: OEAcousticModel.path(toModel: accusticModelName))
let err: Error! = lmGenerator.generateFastGrammar(from: [OneOfTheseWillBeSaidOnce : words], withFilesNamed: fileName, forAcousticModelAtPath: OEAcousticModel.path(toModel: accusticModelName))
so it matches the grammar instructions with the enclosing ThisWillBeSaidOnce declaration like so:
let words = ["esch do no frey"]
let grammar = [ ThisWillBeSaidOnce : [ [ OneOfTheseWillBeSaidOnce : words] ] ]
// let err: Error! = lmGenerator.generateGrammar(from: grammar, withFilesNamed: fileName, forAcousticModelAtPath: OEAcousticModel.path(toModel: accusticModelName))
let err: Error! = lmGenerator.generateFastGrammar(from: grammar, withFilesNamed: fileName, forAcousticModelAtPath: OEAcousticModel.path(toModel: accusticModelName))
Uncomment whichever of the generateGrammar/generateFastGrammar lines is to be used by the respective grammar project.
For your Rejecto project, please open AcousticModelGerman.bundle/LanguageModelGeneratorLookupList.text at whatever location you are really linking to it (please be ABSOLUTELY sure you are performing this change on the real acoustic model that your project links to and moves into your app bundle, wherever that is located, or our troubleshooting work on this project will be guaranteed to be unsuccessful) and look for the following two lines:
es ee s
esf ee s f
and change them to this:
es ee s
eschdonofrey @ ss d oo n oo f r @ ii
esf ee s f
and then you have to change your Rejecto language model generation code (which you have never shown me here) so that it just creates a model for the single word “eschdonofrey”. Do not make this change to your grammar projects. For contrast, you can also try changing the acoustic model entry to this instead with your Rejecto project, with slightly different phonemes:
es ee s
eschdonofrey ee ss d oo n oo f r ee ii
esf ee s f
If none of these have better results, this will be the point at which we have to stop the troubleshooting process, because we are guaranteed to get confused results if we keep troubleshooting three different implementations in parallel, each of which has hosted other implementations at different times. If one of these projects has improved results, we can do a little bit more investigation of it, on the condition that the other two projects are put away and I can rely on the fact that we are only investigating one clean project at a time moving forward. Let me know how it goes!
Halle Winkler (Politepix)
No problem, just keep in mind that I asked for that project to have a clean slate to work from without mixing up code from multiple approaches, so we want to get back to that state of clarity and simplicity.
Halle Winkler (Politepix)
I’m talking about the project which uses this VC: https://www.politepix.com/forums/topic/recognize-short-command-in-nonenglish/#post-1032343
Except moving the logging calls high enough up so that we can see any errors that happen while you’re generating the grammar.
Halle Winkler (Politepix)
Cool, thank you. Do you have a log for your earlier project that is just OpenEars using a grammar (without Rejecto), with the logging calls moved to the top? I thought that was the main project we started debugging above, and then we were going to quickly try out RuleORama if you wanted. Those two grammar-using projects are the ones I’m curious about right now, because it looks like there is a flaw in the grammar and I want to know if the same error is happening in both.
Halle Winkler (Politepix)
Hi,
A few things to clarify:
• It is of course completely fine if you don’t want to use RuleORama, which is the reason I asked first if it was OK for you. This is not an upsell – my question was because there is no easier time to try it out than right after you have set up a grammar, and if you wanted to hear all the options, this was the most convenient moment to explore that one; any other timing will be less convenient because we will be changing from a grammar to a language model afterwards. My intention was to explain to you how to add it to your existing project if you agreed to try it out. It is fine with me either to skip it or to take time to get it working.
• This is too unconstructive for me while I’m working hard to give you fast and helpful support for an unsupported language, and I’d appreciate it if you’d consider that we both have stresses in this process: “I relaize the RuleORama-demo is again not useful after download – and I feel that I loose trememdeous amount of time just to set up these demo-projects. Also, your manual contains ObjC-Code under the Swift3 chapter – which is not something pleasant either.” I disagree with you about the origin of the issues in this case, but more importantly, I just don’t want to proceed with this tone, which also seemed to come up due to my trying hard to remove all the unknown variables from our troubleshooting process, and I’m likely to choose to close the question if it is an ongoing thing even though we’ve both invested time in it. You don’t have to agree, but I don’t want you to be surprised if I close this discussion for this reason.
• I want to warn you here that it is possible there is no great solution because this is an unsupported language, so that you have enough info to know whether to invest more time. I am happy to help, and I have some theories about how we might be able to make this work, but not every question has a perfect answer.
That all said, I just noticed from your RuleORama install that there is something we need to fix in both installs, which is that in both cases the logging is being called too late to log the results of generating the language model or grammar. Can you move these:
OELogging.startOpenEarsLogging() // Uncomment to receive full OpenEars logging in case of any unexpected results.
OEPocketsphinxController.sharedInstance().verbosePocketSphinx = true
To run right after super.viewDidLoad() and share the logging output from both projects?
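In context, viewDidLoad would look roughly like this (a minimal sketch; the helper method name at the end is hypothetical and stands in for whatever your project actually calls to generate its model or grammar and start listening):
override func viewDidLoad() {
    super.viewDidLoad()
    OELogging.startOpenEarsLogging() // Start full OpenEars logging first, so model/grammar generation errors are captured.
    OEPocketsphinxController.sharedInstance().verbosePocketSphinx = true
    setUpLanguageModelAndStartListening() // Hypothetical helper standing in for your generation and startListening code.
}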
Halle Winkler (Politepix)
Super, we can test a couple of things now that we’ve level-set. First question: would it be possible for you to use RuleORama, or is Rejecto the only plugin you would like to test with? I’m asking because the easiest first step is to see if your results are better with RuleORama because you already have a working grammar, but if you don’t want to use it, we can skip it and try something with Rejecto.
Halle Winkler (Politepix)
I’d also like to take this opportunity to remind you that even when we start discussing Rejecto- or RuleORama-related issues, the license of the plugins does not allow them to be redistributed anywhere, so make sure not to post the demos or licensed versions of those products on Github or anywhere else which enables sharing.
Halle Winkler (Politepix)
OK, can you now run this project and show me all of the logging output (with no edits at all) requested in this post: https://www.politepix.com/forums/topic/install-issues-and-their-solutions/ and then show me the code excerpt I asked for above which includes the generation of your language model? Thanks!
Halle Winkler (Politepix)
Hi, sorry, I really don’t run projects locally for this kind of troubleshooting – I can only ask for your logging output and code excerpts as I have time to assist. I asked you to make a new project so we could be sure we were not mixing your old approaches, because your old code was entering into the troubleshooting process, but I apologize if that created confusion about whether I would be running the test project locally and debugging it myself.
Halle Winkler (Politepix)
Hello,
Sorry, there is no email support for these kinds of issues, please keep all discussion here and using the debug tools that are possible to work with via these forums, thank you!
Halle Winkler (Politepix)
Welcome,
This should be possible to address by changing the vadThreshold. There is no ‘push to talk’ mode with OpenEars where listening is started and stopped by user input, sorry.
Halle Winkler (Politepix)
OK, I’m a bit worried that there is cross-contamination with your multiple troubleshooting approaches in the code you are working with and showing me (I’ve touched on this concern a couple of times in this discussion so far because it is a common situation after the developer has tried many things), so I think I’d like to try to rule that out before we continue.
Could you put away your existing work and zip it into an archive somewhere for the time being, and then make an entirely new project with the tutorial only using the approach we’ve chosen here (a grammar using stock OpenEars and the German acoustic model), and do it with four words from the start where you would be comfortable sharing all the vocabulary and listening initialization code with me and 100% of the logging output? Then we can continue without worrying about replacing words or old code hanging around. Thanks!
Halle Winkler (Politepix)
OK, it needs to say
words = ["esch da no fey"]
Halle Winkler (Politepix)
I would also need to see the content of the array words, even if each word has been substituted with another word.
Halle Winkler (Politepix)
OK, can you show me the shortest possible code excerpt (i.e. just the part where you create the grammar and start listening for it, making absolutely sure that none of your other troubleshooting attempts are still also in effect), replacing the four words with a different four words if you need to for confidentiality?
Halle Winkler (Politepix)
And also, what is the vadThreshold value for German?
This always has to be tested out on your end of things for your usage case (doing this will also help with your background noise issue, with luck). I think there is a note about it at the bottom of the acoustic model page with more info.
Halle Winkler (Politepix)
An array with a single element, which is a string containing four words with spaces between them.
Halle Winkler (Politepix)
Got it, thank you for explaining. OK, let’s first try and see whether the simplest thing works, and then we can explore other ways if it doesn’t. Let’s start with using a grammar with the German acoustic model and see how far we get. What are your results when you do that? Your grammar should have the whole sentence as a single string as a single array item in your dictionary, not an array of individual words.
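As a minimal sketch of that structure (the phrase here is a placeholder, not your real sentence, and fileName/lmGenerator are assumed to be the names already used in your project):
let words = ["one single sentence as one string"] // the whole phrase as one array item, not individual words
let grammar = [ ThisWillBeSaidOnce : [ [ OneOfTheseWillBeSaidOnce : words ] ] ]
let err: Error! = lmGenerator.generateGrammar(from: grammar, withFilesNamed: fileName, forAcousticModelAtPath: OEAcousticModel.path(toModel: "AcousticModelGerman"))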
Halle Winkler (Politepix)
I see, the aspect where it sounds like a single word is a standard characteristic of spoken Schweizerdeutsch for a sentence this short and simple, is that correct?
Halle Winkler (Politepix)
Sorry, it’s difficult for me to imagine this case without an example. Can you share with me something similar enough so I can understand how a four-word phrase can sound like a single word when spoken?
Halle Winkler (Politepix)
OK, well, let’s just pick a single acoustic model for this troubleshooting case so that we don’t have a lot of variables – you can tell me which one we’re using. I recommend the German one.
–> generateGrammar says: “This will recognize exact phrases instead of probabilistically recognizing word combinations in any sequence.”
–> Rejecto’s doc says: “Rejecto makes sure that your speech app does not attempt to recognize words which are not part of your vocabulary. This lets your app stick to listening for just the words it knows, and that makes your users happy.”
Yes, the intention of this documentation is to clarify that a grammar can listen for a multi-word phrase in exclusive terms (i.e. it won’t attempt to evaluate statistical nearness to your phrase but just try to hear it when complete, not hear it when not complete) and Rejecto will reject unknown words from a model made up of words. So if the goal is a sentence, a grammar is probably the right choice. If you were looking for one of several words by themselves, or phrases where you didn’t mind possible partial recognition of the phrase contents, Rejecto would be better.
Halle Winkler (Politepix)
Welcome,
Let’s just troubleshoot one case if that’s OK – two languages and two different vocabulary structures might get a little confusing. Would you rather troubleshoot your grammar, or the Rejecto model? BTW, maybe this goal would be a better match for the German acoustic model.
February 2, 2018 at 6:39 pm in reply to: rapidEarsDidReceiveLiveSpeechHypothesis not firing #1032231
Halle Winkler (Politepix)
Glad to hear it. That fix should be part of the standard distributions of the plugins shortly; I just have to take some time to verify that it doesn’t affect pure Objective-C implementations negatively.
February 2, 2018 at 5:03 pm in reply to: rapidEarsDidReceiveLiveSpeechHypothesis not firing #1032229
Halle Winkler (Politepix)
Hi,
Can you test the following potential fix for this:
1. In the RapidEars framework’s header file OEEventsObserver+RapidEars.h, can you look for the line:
@interface OEEventsObserver (RapidEars) <OEEventsObserverDelegate>
And right after it, paste in the following lines:
@end
@protocol OEEventsObserverRapidEarsDelegate <OEEventsObserverDelegate>
And then in your Swift class, where you have implemented this line importing the OEEventsObserverDelegate protocol (this may not be a view controller in your app; the class name and inheritance is unimportant but the importing of the delegate protocol OEEventsObserverDelegate is important):
class ViewController: UIViewController, OEEventsObserverDelegate {
Change the imported delegate protocols to this:
class ViewController: UIViewController, OEEventsObserverDelegate, OEEventsObserverRapidEarsDelegate {
and let me know if you are now receiving the results of the RapidEars delegate callbacks?
January 30, 2018 at 5:13 pm in reply to: rapidEarsDidReceiveLiveSpeechHypothesis not firing #1032227
Halle Winkler (Politepix)
Which Xcode and iOS version are you seeing this with, BTW?
January 30, 2018 at 5:12 pm in reply to: rapidEarsDidReceiveLiveSpeechHypothesis not firing #1032226
Halle Winkler (Politepix)
Thanks. I’ve checked and the function signature is definitely correct for Swift 4 as well as 3, so I’ll need to do some more investigation to see if there is anything new going on with this kind of extension in recent Xcodes. Thanks for your patience.
January 30, 2018 at 4:41 pm in reply to: rapidEarsDidReceiveLiveSpeechHypothesis not firing #1032224
Halle Winkler (Politepix)
Do you get this result on a real device?
January 30, 2018 at 3:20 pm in reply to: rapidEarsDidReceiveLiveSpeechHypothesis not firing #1032222
Halle Winkler (Politepix)
OK, I will take a look, thanks.
January 30, 2018 at 3:07 pm in reply to: rapidEarsDidReceiveLiveSpeechHypothesis not firing #1032220
Halle Winkler (Politepix)
Thanks – was this done by adding it to the sample app or from the tutorial? Can you also show me the OpenEarsLogging output? I think this is just the verbosePocketsphinx output.
January 30, 2018 at 9:57 am in reply to: rapidEarsDidReceiveLiveSpeechHypothesis not firing #1032218
Halle Winkler (Politepix)
Welcome,
Sure, please check out the post Please read before you post – how to troubleshoot and provide logging info here so you can see how to turn on and share the logging that provides troubleshooting information for this kind of issue.
January 29, 2018 at 12:09 pm in reply to: Way to see OpenEars internal understanding of a sound #1032216
Halle Winkler (Politepix)
Welcome,
No, sorry, I don’t see a way to do this.
Halle Winkler (Politepix)
This has been updated with today’s version 2.507, thanks for the suggestion.
Halle Winkler (Politepix)
The version of this bundle on the site now has its license info as part of the bundle, thanks for bringing this to my attention.
January 5, 2018 at 11:16 am in reply to: Can utterances only bring back what is in dictionary? #1032191
Halle Winkler (Politepix)
Thanks for the logging. This is a bit unusual in my experience so I’m trying to pin down whether there are any contributing factors, pardon my questions. How close is your implementation to the sample app which ships with the distribution? Do you get the same results when just altering the sample app to support this grammar? Is there anything about the environment (or I guess even the speaker) which could contribute to the results here?
January 3, 2018 at 4:11 pm in reply to: Can utterances only bring back what is in dictionary? #1032187
Halle Winkler (Politepix)
Welcome,
Like for the post before yours, this would need logging output to be able to help with – please take a look at the link I provided in my previous response, thanks.
Halle Winkler (Politepix)
It looks like the problem is that you’re generating the dynamic model with the Chinese acoustic model but you’re starting speech recognition with the English one, so just review whether you have replaced the English model with the Chinese model in all your code, or if you have overlooked one place.
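In other words, the same acoustic model path should appear in both places. A minimal sketch with assumed variable names (words, fileName, lmPath, dictPath are placeholders for whatever your project actually uses):
let acousticModelPath = OEAcousticModel.path(toModel: "AcousticModelChinese") // reuse this one path everywhere
let err: Error! = lmGenerator.generateLanguageModel(from: words, withFilesNamed: fileName, forAcousticModelAtPath: acousticModelPath)
OEPocketsphinxController.sharedInstance().startListeningWithLanguageModel(atPath: lmPath, dictionaryAtPath: dictPath, acousticModelAtPath: acousticModelPath, languageModelIsJSGF: false)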
Halle Winkler (Politepix)
Welcome,
Please check out the post Please read before you post – how to troubleshoot and provide logging info here so you can see how to turn on and share the logging that provides troubleshooting information for this kind of issue.
Halle Winkler (Politepix)
OK, glad to help!
Halle Winkler (Politepix)
Hi,
I mean it should perform speech recognition by using the phone’s mic when headphones aren’t plugged in, so if that isn’t working in some way, let me know a little more about what you are encountering.
Halle Winkler (Politepix)
Welcome,
That is the default behavior, so if you’re seeing an issue with it please let me know.
Halle Winkler (Politepix)
Nope, that was a pure accident – I didn’t see their project at the time I named this one, and once I was aware of it, IIRC it was commonly going by OpenAIR or OpenART (you can see that in the URL you posted). It is a project for emotion recognition over multiple inputs rather than speech recognition via voice. I haven’t really had the direct experience of parties being confused between the two projects in practical terms – about once a year an emotion recognition question gets posted here and that’s it.
Halle Winkler (Politepix)
Welcome,
Sorry, it isn’t possible to discuss future plans, but I always recommend making purchase decisions only on the basis of whether the current features match your needs. Thanks for considering Politepix!
Halle Winkler (Politepix)
Hi Coeur,
This is hub4wsj_sc_8k, which used to ship with CMU Sphinx under their license covering all of sphinxbase; I had thought that it contained a copy of this license within it, so I appreciate the heads-up. I will add a ticket for clarifying/fixing that.
Halle Winkler (Politepix)
Super, that seems like an ideal solution.
Halle Winkler (Politepix)
I hear you, and I don’t find your request unreasonable. The reason it doesn’t do this is that, in order to support a lot more languages than English and Spanish, it was necessary to take the fairly big step of designing a g2p approach and packaging system that could be used at near-realtime speeds by a phone CPU, and that meant that I couldn’t really continue with the roll-your-own approach to acoustic models that was previously possible, because I’m making those g2p files myself with internal tooling. English is an exception because it uses a different g2p method than all the other languages, but I have only had bad experiences with providing support for homemade acoustic models which use the new packaging system. It’s a tradeoff; the (very big) upside to moving away from using the exact Sphinx acoustic model approach is that other languages can now be used with dynamic models and fallback g2p systems, and they are performant.
So, in short, the names that are accepted in that method are the names of the models that can be found here which already have the right packaging and whose characteristics I know about, and which I am willing to support. I can’t support other models and I would be hesitant to create a “bring your own model” API because I’m the one who is going to have to play 20 questions when some developers have weird results from making their own models but don’t lead with that information. My recommendation/request in order to keep things manageable for both of us is to either rename your models for the period of your experimentation so that they are allowed through the method, or make your method change locally until you’re satisfied with the results of your experiments. I hope this explanation is helpful to understanding why that method is unexpectedly picky.
Halle Winkler (Politepix)
Hi Coeur,
Sure, that seems like a good request for the next version, I’ll drop it in the tracker.
August 13, 2017 at 11:29 am in reply to: Can utterances only bring back what is in dictionary? #1032035
Halle Winkler (Politepix)
Sorry, as I said, that is an unexpected result and I don’t have further suggestions for it. Take a look at the post Please read before you post – how to troubleshoot and provide logging info here so you can see the info needed in order to request any more in-depth troubleshooting.
August 13, 2017 at 9:31 am in reply to: Can utterances only bring back what is in dictionary? #1032033
Halle Winkler (Politepix)
That is unexpected, but I’m afraid I don’t have more suggestions.
Halle Winkler (Politepix)
Hi Joe,
Please take a look in the OEPocketsphinxController docs (and search the forums) to learn more about Bluetooth support – it is experimental and best-effort, but there are a few methods available to help you support devices you’re interested in.
August 12, 2017 at 6:36 pm in reply to: Can utterances only bring back what is in dictionary? #1032029
Halle Winkler (Politepix)
OK, I would get rid of the one-syllable single-word entries and see if it improves.
August 12, 2017 at 6:12 pm in reply to: Can utterances only bring back what is in dictionary? #1032027
Halle Winkler (Politepix)
Hi,
Only the part at the end of your previous post beginning with ThisWillBeSaidOnce is a grammar; the other is still a language model.
August 12, 2017 at 10:40 am in reply to: Can utterances only bring back what is in dictionary? #1032025
Halle Winkler (Politepix)
OK, that’s surprising, but vadThreshold would be the available way to address this. If the utterances you are using in the grammar are particularly short, you may wish to make them a bit longer so they are more distinct from each other and less easily substituted for other utterances.
August 11, 2017 at 8:38 pm in reply to: Can utterances only bring back what is in dictionary? #1032023
Halle Winkler (Politepix)
Hi,
Make sure that the vadThreshold is set correctly.
Halle Winkler (Politepix)
You’re welcome – I’m glad to hear that it was something like that and that it’s fixed for you now.
Halle Winkler (Politepix)
Can you both share the contents of your app target (not project) build settings entries “Framework Search Paths” and “Objective-C Bridging Header”, after checking that they seem possible/correct for your project? You can see examples of what they should look like in the sample app.
Halle Winkler (Politepix)
Hi,
These are general Xcode code signing errors, because you are building an app for an SDK that requires code signing but the app hasn’t yet been set to use your code signing identity. If you navigate to the General information pane of the app target, you will see various code signing settings that need to be configured to match your own developer identity or team identity; once they are set, you will be able to build the app.
August 8, 2017 at 5:47 pm in reply to: Can utterances only bring back what is in dictionary? #1032014
Halle Winkler (Politepix)
No, Rejecto works with language models only. I would first start with the stock OpenEars grammar methods and then check out RuleORama if you need to use that grammar approach in realtime.
August 8, 2017 at 3:15 pm in reply to: Can utterances only bring back what is in dictionary? #1032012
Halle Winkler (Politepix)
Welcome,
Yes, take a look in the docs for information about grammars in OpenEars (versus language models, which you are using above), and after looking into that and trying it out, you may possibly also want to investigate the use of RuleORama in case you need it in realtime.
Halle Winkler (Politepix)
Hi,
Sorry, this is a known limitation of the shipped acoustic model.
Halle Winkler (Politepix)
Hi,
It will take a little bit of time for me to check into this, thanks for your patience. The sample app is made via the same approach as the tutorial, so to attempt some self-guided troubleshooting in the meantime you could compare the two projects and see what the difference is.
Halle Winkler (Politepix)
Welcome,
This is the first I’ve heard of it – you’d need to tell me a little bit about which Xcode you’re using, on which OS version, etc, and I can look into it. This is good information to share for any question that is about build issues.
Halle Winkler (Politepix)
When I’ve had problems like these (things I wanted to fix by hand which were too many and too distributed across the language), this is how I’ve handled it: 1) I’ve searched for some canonical list of $WORDS, where in this case they are the words pronounced differently in US and UK English at the word level, 2) got a list of the 5,000 most-used words in the language overall, and 3) taken the intersection of these two lists. At that point you may have a list which is short enough, but still relevant enough, to make it not too terrible of a job to change them manually. If it’s still too much you can reduce 5,000 to something smaller, or vice versa if you discover it isn’t as many common words as you thought.
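A minimal sketch of that intersection step (the file names and the one-word-per-line format are assumptions about how the two lists would be stored; nothing here is an OpenEars API):
import Foundation

// Load a list stored as one lowercased word per line.
func loadWordList(atPath path: String) -> Set<String> {
    guard let contents = try? String(contentsOfFile: path, encoding: .utf8) else { return [] }
    let words = contents.lowercased()
        .split(separator: "\n")
        .map { $0.trimmingCharacters(in: .whitespaces) }
        .filter { !$0.isEmpty }
    return Set(words)
}

// Hypothetical file names for the two source lists.
let differentlyPronounced = loadWordList(atPath: "uk-us-different-pronunciations.txt")
let mostCommonWords = loadWordList(atPath: "top-5000-words.txt")

// The words worth fixing by hand: common words whose UK pronunciation differs from the US one.
let wordsToFix = differentlyPronounced.intersection(mostCommonWords).sorted()
print("\(wordsToFix.count) words to review: \(wordsToFix)")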
Halle Winkler (Politepix)
My strong suspicion is that that table is for converting a voice that uses US phonemes to sound like received pronunciation, because that could be done tolerably by a table (e.g. “er” at the end of a word always sounds like a US “ah”), while converting words which actually have different pronunciations would have to be a long list of exceptional cases, including different accented syllables.
Halle Winkler (Politepix)
Hi,
There are two issues – one is how the phonemes are said (this should be correctly handled by the UK voices) and the other is which phonemes the local pronunciation contains and/or are accented (this can be quite different, for instance in the words aluminum or garage). The CMU dictionary is a US speech dictionary, so as far as I know there is no version of it which will preference UK pronunciations over US ones. It sounds to me like your issue is with the latter case, is that correct?
Halle Winkler (Politepix)
Hi Mateo,
Sorry, there are no exposed APIs for the developer to work with the lookup list specifically, but all of the APIs which work with an acoustic model take an arbitrary path to it, so you can point this elsewhere. I unfortunately can’t offer support for using different acoustic models or altering them because that becomes a very broad topic, sorry I can’t help more!
Halle Winkler (Politepix)
Hi Mateo,
Although I don’t give support for this, you can add pronunciations to the English lookup list (but not any other languages) by editing the file in the bundle, if you are very careful to use working phonemes, match the formatting of other entries, and keep everything alphabetical. You can’t make changes to that functionality at runtime. There is no special reason that adding fake pronunciations wouldn’t work, but it makes more sense in terms of maintainability to do that via substitution at the time of detection (i.e. detect “bike” and process it as if it were “bicycle”).
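A minimal sketch of that substitution approach, assuming the usual OEEventsObserverDelegate hypothesis callback and a hypothetical substitutions table (only the delegate method itself is an OpenEars API; the table and the processing are illustrative):
// Inside your OEEventsObserverDelegate class:
// hypothetical mapping from the word the engine listens for to the word the app should act on.
let substitutions = ["BIKE": "BICYCLE"]

func pocketsphinxDidReceiveHypothesis(_ hypothesis: String!, recognitionScore: String!, utteranceID: String!) {
    // Swap any recognized placeholder words for the words the rest of the app expects.
    let processed = hypothesis
        .split(separator: " ")
        .map { substitutions[String($0)] ?? String($0) }
        .joined(separator: " ")
    print("Acting on hypothesis: \(processed)")
}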
June 2, 2017 at 9:01 am in reply to: AcousticModelChinese Bundle is too large for my project #1031872
Halle Winkler (Politepix)
Welcome,
Sorry, nothing can be removed. Maybe it would be helpful to think about it as (or present it to another team member who is concerned about size as) being the size of 5 photos on a current phone.
Halle Winkler (Politepix)
Welcome,
Check out the FAQ for help with this and other similar issues: https://www.politepix.com/openears/support
Halle Winkler (Politepix)
Hi,
I would need the audio and the unaltered code settings with which the audio was recorded (with my request that VAD is set to the maximum limit where it still can perceive the trigger word when Rejecto is off), according to the instructions in this post:
https://www.politepix.com/forums/topic/how-to-create-a-minimal-case-for-replication/
In your case I would also need to know the distance from the iPad mic to the human speaker and to the music source.
As mentioned, this is not necessarily something where the result is going to be the same behavior between those two implementations because they are not using the same API methods in Sphinx, but I don’t mind taking a look and seeing if there is something to suggest.
Halle Winkler (Politepix)
Hi,
Sorry, to clarify, the spotting of a single trigger word is not actually an intended feature of OpenEars or described as a goal here or in its documentation – this would use a newer part of the Sphinx project which hasn’t been implemented in OpenEars. It does get used that way and I try to help with this situation when it comes up, but Rejecto was designed with the intention to reject OOV for vocabularies with multiple words. Pocketsphinx uses its own keyword spotting API so it isn’t an unexpected result that the outcomes are different. This may be a case in which you’d prefer to simply use Pocketsphinx built for an iOS target, which is supported by that project to the best of my knowledge.
Regardless, I’m happy to continue to try to help you get better results. It isn’t clear to me from your response whether you took my first requested step of not using Rejecto while troubleshooting the VAD. It isn’t an expected result that a word found in your vocabulary that is significantly louder than the background isn’t heard at any VAD setting when Rejecto isn’t on. Is it possible that you’re using a different acoustic model with the pi version?
Halle Winkler (Politepix)
OK, I recommend temporarily removing Rejecto, turning the vadThreshold up to 4.5 and reducing it by increments of .1 until you find the highest value which perceives your word and doesn’t react to the music. Once this is established, re-add Rejecto to reject OOV human speech.
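As a sketch, the setting itself is just a property on the shared controller, set before starting to listen (the 4.5 starting value follows the suggestion above):
// Start high and step down by 0.1 per test run until the trigger word is reliably heard
// and the music no longer triggers recognition.
OEPocketsphinxController.sharedInstance().vadThreshold = 4.5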
Halle Winkler (Politepix)
Hello,
Yes, I have heard of a similar quiet-noise issue with iPads before with the version of the pocketsphinx VAD used in OpenEars. Please don’t use setRemovingNoise/setRemovingSilence in this case. Which language is this with, and please share your Rejecto settings.
Halle Winkler (Politepix)
Hi, please read the post Please read before you post – how to troubleshoot and provide logging info here in order to be able to receive further assistance in these forums, thanks. I will close this and the other topic because they lack logging output, but you can feel free to open a new topic starting with posting your complete logging output in order to get assistance with debugging.
Halle Winkler (Politepix)
Welcome,
Sorry, this is regretfully not a question that is possible for a party outside of your organization to estimate, even approximately.
Best,
Halle
Halle Winkler (Politepix)
OK, feel free to give me the requested debug info if you’d like my debugging assistance.
Halle Winkler (Politepix)
OK, please also show me your code modifications from the sample app.
Halle Winkler (Politepix)
Hi,
There’s no known-to-me reason for that, but it’s likely something relating to the app code (most likely overriding audio sessions needed by OpenEars). Please check out the post Please read before you post – how to troubleshoot and provide logging info here so you can see how to turn on and share the logging that provides troubleshooting information for this kind of issue, thanks. As a troubleshooting measure you can undertake without my help, I recommend adding SaveThatWave to the OpenEars sample app and seeing if you get a different result – if so, something in the app is interfering with the audio session settings needed by OpenEars.
Halle Winkler (Politepix)
Hi,
I wasn’t recommending that you set them all, just recommending that since this isn’t a supported configuration, you give the header a read-through and try and see which settings may help you (I would expect that the most-relevant properties to your question are regarding session mixing and disabling sample rate setting). Sorry, there is no recipe for this situation because it isn’t supported by this project – there are some existing hooks to give you some influence over the behavior of the audio session and it is necessary to do self-directed investigation into whether they help with your issue.
Halle Winkler (Politepix)
Welcome,
The difficulty of supporting issues like this across different configurations is the reason that bluetooth support is considered experimental, so support for this kind of question is very limited as a consequence. However, please check out the OEPocketsphinxController property disablePreferredSampleRate (and related audio session properties) to see if they improve results for you, making sure that the OEPocketsphinxController instance is active before attempting to set its properties.
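A minimal sketch of that order of operations (the setActive call is the activation step from the standard Swift tutorial; treat this as illustrative rather than a verified recipe for your bluetooth setup):
do {
    // Activate the shared controller before touching its properties.
    try OEPocketsphinxController.sharedInstance().setActive(true)
} catch {
    print("Error activating OEPocketsphinxController: \(error)")
}
// Experimental: skip setting the preferred sample rate on the audio session.
OEPocketsphinxController.sharedInstance().disablePreferredSampleRate = true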
Halle Winkler (Politepix)
Hello,
If it isn’t in the acoustic model dictionary a pronunciation will still always be generated.
Halle Winkler (Politepix)
Welcome,
Please check out the post Please read before you post – how to troubleshoot and provide logging info here so you can see how to turn on and share the logging that provides troubleshooting information for this kind of issue, thanks.
Halle Winkler (Politepix)
Hi Alex,
No, there is no current plan to do that, sorry.
Halle Winkler (Politepix)
Good troubleshooting! Glad you found the issue.
Halle Winkler (Politepix)
Thanks. Have you taken any steps to verify the existence of your files at runtime, e.g. https://developer.apple.com/reference/foundation/filemanager/1410277-fileexists ? Getting that error means that a fileExists: check has failed for OpenEars, so I think the best line of investigation for you is whether the steps taken to make your lm available at runtime were successful. Sometimes this can be as simple as the path just being off by one directory level or something being added to an app target but not a test target or vice versa. This could also maybe be related to permissions for the file, but that seems less likely, so I would thoroughly investigate the easy stuff first.
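A quick way to check this, as a sketch (lmPath and dictPath stand in for whatever paths your code hands to startListeningWithLanguageModel):
// Verify at runtime that the generated files are actually where OpenEars will look for them.
let pathsToVerify: [String] = [lmPath, dictPath]
for path in pathsToVerify {
    let exists = FileManager.default.fileExists(atPath: path)
    print(exists ? "Found file at \(path)" : "Missing file at \(path) – check how it was generated or copied")
}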
Halle Winkler (Politepix)
OK, then please show your complete OELogging and verbosePocketsphinx logs for the entire app session – you can read more about this here: https://www.politepix.com/forums/topic/install-issues-and-their-solutions/
Halle Winkler (Politepix)
Hi,
Bitcode isn’t currently supported, you can turn it off in the build settings. There are a few threads about this here.
Please search for some of the troubleshooting threads about linker errors for plugins, this has been diagnosed a few times here. Usually a step was missed from the instructions or tutorial, or a step was taken which isn’t in the OpenEars docs and isn’t helpful, from an external source like Stack Overflow. It isn’t related to order of headers.
Halle Winkler (Politepix)
OK, can you check out what is different about the sample app from your app that results in no OELogging output for you? You ought to get the same results up to the point of error, starting with something like “Starting OpenEars logging for OpenEars version 2.504 on 64-bit device (or build): iPhone running iOS version: 10.300000”