Yes, sorry, I would say this is expected behavior when a vocabulary is so large that it contains several perfect rhyming matches for OOV (out-of-vocabulary) utterances, and those utterances are spoken directly into the device at the same level as the rest of the interaction. If Rejecto could be weighted highly enough to override a large vocabulary’s ability to match these utterances, there would be too much confusion to confidently identify in-vocabulary utterances as well. It is the same with something like Siri: if you insert an OOV word directly into an otherwise contextually understandable Siri interaction (e.g. a name Siri doesn’t know, a foreign word, or medical jargon that isn’t in the relevant vocabulary), it will either be transcribed as something else or the question will be thrown out.
Only in the last couple of device generations has it been reasonable to use a 1000-word vocabulary in OpenEars, so Rejecto is geared more towards a smaller vocabulary that has some “holes” in it, meaning it doesn’t contain multiple rhyming matches for the majority of OOV utterances.
Is this a realistic interaction for your app: clear speech directed at it intentionally under ideal circumstances, but speech the app doesn’t know? Generally OOV rejection is optimized more for tuning out speech that isn’t intended for the app, ruling out non-speech sounds rather than submitting them for hypotheses, and otherwise avoiding input of a more accidental nature. Users directing a lot of clear, intentional speech at the app that the app doesn’t know about may fall more under user education than something needing a technological fix. It depends a bit on how you see this occurring.
Regarding the very large vocabulary, did you know that you can switch between smaller vocabularies almost instantly? So if your requirement is to present a sentence the user is supposed to say correctly, you can attempt to recognize just the words of that sentence plus the Rejecto phonemes, rather than the entire vocabulary at once.
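To make the per-sentence idea concrete, here is a rough, language-neutral sketch in Python of how the small word lists could be derived. The function names are hypothetical and nothing here is OpenEars or Rejecto API; the uppercase normalization is also just an assumption about what your model generation step expects. Each resulting list would then be handed to whatever language model generation and switching calls your OpenEars/Rejecto version provides.

```python
import re

def sentence_vocabulary(sentence):
    """Return the unique words of one target sentence, uppercased, in first-seen order.

    Hypothetical helper for illustration only; not part of OpenEars or Rejecto.
    """
    # Keep letters and apostrophes so contractions like "DON'T" stay one word.
    words = re.findall(r"[A-Za-z']+", sentence.upper())
    seen = set()
    vocabulary = []
    for word in words:
        if word not in seen:
            seen.add(word)
            vocabulary.append(word)
    return vocabulary

def vocabularies_for(sentences):
    """Map each prompt sentence to its own small vocabulary."""
    return {sentence: sentence_vocabulary(sentence) for sentence in sentences}

# Example: two prompts, each with its own short word list instead of one
# shared 1000-word vocabulary.
prompts = ["The cat saw the cat", "Don't stop now"]
per_prompt = vocabularies_for(prompts)
```

The point of the design is that each prompt only ever competes against its own handful of words, which leaves far more “holes” for Rejecto to catch OOV speech than a single large vocabulary would.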