Hmm. If you want to do this with RapidEars, I guess the following is possible:
1. Suspend immediately once listening has started. The user “start” interaction will cause recognition to resume. This has the same effect as your short-circuiting idea but uses the API.
2. The user “stop” interaction causes buffers of prerecorded non-speech to be written over the render callback's buffer for enough callbacks to cover 0.7 seconds (that’s about 6 callbacks). It has to be a real recording of quiet, low-noise-level non-speech and not just zeroes, which the VAD will correctly ignore. This should result in a natural exit from the silence detection loop as the silence makes its way into the VAD. The delay shouldn’t be a big deal for you, since live recognition will already have been running and displaying results from the moment the mic stream began, so this is just a formality to get the final hypothesis.
This approach should work for starting and stopping in both PocketsphinxController and PocketsphinxController+RapidEars.
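To make step 2 a bit more concrete, here is a rough sketch in plain C of the buffer-overwriting part. It's only an illustration: the `SilenceInjector` struct and the `injector_*` / `callbacks_for` functions are names I made up, not anything in OpenEars or RapidEars, and the 16 kHz sample rate and 2048-frame callback size are assumptions, so plug in whatever your audio session actually uses. It also ignores synchronization between the UI thread and the audio thread, which you would need to handle in a real app.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper state; none of these names come from OpenEars or RapidEars. */
typedef struct {
    const int16_t *quiet;     /* prerecorded quiet room noise (real audio, not zeroes) */
    size_t quiet_len;         /* number of samples in that recording */
    size_t pos;               /* read position, wraps so the clip loops */
    unsigned callbacks_left;  /* how many more callbacks to overwrite */
} SilenceInjector;

/* Number of callbacks needed to cover duration_ms of audio, rounded up. */
static unsigned callbacks_for(unsigned sample_rate,
                              unsigned frames_per_callback,
                              unsigned duration_ms)
{
    assert(frames_per_callback > 0);
    unsigned long total = (unsigned long)sample_rate * duration_ms / 1000UL;
    return (unsigned)((total + frames_per_callback - 1) / frames_per_callback);
}

static void injector_init(SilenceInjector *inj,
                          const int16_t *quiet, size_t quiet_len)
{
    assert(quiet != NULL && quiet_len > 0);
    inj->quiet = quiet;
    inj->quiet_len = quiet_len;
    inj->pos = 0;
    inj->callbacks_left = 0;
}

/* Call this from the user's "stop" interaction. 700 ms is the 0.7 s from step 2. */
static void injector_arm(SilenceInjector *inj,
                         unsigned sample_rate, unsigned frames_per_callback)
{
    inj->callbacks_left = callbacks_for(sample_rate, frames_per_callback, 700);
}

/* Call this in the audio callback before the samples go on to the VAD.
   Returns 1 if the buffer was overwritten with quiet audio, 0 if left alone. */
static int injector_process(SilenceInjector *inj, int16_t *buf, size_t frames)
{
    if (inj->callbacks_left == 0) {
        return 0;
    }
    for (size_t i = 0; i < frames; i++) {
        buf[i] = inj->quiet[inj->pos];
        inj->pos = (inj->pos + 1) % inj->quiet_len;
    }
    inj->callbacks_left--;
    return 1;
}
```

With the assumed numbers, 0.7 s at 16 kHz is 11200 samples, and 11200 / 2048 rounds up to 6 callbacks, which lines up with the "about 6" estimate above. If your callback size is different, the count will change accordingly.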