So, you are saying that we should no longer use suspend or resume?
Not at all – please keep using suspend and resume. As far as I know, the reported recognition issue occurred when suspend was called immediately after starting listening as a way of "priming", i.e. skipping the calibration time to give the user the impression of an instant start. That is always going to be bad for a voice activity detection system, whether it calibrates once at startup or samples noise levels on an ongoing basis: the system starts and then immediately loses the very data it needs to establish the initial speech/silence threshold, and is later suddenly asked to perform recognition on speech separated from the original input by some time gap. I've always recommended against doing this as far as I can recall, but now it doesn't even have the upside of appearing to start faster, because the startup time no longer includes a calibration period.
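To make the failure mode concrete, here is a toy illustration (not the library's actual implementation) of a VAD that derives its speech/silence threshold from the first frames it hears. The class and constants are invented for this sketch; the point is only that suspending immediately after start discards the calibration audio, so no baseline ever gets established.

```python
import statistics

class ToyVAD:
    """Toy voice activity detector that calibrates a speech/silence
    threshold from the first frames it hears. Purely illustrative;
    this is not how the real library is implemented."""

    CALIBRATION_FRAMES = 5  # frames of ambient audio needed to calibrate

    def __init__(self):
        self.noise_levels = []   # frame energies seen during calibration
        self.threshold = None    # speech/silence threshold, once known
        self.suspended = False

    def feed(self, frame_energy):
        if self.suspended:
            return None          # suspended: frame is dropped entirely
        if self.threshold is None:
            # Still calibrating: accumulate ambient noise levels.
            self.noise_levels.append(frame_energy)
            if len(self.noise_levels) >= self.CALIBRATION_FRAMES:
                # Threshold = mean ambient energy plus a margin.
                self.threshold = statistics.mean(self.noise_levels) + 10.0
            return None
        return frame_energy > self.threshold  # True means "speech"

# Correct usage: let calibration complete on real ambient audio.
vad = ToyVAD()
for energy in [3, 4, 2, 3, 3]:    # quiet ambient frames
    vad.feed(energy)
print(vad.threshold)               # calibrated: 13.0
print(vad.feed(40))                # loud frame -> True (speech)

# Anti-pattern: suspend immediately after starting to "prime".
primed = ToyVAD()
primed.suspended = True            # suspend right after start
for energy in [3, 4, 2, 3, 3]:
    primed.feed(energy)            # calibration audio is thrown away
primed.suspended = False
print(primed.threshold)            # still None: no baseline was learned
```

In the primed case, recognition later resumes with no idea what "silence" sounds like in the current environment, which matches the misbehavior described above.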
It’s 100% fine to use suspend and resume intermittently for their intended purpose: pausing already-started recognition so it isn’t performed when it isn’t wanted, such as during audio playback or TTS output. For a potentially very long suspension that isn’t intermittent (for instance, the user enters a part of the interface where they might work for a while with no need for a speech UI), you may see better VAD results by fully stopping and then starting again later on demand, and the startup time is short enough that this shouldn’t be onerous as a UX. A little testing should make it clear whether this is advantageous in your case.
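The two patterns above can be sketched as follows. This is a hypothetical wrapper: the names `suspend`, `resume`, `stop`, `start`, and `play_tts` are illustrative stand-ins, not the real API, and the `FakeRecognizer` exists only so the sketch is self-contained.

```python
class FakeRecognizer:
    """Stand-in for a real recognizer; records which calls were made."""
    def __init__(self):
        self.calls = []
    def suspend(self):        self.calls.append("suspend")
    def resume(self):         self.calls.append("resume")
    def stop(self):           self.calls.append("stop")
    def start(self):          self.calls.append("start")
    def play_tts(self, text): self.calls.append(f"tts:{text}")

class SpeechController:
    """Hypothetical controller showing when to suspend/resume
    versus when to stop/start."""
    def __init__(self, recognizer):
        self.recognizer = recognizer

    def speak_prompt(self, text):
        # Brief, intermittent pause: suspend so the recognizer does
        # not hear its own TTS output, then resume immediately after.
        self.recognizer.suspend()
        self.recognizer.play_tts(text)
        self.recognizer.resume()

    def enter_nonvoice_screen(self):
        # Potentially long, non-intermittent pause: fully stop, so the
        # next start recalibrates against the current ambient noise.
        self.recognizer.stop()

    def leave_nonvoice_screen(self):
        self.recognizer.start()

ctrl = SpeechController(FakeRecognizer())
ctrl.speak_prompt("Hello")
ctrl.enter_nonvoice_screen()
ctrl.leave_nonvoice_screen()
print(ctrl.recognizer.calls)
# ['suspend', 'tts:Hello', 'resume', 'stop', 'start']
```

The design choice is simply that suspend/resume preserves the calibrated state across short gaps, while stop/start trades a short startup for a threshold that reflects the environment the user is actually in when they return.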