OK, thanks for the info. The silence detection is correct: it just means that a long enough pause was detected to conclude the utterance and start recognition, which is normal and required.

b) I have noticed that every time the background noise changed (music on/off) my first three commands went to waste and after that it started recognising. This happens even when I turn the music off so there is no background noise.

This might just be the amount of time needed to recover from an abrupt change in noise level. The voice audio detection needs a bit of time to adapt, and it deliberately doesn’t make abrupt changes, because otherwise any idiosyncratic buffer (like a truck driving by) would have a dramatic effect on activity detection.
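
To make that trade-off concrete, here is a rough Python sketch of a slowly adapting noise floor. It is not the OpenEars or Pocketsphinx implementation. The function names, the smoothing factor `alpha`, the threshold `ratio` and all the energy values are assumptions picked only for illustration.

```python
def track_noise_floor(frame_energies, alpha=0.1, initial_floor=None):
    """Return the noise-floor estimate after each frame.

    Hypothetical model: the floor is an exponential moving average of the
    frame energy, so a small alpha means slow adaptation.
    """
    floor = frame_energies[0] if initial_floor is None else initial_floor
    history = []
    for energy in frame_energies:
        # Move only a fraction (alpha) of the way towards the new energy.
        floor = (1 - alpha) * floor + alpha * energy
        history.append(floor)
    return history


def is_speech(energy, floor, ratio=3.0):
    """Treat a frame as speech if it is well above the current floor."""
    return energy > ratio * floor


def first_detectable_frame(floor_history, speech_energy, ratio=3.0):
    """Index of the first frame at which speech of the given energy would
    clear the threshold, or None if it never does."""
    for index, floor in enumerate(floor_history):
        if is_speech(speech_energy, floor, ratio):
            return index
    return None


if __name__ == "__main__":
    # One loud buffer (the truck) in otherwise quiet audio.
    spike = track_noise_floor([1.0] * 10 + [100.0] + [1.0] * 10)
    print("floor right after the loud buffer:", round(spike[10], 2))

    # Music switched off: the floor starts high and decays towards quiet.
    after_music = track_noise_floor([1.0] * 40, initial_floor=50.0)
    print("frames until speech at energy 20 is detectable:",
          first_detectable_frame(after_music, speech_energy=20.0))
```

In this toy model a single loud buffer moves the floor only part of the way towards the spike, and for the same reason the floor stays high for a while after the music stops. Speech that would normally be detected sits below the threshold until the floor has decayed, which would look like the first few commands being lost.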

I also had a crash after two minutes of inactivity.

This is a big deal and I’d appreciate any info you can give me. Please turn on OpenEarsLogging and verbosePocketsphinx, reproduce the crash, and type “bt” when you see the blue lldb prompt in the console. Then show me the output of that backtrace as well as the complete logging that OpenEars outputs, starting at the beginning.