I do this as a diagnostic technique:
* Build a dictionary with 40 words, each word being just one of the CMU phonemes
* Build a language model or FSG where each of the words can follow any other word
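The two steps above can be sketched in a few lines of Python. This is only a rough illustration, not a tested recipe. It assumes the 39 ARPAbet phones used by cmudict (your acoustic model's phone set may differ, for example by adding SIL, which would give 40), and the file names `phones.dict` and `phones.gram` are made up. Rather than a language model or FSG, it writes a JSGF grammar, which is an assumption about what your decoder accepts.

```python
# Assumption: the 39 ARPAbet phones used by cmudict. Check this against
# the phone list shipped with your acoustic model before relying on it.
PHONES = (
    "AA AE AH AO AW AY B CH D DH EH ER EY F G HH IH IY JH K L M N NG "
    "OW OY P R S SH T TH UH UW V W Y Z ZH"
).split()


def make_dictionary(phones):
    """One dictionary entry per phone: the 'word' AA is pronounced AA."""
    return "".join(f"{p} {p}\n" for p in phones)


def make_jsgf(phones, name="phoneloop"):
    """A JSGF grammar in which any phone-word can follow any other."""
    alternatives = " | ".join(phones)
    return (
        "#JSGF V1.0;\n"
        f"grammar {name};\n"
        f"public <{name}> = ( {alternatives} )+ ;\n"
    )


if __name__ == "__main__":
    # Hypothetical output file names; use whatever your setup expects.
    with open("phones.dict", "w") as f:
        f.write(make_dictionary(PHONES))
    with open("phones.gram", "w") as f:
        f.write(make_jsgf(PHONES))
```

Point the decoder at the generated dictionary and grammar in place of your usual ones, and the hypothesis comes back as a string of phones rather than words.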
Fair warning: the results will be very strange.
tidigits is a very “clean” grammar; it works well with fluent speech. And, like most 8 kHz models, it’s most effective on adult males. Kids (and women) do better with 16 kHz models. I’d suggest switching over to VoxForge 0.4, from the CMU site. To make this work well, though, you really need models built from kids’ speech.
You could try model adaptation.