Reply To: Detecting single letters in the alphabet

April 27, 2011 at 12:48 pm #3994

Politepix

OK, I think the first easy step is to get rid of the pronunciations that are in the dictionary that you definitely don’t want to recognize. I realize this isn’t at all self-evident so I’ll explain briefly. If you look at this from the dictionary:

A AH
A(2) EY

That means that the language model tool gave you back two possible pronunciations for the word A. The first one is the particular NA pronunciation of the article “a” as in “a dog barked” that rhymes with “huh”. Since you don’t ever want to recognize that pronunciation of “a” because the alphabet character is never pronounced that way, you should erase that pronunciation from your dictionary.

The (2) in parentheses just means that it is the second pronunciation of the word, so the way you would want to replace

A AH
A(2) EY

is with the line

A EY

deleting the first pronunciation, and removing the (2) from the second pronunciation since it is now the only pronunciation you are going to accept.

The next thing that you can do is to make the sentence “A B” part of your corpus. The corpus can have individual words, but it can also contain combinations of words. Combinations of words that you have made part of your corpus will have an automatically higher probability of being detected.

So, the corpus would say something like this:

A
B
A B
C
D
E
F
G
H
I
J
K
L
M
N
O
P
Q
R
S
T
U
V
W
X
Y
Z
ONE
TWO
THREE
FOUR
FIVE
SIX
SEVEN
EIGHT
NINE
ZERO

You can do this for all of the possible combinations if you want to, or just the ones where you want to raise their probability of being detected. When you look at the language model that is output, you will see that there is a 2-gram entry for A B and that it has a raised probability.