That is interesting, and it sounds like you’ve tested carefully and arrived at your scoring approach in a considered way. My main recommendation for excluding false positives is to adjust vadThreshold and combine that with Rejecto, specifically Rejecto’s weight argument, rather than relying on scores.
I would recommend retesting a bit to see whether you still need scores, especially if you use Rejecto, because there have been improvements in several updates starting with 2.0, and Rejecto should be notably improved in 2.041. Based on my experience over the last 5 years of OpenEars, I think it is likely that there are classes of users, devices, mics, and operating distances for which a -2000 to -800 cutoff will lead to frustration. As I mentioned, that cutoff would put 100% of my speech squarely in the ambiguous category when I am testing under ideal conditions. I’m a clear speaker with a Northeast US accent, but being female gives a large score reduction even when recognition remains excellent, just because the training corpus is biased towards male speakers. It is challenging to construct tests that take all of the factors which affect scoring into account.
If you retest and decide to keep your scoring but adjust the scale, I recommend keeping your old score cutoffs around in case I revert this scoring change in an update soon. If I can find a good way in the next 12 weeks or so to adjust it back to the previous scale without affecting the positive improvements, I will release an update restoring the old scoring scale. If a lot of time goes by without an obvious fix, I am going to let it go.
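As a rough illustration of what "keeping your old score cutoffs around" could look like, here is a minimal sketch. It is written in Python only to keep it short, and nothing in it is OpenEars API: `ScoreCutoffs`, `OLD_SCALE`, `RETESTED_SCALE`, and `active_cutoffs` are hypothetical names. It also assumes that scores above the upper bound are accepted and scores below the lower bound are rejected, which may not match how your app treats them.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ScoreCutoffs:
    """A pair of score bounds; anything between them counts as ambiguous."""
    reject_below: int
    accept_above: int

    def classify(self, score: int) -> str:
        # Assumption: higher (less negative) scores mean more confidence.
        if score < self.reject_below:
            return "reject"
        if score > self.accept_above:
            return "accept"
        return "ambiguous"


# The band discussed in this thread, kept under its own name so it can be
# restored if a later update brings back the previous scoring scale.
OLD_SCALE = ScoreCutoffs(reject_below=-2000, accept_above=-800)

# Placeholder: set this only after retesting against the current scale.
# No values are suggested here because they depend on your own test data.
RETESTED_SCALE: Optional[ScoreCutoffs] = None


def active_cutoffs() -> ScoreCutoffs:
    """Return the retested cutoffs if they exist, otherwise the old ones."""
    return RETESTED_SCALE if RETESTED_SCALE is not None else OLD_SCALE
```

With this shape, switching back to the old scale is a one-line change (set `RETESTED_SCALE` back to `None`) rather than a hunt for hard-coded numbers.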