Corrected Confidence Interval Calculation

While updating YHRD to release R59, we migrated our Confidence Interval implementation from the R package ‘binom’ to a simpler, dependency-free calculation based on the beta distribution, which is included in R by default. Unfortunately, we introduced a minor bug when evaluating the lower bounds in cases of no matches. The correct Clopper-Pearson method evaluates the corresponding quantiles of the beta distribution, but there are two special cases in which those quantiles cannot be evaluated: (a) when there are no matches (no successes, X = 0), and (b) when all haplotypes in the database are matching (all trials are successes, X = n). We have updated our calculations to cover those cases. The exact formulas used at YHRD are taken from Thulin, Måns (2014). “The cost of using exact confidence intervals for a binomial proportion”. Electronic Journal of Statistics. 8 (1): 817–840, formula (4) and subsequent notes.

Additionally, the Confidence Interval for the expected count (n+1)/(N+1) was mistakenly calculated from the actual observed count. This bug has been fixed as well.

(posted over 5 years ago)

* See FAQ/Glossary (http://yhrd.org/pages/faq) for further explanations of abbreviated terms used here