Bioinformatics

Estimating the proportion of true null hypotheses when the statistics are discrete

Dialsingh, I., Austin, S. R., Altman, N. S.

Motivation: In high-dimensional testing problems, π0, the proportion of null hypotheses that are true, is an important parameter. For discrete test statistics, the P values come from a discrete distribution with finite support, and the null distribution may depend on an ancillary statistic, such as a table margin, that varies among the test statistics. Methods for estimating π0 that were developed for continuous test statistics, which depend on a uniform or identical null distribution of P values, may not perform well when applied to discrete testing problems.

Results: This article introduces a number of π0 estimators, the regression and ‘T’ methods, that perform well with discrete test statistics. It also assesses how well methods developed for or adapted from continuous tests perform with discrete tests. We demonstrate the usefulness of these estimators in the analysis of high-throughput biological RNA-seq and single-nucleotide polymorphism data.

Availability and implementation: Implemented in R.

Contact: nsa1@psu.edu or naomi@psu.edu

Supplementary information: Supplementary data are available at Bioinformatics online.