Title

A Robust Estimation of the Proportion of True Null Hypotheses Based on a Beta Mixture Model

DOI

10.29973/JCSA.201106.0001

Authors

Chun-Chao Wang; Yu-Hsing Lin; Yi-Ting Hwang

Keywords

Microarray gene data; beta mixture model; gene expressions; multiplicity testing

Journal

中國統計學報 (Journal of the Chinese Statistical Association)

Volume, Issue / Publication Date

Vol. 49, No. 2 (2011/06/01)

Pages

43-59

Language

English

English Abstract

Microarrays allow investigators to assess the expression of thousands of genes and to identify those whose expression levels differ across groups. Benjamini and Hochberg (2000) proposed an adaptive false discovery rate (FDR) controlling procedure that controls the overall type I error rate. However, their method requires the proportion of true null hypotheses (π0), which is unknown and has to be estimated. In this study, we propose a robust method to estimate π0 based on the beta mixture model that Allison et al. (2002) developed for the distribution of the p-values from multiplicity testing. A Monte Carlo study shows that our method outperforms the original approach of Allison et al. (2002) when the gene expressions are correlated. We also compare our results with those of the estimation method proposed by Benjamini and Hochberg (2000). A case study illustrates the feasibility of the proposed method.
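To make the two ingredients named in the abstract concrete, the following is a minimal sketch, not the authors' robust estimator. It assumes the p-values follow a mixture of a Uniform(0, 1) component with weight π0 and a single Beta(a, b) component, estimates π0 by a coarse grid-search maximum likelihood (swapped in here for a numerical optimizer to keep the example dependency-free), and then applies a Benjamini-Hochberg step-up rule in which the number of tests m is replaced by the estimate π0·m. All function names, grids, and simulation settings are hypothetical choices made for illustration.

```python
import math
import random


def beta_pdf(p, a, b):
    """Density of Beta(a, b) at p, with p clipped away from 0 and 1."""
    p = min(max(p, 1e-12), 1.0 - 1e-12)
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return math.exp(log_norm + (a - 1.0) * math.log(p) + (b - 1.0) * math.log(1.0 - p))


def mixture_loglik(pvals, pi0, a, b):
    """Log-likelihood under pi0 * Uniform(0, 1) + (1 - pi0) * Beta(a, b).

    Assumes pi0 > 0 so that the mixture density is strictly positive.
    """
    return sum(math.log(pi0 + (1.0 - pi0) * beta_pdf(p, a, b)) for p in pvals)


def fit_pi0_grid(pvals, pi0_grid, a_grid, b_grid):
    """Coarse grid-search maximum likelihood; returns the best (pi0, a, b)."""
    candidates = ((pi0, a, b) for pi0 in pi0_grid for a in a_grid for b in b_grid)
    return max(candidates, key=lambda t: mixture_loglik(pvals, *t))


def adaptive_bh(pvals, pi0_hat, q=0.05):
    """Step-up BH rule with m replaced by m0_hat = pi0_hat * m.

    Returns the sorted indices of the rejected hypotheses.
    """
    m = len(pvals)
    m0_hat = max(pi0_hat * m, 1.0)
    order = sorted(range(m), key=lambda i: pvals[i])
    k = 0  # largest rank whose ordered p-value falls under its threshold
    for rank, i in enumerate(order, start=1):
        if pvals[i] <= rank * q / m0_hat:
            k = rank
    return sorted(order[:k])


if __name__ == "__main__":
    # Simulated, independent p-values (illustrative settings only):
    # 800 true nulls and 200 alternatives drawn from Beta(0.3, 4).
    random.seed(0)
    pvals = [random.random() for _ in range(800)]
    pvals += [random.betavariate(0.3, 4.0) for _ in range(200)]

    pi0_grid = [0.5 + 0.05 * i for i in range(10)]
    pi0_hat, a_hat, b_hat = fit_pi0_grid(pvals, pi0_grid, [0.2, 0.3, 0.5, 0.8], [1.0, 2.0, 4.0, 8.0])
    rejected = adaptive_bh(pvals, pi0_hat, q=0.05)
    print("estimated pi0:", pi0_hat, "rejections:", len(rejected))
```

The simulation draws independent p-values, so it does not reproduce the correlated setting studied in the paper; it only shows how an estimate of π0 feeds into the adaptive rejection threshold.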

Subject Classification: Basic and Applied Sciences > Statistics
References
  1. Allison, D. B., Gadbury, G. L., Heo, M. (2002). A mixture model approach for the analysis of microarray gene expression data. Computational Statistics and Data Analysis, 39, 1-20.
  2. Benjamini, Y., Hochberg, Y. (1995). Controlling the false discovery rate: A practical and powerful approach to multiple testing. Journal of the Royal Statistical Society: Series B, 57, 289-300.
  3. Benjamini, Y., Hochberg, Y. (2000). On the adaptive control of the false discovery rate in multiple testing with independent statistics. Journal of Educational and Behavioral Statistics, 25, 60-83.
  4. Golub, T. R., Slonim, D. K., Tamayo, P., Huard, C., Gasenbeek, M., Mesirov, J. P., Coller, H., Loh, M. L., Downing, J. R., Caligiuri, M. A., Bloomfield, C. D., Lander, E. S. (1999). Molecular classification of cancer: Class discovery and class prediction by gene expression monitoring. Science, 286, 531-537.
  5. Hwang, Y. T. (2011). Comparisons of estimators of the number of true null hypotheses and adaptive FDR procedures in multiplicity testing. Journal of Statistical Computation and Simulation, 81(2), 207-220.
  6. Lagarias, J. C., Reeds, J. A., Wright, M. H., Wright, P. E. (1998). Convergence properties of the Nelder-Mead simplex method in low dimensions. SIAM Journal on Optimization, 9(1), 112-147.
  7. Lin, Y. H. (2010). National Taipei University.
  8. Parker, R. A., Rothenberg, R. B. (1988). Identifying important results from multiple statistical tests. Statistics in Medicine, 7, 1031-1043.
  9. Schweder, T., Spjøtvoll, E. (1982). Plots of p-values to evaluate many tests simultaneously. Biometrika, 69, 493-502.