Title

虛無假設顯著性考驗的演進、議題與迷思

Parallel Title

Null Hypothesis Significance Testing: Its Evolution, Current Use, Misuse, and Misconceptions

DOI

10.6773/JRMS.201006.0001

Author

Mao-Neng Fred Li (李茂能)

Keywords

Null hypothesis; significance testing; effect size analysis; statistical power; sample size planning

Journal

測驗統計年刊

Volume/Issue and Publication Date

Issue 18, Part 1 (2010/06/01)

Pages

1-22

Language

Traditional Chinese

Chinese Abstract

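This paper aims to clarify the issues and misconceptions surrounding null hypothesis significance testing and to highlight that p-values and effect sizes are equally important. On the issues side, it covers the two main schools of significance testing and their historical context, point versus interval null hypothesis tests, the definitions of α and the p-value, why α must be set in advance, why "*" should not be used to mark significance levels, whether effect size analysis and significance testing are equally important, how large a research sample needs to be, and the replicability of research results. On the misconceptions side, it examines misconceptions about α and the p-value, about the p-value and observed statistical power, and about Type I and Type II errors. To prevent misuse of significance testing, the paper closes by proposing a two-gate procedure for statistical significance testing: the first gate is p-value analysis and the second is effect size analysis. Only by passing both gates can the quality of quantitative research be assured and its conclusions be of greater practical value.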

English Abstract

This paper discusses some uses, misuses, and misconceptions of null hypothesis significance testing (NHST) in current social science research. First, the historical development of the Fisher test of significance and Neyman-Pearson hypothesis testing is briefly described. Second, several critical issues related to NHST are discussed (e.g., α and p-values, point and range estimation). Third, several frequently asked questions about NHST are answered: Why should α be declared before data are collected? Why not use "*" to indicate a significant result? Are effect size analysis and statistical significance testing equally important? What is a sufficient sample size? Fourth, several misinterpretations of NHST are also discussed (e.g., α vs. p-value, p-value vs. statistical power, Type I vs. Type II error rates). Finally, to avoid misuse of NHST and to ensure greater practical or clinical significance, a two-step procedure (p-value analysis followed by effect size analysis) is proposed for evaluating a hypothesis and for quality control.
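The two-step procedure named in the abstract can be illustrated with a minimal Python sketch. This is not the author's implementation: the record does not say which test or effect size index the paper uses, so the sketch assumes a two-group comparison, swaps in a permutation test for the p-value (to stay within the standard library), and uses Cohen's d as the effect size. The data, the function names, and the `min_effect=0.5` threshold are all hypothetical; α is fixed before looking at the data, as the abstract recommends.

```python
import random
import statistics


def cohens_d(a, b):
    """Standardized mean difference using the pooled standard deviation."""
    na, nb = len(a), len(b)
    pooled_var = (
        (na - 1) * statistics.variance(a) + (nb - 1) * statistics.variance(b)
    ) / (na + nb - 2)
    return (statistics.fmean(a) - statistics.fmean(b)) / pooled_var ** 0.5


def permutation_p_value(a, b, n_resamples=5000, seed=0):
    """Two-sided permutation p-value for the difference in group means."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    observed = abs(statistics.fmean(a) - statistics.fmean(b))
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_resamples):
        rng.shuffle(pooled)
        diff = abs(
            statistics.fmean(pooled[: len(a)]) - statistics.fmean(pooled[len(a):])
        )
        if diff >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_resamples + 1)


def two_gate_check(a, b, alpha=0.05, min_effect=0.5):
    """Gate 1: p-value against a pre-declared alpha. Gate 2: effect size.

    min_effect is a hypothetical threshold for practical importance;
    it should come from the research context, not from this sketch.
    """
    p = permutation_p_value(a, b)
    d = cohens_d(a, b)
    return {
        "p": p,
        "d": d,
        "gate1_significant": p < alpha,
        "gate2_meaningful": abs(d) >= min_effect,
    }


# Hypothetical scores for two groups of ten participants each.
treatment = [78, 85, 90, 72, 88, 95, 81, 84, 91, 87]
control = [70, 75, 68, 80, 73, 77, 69, 74, 79, 72]
result = two_gate_check(treatment, control)
print(result)
```

Reporting both gates separately, rather than a single pass/fail flag, keeps a statistically significant but trivially small effect distinguishable from one that is both significant and practically meaningful.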

Subject Classification

Basic and Applied Sciences > Statistics
Social Sciences > Education
References
  1. 柳敦仁(2006)。國立嘉義大學國民教育研究所。
  2. Carstensen, B. (2002). Post-hoc power: Don't do it. Retrieved May 1, 2006 from the World Wide Web: http://www.biostat.ku.dk/~bxc.
  3. 翁秉仁(2000)。Fisher, Ronald Aylmer。Retrieved June 1, 2006 from the World Wide Web: http://episte.math.ntu.edu.tw/people/p_fisher
  4. Thompson, B. (2001). 402 Citations questioning the indiscriminate use of null hypothesis significance tests in observational studies. Retrieved July 1, 2006 from the World Wide Web: http://www.warnercnr.colostate.edu/~anderson/thompson1.html
  5. McLean, A. (2001). On the nature and role of hypothesis tests. Retrieved July 1, 2006 from the World Wide Web: http://Alan.mclean@buseco.monash.edu.au
  6. Yu, C. H. (2006). Don't believe in the null hypothesis? Retrieved Dec. 1, 2006 from the World Wide Web: http://seamonkey.ed.asu.edu/~alex/computer/sas/hypothesis.html
  7. Gill, J. (2007). How do we do hypothesis testing? Retrieved July 4, 2007 from the World Wide Web: http://artsci.wustl.edu/~jgill/papers/hypos.pdf
  8. Cohen, J. (1988). Statistical power analysis for the behavioral sciences (2nd ed.). Hillsdale, NJ: Erlbaum.
  9. Cohen, J. (1992). A power primer. Psychological Bulletin, 112(1), 155-159.
  10. Denis, D. (2001). Inferring the Alternative Hypothesis: Risky Business. Theory & Science.
  11. Faul, F., Erdfelder, E., Lang, A.-G., Buchner, A. (2007). G*Power 3: A flexible statistical power analysis program for the social, behavioral, and biomedical sciences. Behavior Research Methods, 39(2), 175-191.
  12. Gigerenzer, G. (2004). Mindless statistics. The Journal of Socio-Economics, 33, 587-606.
  13. Gliner, J. A., Leech, N. L., Morgan, G. A. (2002). Problems with null hypothesis significance testing (NHST): What do the textbooks say? The Journal of Experimental Education, 71(1), 83-92.
  14. Hubbard, R., Armstrong, J. S. (2006). Why we don't really know what statistical significance means: Implications for educators. Journal of Marketing Education, 28(2), 114-120.
  15. Hubbard, R., Bayarri, M. J. (2003). Confusion over measures of evidence (p's) versus errors (alphas) in classical statistical testing. American Statistician, 57, 171-178.
  16. Kline, R. B. (2004). Beyond significance testing: Reforming data analysis methods in behavioral research. Washington, DC: American Psychological Association.
  17. Lenth, R. (2001). Some practical guidelines for effective sample-size determination. The American Statistician, 55, 187-193.
  18. Linacre, J. M. (2010). Which hypothesis is the null hypothesis? Rasch Measurement Transactions, 23(4), 242.
  19. Lipsey, M. W. (1990). Design sensitivity: Statistical power for experimental research. Newbury Park: Sage.
  20. McLean, J. E., Ernest, J. M. (1998). The role of statistical significance testing in educational research. Research in the Schools, 5(2), 15-22.
  21. Salsburg, D. (2001). The lady tasting tea: How statistics revolutionized science in the twentieth century. New York: Freeman & Company.
  22. Steiger, J. H. (2004). Beyond the F test: Effect size confidence intervals and tests of close fit in the analysis of variance and contrast analysis. Psychological Methods, 9, 163-182.
  23. Sterne, J. A. C. (2002). Teaching hypothesis tests-time for significant change? Statistics in Medicine, 21, 985-994.
  24. 朱國聖(2004)。國立嘉義大學國民教育研究所。
  25. 李茂能(1998)。統計顯著性考驗的再省思。教育研究資訊,6(3),103-115。
  26. 李茂能(2002)。量化研究的品管:統計考驗力與效果值分析。國民教育研究學報,8,1-24。
Cited By
  1. 許育齡(2016)。運用人格、心理與環境因素預測教師教學設計想像力。教育科學研究期刊,61(3),69-98。
  2. 黃啟梧、莊妙仙(2013)。視覺故事感特徵與讀者感受之探討。藝術教育研究,25,77-103。
  3. 張自立、林慧敏、辛懷梓、王國華(2010)。環境安全與衛生通識課程對學生在認知及態度上之影響-以一所國立教育大學為例。臺北市立教育大學學報:教育類,41(2),29-58。
  4. (2014)。中文成語理解的雙翼:語境與可分析性。教育與心理研究,37(1),95-121。