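Title

Practical Applications of Mokken Scale Analysis: The Cases of an English Reading Proficiency Test and an Affective Reactions to English Speaking Scale (Mokken量尺分析的實務運用:以「英語閱讀能力測驗」與「英語口說情緒反應量表」為例)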

Parallel Title

Mokken Scale Analysis for Educational and Psychological Testing: Two Applications

Authors

Su-Pin Hung (洪素蘋); Ching-Fan Sheu (許清芳); Joseph Lavallee (李炯方)

Keywords

Mokken scale analysis (Mokken量尺分析); nonparametric item response theory (非參數試題反應理論); monotone homogeneity model (單調同質模型); double homogeneity model (雙重單調同質模型)

Journal

Psychological Testing (測驗學刊)

Volume, Issue, and Publication Date

Vol. 65, No. 2 (2018/06/01)

Pages

181 - 215

Language of Content

Traditional Chinese

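Chinese Abstract

Mokken scale analysis aims to construct a single latent scale that conforms to the nonparametric monotone homogeneity model or the double homogeneity model. Its procedure comprises criteria for selecting items and methods for checking, item by item, whether each item meets the model assumptions. This study applies nonparametric item response theory to data from an English reading proficiency test taken by 2,270 first-year university students and from an affective reactions to English speaking scale completed by 168 junior high school students. Of the 10 dichotomously scored reading items, 5 satisfied the assumptions of the double homogeneity model; of the 20 polytomously scored items on affective reactions to speaking, only 2 violated the monotone homogeneity model. Item responses in both data sets were consistent with the assumption of a single latent construct dimension. The study discusses the steps and methods of Mokken scale analysis, how to obtain the analysis software, the syntax of the analysis code, graphical diagnostics, and the fit between item responses and the nonparametric models, and it offers guidance on interpreting the data along with recommendations for practical application.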

English Abstract

Mokken scale analysis is a scaling procedure for constructing unidimensional scales from test items. It offers an item selection algorithm that partitions a set of items into Mokken scales, and it provides methods for checking the assumptions of two nonparametric item response theory (NIRT) models: the monotone homogeneity model (MHM) and the double homogeneity model (DHM). The present study demonstrates applications of NIRT to empirical data. The scalability of data from part of a reading proficiency test for first-year university students (n = 2,270) and from an affective reactions to speaking questionnaire for high school students (n = 168) was examined to illustrate Mokken scale analysis with dichotomous and polytomous items, respectively. Five items on the ten-item reading proficiency test were scalable according to the DHM. On the affective reactions questionnaire, only two of the twenty items were found to violate the unidimensionality assumption. The paper also demonstrates the use of the open-source software R for unidimensionality and reliability analyses and for graphical diagnostics with the MHM and DHM. We conclude with comments on interpreting the results and on the practical implications of NIRT modeling for educational and psychological testing.
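
The scalability examined in the abstract is conventionally quantified with Loevinger's H coefficient, which compares each observed inter-item covariance with the largest covariance the item proportions allow. The block below is a minimal sketch of that computation for dichotomous items. It is written in Python for illustration only: the paper's own analyses use R, and the function name `scalability` and the toy data are assumptions, not the authors' code or data.

```python
from itertools import combinations


def scalability(data):
    """Loevinger's H for dichotomous (0/1) item scores.

    data: list of respondents, each a list of 0/1 scores on k >= 2 items.
    Returns (pair_h, item_h, scale_h).
    """
    n = len(data)
    k = len(data[0])
    # Proportion of respondents scoring 1 on each item.
    p = [sum(row[i] for row in data) / n for i in range(k)]
    if any(pi in (0.0, 1.0) for pi in p):
        raise ValueError("every item needs both 0 and 1 responses")

    cov, cov_max = {}, {}
    for i, j in combinations(range(k), 2):
        p_both = sum(row[i] * row[j] for row in data) / n
        cov[i, j] = p_both - p[i] * p[j]
        # Largest covariance possible given the two item proportions.
        cov_max[i, j] = min(p[i], p[j]) * (1 - max(p[i], p[j]))

    # Pairwise, item-level, and scale-level coefficients.
    pair_h = {pair: cov[pair] / cov_max[pair] for pair in cov}
    item_h = [
        sum(c for pair, c in cov.items() if i in pair)
        / sum(c for pair, c in cov_max.items() if i in pair)
        for i in range(k)
    ]
    scale_h = sum(cov.values()) / sum(cov_max.values())
    return pair_h, item_h, scale_h


# Made-up responses (4 respondents, 3 items) forming a perfect Guttman pattern.
guttman = [[0, 0, 0], [1, 0, 0], [1, 1, 0], [1, 1, 1]]
pair_h, item_h, scale_h = scalability(guttman)
print(pair_h, item_h, scale_h)
```

For a perfect Guttman pattern such as the toy data, every coefficient equals 1. In practice, a scale-level H of at least 0.3 is the conventional lower bound for treating a set of items as a Mokken scale.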

Subject Classification

Social Sciences > Psychology
Social Sciences > Education