Title

不同測驗難度對精熟標準設定與分數轉換效果之影響

Parallel Title

Study of the Influence of Diverse Test Difficulty on Standard Setting and Score Transformation

DOI

10.7108/PT.200706.0001

Authors

謝進昌(Jin-Chang Hsieh);余民寧(Min-Ning Yu)

Keywords

分數轉換 ; 最大測驗訊息量法 ; 換算古典測驗分數法 ; 精熟測驗 ; 精熟標準設定 ; mastery test ; maximum test information approach ; score transformation ; standard setting ; transformed classical test scores approach

Journal

測驗學刊

Volume/Issue and Publication Date

Vol. 54, No. 1 (2007/06/01)

Pages

1 - 29

Language of Content

Traditional Chinese

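Chinese Abstract

This study examines how far the setting of a mastery standard by the maximum test information approach is affected under different test difficulty conditions, relative to a traditional fixed passing score. In addition, to make the standard-setting results easier to interpret, the study proposes the transformed classical test scores approach for converting between IRT-θ ability values and classical test scores, and examines the effects this conversion may produce in tests of each difficulty type. The analysis shows that, when test difficulty varies, classification with a traditional fixed passing score tends to yield poorer mastery/non-mastery classification consistency because examinees' score distributions become extremely skewed. By comparison, the maximum test information approach, with its stable ability estimation, effectively limits this influence: overall percentage classification consistency remains at a reasonable level, generally 80% or higher. As for score transformation, the conversion generally holds at about the 95% classification consistency level across test types. Viewed from the sample distribution, because the maximum test information approach adjusts the passing standard to the difficulty of the test, pairing it with score transformation further helps, on tests of extreme difficulty, to reduce the number of examinees misclassified as masters or non-masters by the conversion. Finally, the study synthesizes its findings into several recommendations for future researchers and practitioners.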

English Abstract

This study investigates how strongly differences in test difficulty affect mastery/non-mastery classification when the mastery standard is derived from the maximum test information approach, compared with a classical fixed passing score. In addition, to make test results easier to interpret, we propose the transformed classical test scores approach for converting IRT-θ ability estimates to classical test scores, and we examine the effect of this score transformation on tests of different difficulty. The results show that, when test difficulty varies, a classical fixed passing score is applied to an extremely skewed score distribution and yields poor mastery/non-mastery classification consistency. In contrast, the maximum test information approach, owing to its stable estimation of examinees' ability, effectively reduces this influence and maintains an acceptable level of consistent classification, roughly 80% or higher. As for score transformation, classification consistency holds at approximately 95% across the different types of tests. Furthermore, from the perspective of the empirical sample distribution, because the maximum test information approach adjusts the mastery standard to the difficulty of the test, using it together with score transformation effectively lowers the number of examinees misclassified by the transformation when test difficulty is extreme. Finally, some conclusions and suggestions are offered for future researchers and practitioners.

Subject Classification

Social Sciences > Psychology
Social Sciences > Education
References
  1. 謝進昌、余民寧(2005)。以最大測驗訊息量決定通過分數之研究。測驗學刊,52(2),149-176。
  2. Angoff, W. H.,R. L. Thorndike (Ed.)(1971).Educational Measurement.Washington, D.C.:American Council on Education.
  3. Berk, R. A.(1996).Standard setting: The next generation (where few psychometricians have gone before!).Applied Measurement in Education,9(3),215-235.
  4. Berk, R. A.(1984).A guide to criterion-referenced test construction.Baltimore, MD:The Johns Hopkins University Press.
  5. Birnbaum, A.,F. M. Lord,M. R. Novick(1968).Statistical theories of mental test scores (chapters 17-20).Reading, MA:Addison-Wesley.
  6. Cohen, J. A.(1960).A coefficient of agreement for nominal scales.Educational and Psychological Measurement,20,37-46.
  7. Crocker, L.,Algina, J.(1986).Introduction to classical and modern test theory.NY:CBS College Publishing.
  8. Ebel, R. L.(1972).Essentials of educational measurement.Englewood Cliffs, NJ:Prentice-Hall.
  9. Hambleton, R. K.(1998).Proceedings of achievement levels workshop.Washington, DC:National Assessment Governing Board.
  10. Hambleton, R. K.,Novick, M. R.(1973).Toward an integration of theory and method for criterion-referenced tests.Journal of Educational Measurement,10,159-170.
  11. Hambleton, R. K.,Swaminathan, H.(1985).Item response theory: Principles and applications.Boston:Kluwer-Nijhoff Publishing.
  12. Hambleton, R. K.,T. B. Gutkin,C. R. Reynolds(1990).The handbook of school psychology.New York:John Wiley & Sons.
  13. Kane, M.(1994).Validating the performance standards associated with passing scores.Review of Educational Research,64(3),425-461.
  14. Kane, M.,G. J. Cizek (Ed.)(2001).Standard setting: Concepts, methods, and perspectives.Mahwah, NJ:Lawrence Erlbaum Associates.
  15. Kingsbury, G. G.,Weiss, D. J.(1983).New horizons in testing: Latent trait test theory and computerized adaptive testing.New York:Academic Press.
  16. Nedelsky, L.(1954).Absolute grading standards for objective tests.Educational and Psychological Measurement,14,3-19.
  17. Novick, M. R.,Lewis, C.,Jackson, P. H.(1973).The estimation of proportions in m groups.Psychometrika,38,19-45.
  18. Reckase, M. D.(1998).Converting boundaries between National Assessment Governing Board performance categories to points on the National Assessment of Educational Progress score scale: The 1996 science NAEP process.Applied Measurement in Education,11,9-21.
  19. Reckase, M. D.,D. J. Weiss (Ed.)(1983).New horizons in testing: Latent trait test theory and computerized adaptive testing.New York:Academic Press.
  20. Subkoviak, M. J.(1988).A practitioner's guide to computation and interpretation of reliability indices for mastery tests.Journal of Educational Measurement,25,47-55.
  21. Swaminathan, H.,Hambleton, R. K.,Algina, J.(1974).Reliability of criterion-referenced tests: A decision theoretic formulation.Journal of Educational Measurement,11,262-267.
  22. Wang, N.(2003).Use of the Rasch model in standard setting: An item mapping method.Journal of Educational Measurement,40(3),231-253.
  23. Weiss, D. J.,Kingsbury, G. G.(1984).Application of computerized adaptive testing to educational problems.Journal of Educational Measurement,21(4),361-375.
  24. Wiberg, M.(2003).An optimal design approach to criterion-referenced computerized testing.Journal of Educational and Behavioral Statistics,28(2),97-110.
  25. Zimowski, M. F.,Muraki, E.,Mislevy, R. J.,Bock, R. D.(2003).BILOG-MG for Windows (version 3).Chicago, IL:Scientific Software International, Inc.
  26. 考選部全球資訊網
  27. 教育改革總諮議報告書(第三章綜合建議)
  28. 吳裕益(1986)。博士論文(博士論文)。台北市,國立政治大學教育研究所。
  29. 汪慧瑜、余民寧(2006)。量尺分數的另類表示方法:以國中基本學力測驗為例。測驗學刊,53(2),205-238。
  30. 國中基本學力測驗推動工作委員會
  31. 鄭明長、余民寧(1994)。各種通過分數設定方法之比較。測驗年刊,41,19-40。
  32. 謝進昌(2006)。精熟標準設定方法的歷史演進與詮釋的新概念。國民教育研究學報,16,157-193。
Cited By
  1. 顧炳宏、溫媺純、陳瓊森(2014)。以實作評量方式探討引導發現式教學模式之學習成效―以「聲音」概念為例。科學教育學刊,22(1),57-86。
  2. 劉秀丹(2019)。國小兒童臺灣手語理解能力測驗之編製及其在啓聰學校之應用。特殊教育研究學刊,44(1),91-117。