English Abstract
|
This study investigates how the classification of examinees as masters or non-masters is affected when the mastery standard is derived from the maximum test information approach rather than from a classical fixed passing score, under varying conditions of test difficulty. In addition, to make test results easier to interpret, we propose a transformed classical test score approach that converts IRT θ ability estimates into classical test scores, and we examine the effect of this score transformation on tests of different difficulty. The analysis shows that a classical fixed passing score can yield an extremely skewed score distribution and hence poor mastery/non-mastery classification. By contrast, because the maximum test information approach rests on a stable calibration of examinees' ability, it effectively reduces the influence of test difficulty and maintains an acceptable exact classification rate of 80%. The score transformation retains approximately 95% exact classification across tests of different difficulty. Furthermore, the empirical sample distributions indicate that, when test difficulty is extreme, using the score transformation together with the maximum test information approach effectively lowers the number of classification errors that arise from the transformation. This is because the maximum test information approach adjusts the mastery standard according to the difficulty of the test. Finally, conclusions and suggestions for future use are offered.
|