Title

廣義逐次機率比檢定在等第制分數評量的應用

Parallel Title

Application of Generalized Sequential Probability Ratio Test to Ranking Evaluation

DOI

10.7108/PT.201203.0033

Author

盧宏益(Hung-Yi Lu)

Keywords

sequential multi-hypothesis test (多重逐次檢定) ; grade ranking evaluation (等第制) ; generalized sequential probability ratio test (廣義逐次機率比檢定)

Journal

測驗學刊

Volume and Issue / Publication Date

Vol. 59, No. 1 (2012/03/01)

Pages

33 - 48

Language

Traditional Chinese

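Chinese Abstract

The Ministry of Education stipulates that assessments in elementary and junior high schools be converted to "five-level grade" scores, so that students do not engage in destructive competition and comparison over small score differences and can develop in more diverse ways. Grade-based scoring can be recast as a statistical sequential multi-hypothesis testing problem. This study uses the generalized sequential probability ratio test to handle the sequential multi-hypothesis testing problem and applies it to grade-based scoring in computerized testing. The results show that, when the number of test items is unrestricted, classification accuracy exceeds 80% in every case; when the number of grade categories increases, classification accuracy changes little, and the main effect is an increase in the number of items administered. When the number of test items is limited, the likelihood-ratio testing concept handles more effectively the problem of being unable to rate an examinee's ability when the test terminates.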

English Abstract

To keep students from competing destructively over small score differences, the Ministry of Education requires that grades be converted from a continuous scoring system to a ranking system that uses only five ratings to evaluate a student's ability. Grade ranking evaluation can be viewed as a sequential multi-hypothesis testing problem. In this paper, we apply the generalized sequential probability ratio test (GSPRT) to this problem in computerized testing. When the number of test items is unrestricted, the correct decision rate of the GSPRT exceeds 80%; increasing the number of grade categories changes the correct decision rate little and mainly increases the number of items administered. When the number of test items is limited, a decision based on the likelihood ratio handles more effectively the cases in which the examinee's ability could not otherwise be rated by the time the test ends.
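
The abstract does not spell out the decision rule, so the following is only a rough illustration of how a sequential multi-hypothesis classification of this kind might be organized. It assumes a Rasch item model, one representative ability value per grade, and a pairwise likelihood-ratio stopping rule in the style of Armitage's multi-hypothesis procedure. The names `rasch_p`, `classify_sequential`, and `grade_thetas`, and the threshold of 19, are hypothetical choices made for this sketch and are not taken from the paper.

```python
import math


def rasch_p(theta, b):
    """Probability of a correct response under a Rasch model (assumed here)."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))


def classify_sequential(responses, difficulties, grade_thetas,
                        threshold=19.0, max_items=None):
    """Classify an examinee into one of k grades, item by item.

    Each grade is represented by one ability value (a simplifying assumption).
    After every item, the test stops early if the best grade's likelihood
    exceeds every other grade's likelihood by at least `threshold`.
    If the item limit is reached first, the grade with the largest
    likelihood is returned, so the examinee is always rated.

    Returns (grade_index, items_used, stopped_early).
    """
    k = len(grade_thetas)
    log_lik = [0.0] * k
    log_a = math.log(threshold)
    limit = len(responses) if max_items is None else min(max_items, len(responses))

    for n in range(limit):
        u, b = responses[n], difficulties[n]
        # Update the log-likelihood of every grade hypothesis.
        for i, theta in enumerate(grade_thetas):
            p = rasch_p(theta, b)
            log_lik[i] += math.log(p if u == 1 else 1.0 - p)
        # Check the pairwise log-likelihood ratios against the threshold.
        best = max(range(k), key=log_lik.__getitem__)
        if all(log_lik[best] - log_lik[j] >= log_a for j in range(k) if j != best):
            return best, n + 1, True

    # Item limit reached: fall back to the most likely grade.
    best = max(range(k), key=log_lik.__getitem__)
    return best, limit, False


if __name__ == "__main__":
    grades = [-2.0, -1.0, 0.0, 1.0, 2.0]   # hypothetical ability value per grade
    items = [0.0] * 40                      # hypothetical item difficulties
    print(classify_sequential([1] * 40, items, grades))
    print(classify_sequential([1] * 40, items, grades, max_items=5))
```

The second call shows the truncated case: with only five items the stopping threshold is not reached, and the fallback to the most likely grade still yields a rating.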

Subject Classification: Social Sciences > Psychology
Social Sciences > Education