Title

電腦化適性測驗題庫擴增研究

Parallel title

The Research of Expanding Item Bank for Computerized Adaptive Test

DOI

10.6773/JRMS.200912.0019

Authors

吳慧珉(Huey-Min Wu);蘇少祖(Shau-Tzu Su);趙佑軒(Yu-Hsaun Chao)

Keywords

computerized adaptive test ; item bank ; item response theory ; parameter estimation

Journal

測驗統計年刊

Issue / publication date

No. 17, Part 2 (2009/12/01)

Pages

19-53

Language

Traditional Chinese

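Chinese abstract (translated)

This simulation study examines how, when an item bank is updated, the accuracy of item parameter estimates can be balanced against the speed of updating, and whether, once the item parameter estimates have stabilized, responses to both the adaptive test items and the new items can improve the accuracy of ability estimates. Maintaining an item bank takes considerable labor and cost, so holding a new large-scale pretest for every new item, as was done when the bank was first built, would add to that cost. The study therefore investigates adding uncalibrated items incrementally to a calibrated bank during computerized adaptive testing and, on the basis of item response theory, estimating the parameters of the uncalibrated items together with the ability parameters, so that the bank can be updated and expanded online without additional cost. The results are as follows: 1. Whether 5 or 10 new items are added has little effect on the estimation of the new item parameters. 2. As the number of examinees increases, the estimation errors of the ability and item parameters gradually decrease. 3. When examinee ability follows a normal or bimodal distribution, including the new items in ability estimation not only yields estimates of the new item parameters but also gives more accurate ability estimates; when true ability follows a skewed distribution, however, neither the new item parameters nor the abilities can be estimated as accurately. 4. When ability follows a normal or bimodal distribution, the ability estimation error is lower with 10 new items than with 5.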

English abstract

A simulation study was conducted to evaluate how accurately un-calibrated item parameters can be estimated, and how ability estimates are affected by different factors, when an item bank is expanded while a computerized adaptive test is being administered. The main idea of this research was to have examinees answer not only calibrated items but also un-calibrated ones. The abilities obtained from the calibrated items were used to estimate the parameters of the un-calibrated items, so the item bank could be expanded without pretest procedures. In the simulation study, three factors and their varied conditions were considered: different distributions of ability, different numbers of examinees, and different numbers of un-calibrated items. The major findings of this study are summarized as follows: 1. The root mean square errors (RMSEs) of the un-calibrated item parameters were not affected by the number of un-calibrated items. 2. The RMSEs of the item and ability parameters decreased as the number of examinees increased. 3. Under a normal or bimodal distribution of ability, adding un-calibrated items to the ability estimation decreased the RMSEs of ability; under a skewed distribution of ability, the RMSEs of both the item and ability parameters were higher than under the normal and bimodal distributions. 4. Under a normal or bimodal distribution of ability, the RMSEs of ability estimated with 10 added un-calibrated items were lower than those estimated with 5 added un-calibrated items.
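
The calibration idea summarized in the abstract can be illustrated with a minimal sketch. This is not the authors' procedure: the abstract does not name the IRT model or the estimation method, so the sketch assumes a two-parameter logistic (2PL) model, stands in for the adaptive-test ability estimates with true abilities plus Gaussian noise, and fits one new item by Newton-Raphson maximum likelihood while treating those ability estimates as fixed. All names and values (`calibrate_new_item`, `simulate`, `a_true`, `b_true`, `noise_sd`) are hypothetical.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def calibrate_new_item(theta_hat, responses, iterations=15):
    """Fit assumed 2PL parameters (a, b) of one new item by Newton-Raphson,
    treating the ability estimates theta_hat as fixed, known values."""
    slope, intercept = 1.0, 0.0  # logit P(correct) = slope * theta + intercept
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, y in zip(theta_hat, responses):
            p = sigmoid(slope * x + intercept)
            w = p * (1.0 - p)
            g0 += y - p          # gradient with respect to the intercept
            g1 += (y - p) * x    # gradient with respect to the slope
            h00 += w             # entries of the information matrix
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        intercept += (h11 * g0 - h01 * g1) / det
        slope += (h00 * g1 - h01 * g0) / det
    # Convert back to the 2PL form a * (theta - b).
    return slope, -intercept / slope


def rmse(estimates, truth):
    """Root mean square error of a list of estimates around one true value."""
    return math.sqrt(sum((e - truth) ** 2 for e in estimates) / len(estimates))


def simulate(n_examinees, a_true=1.2, b_true=0.5, noise_sd=0.3, seed=0):
    """One replication: normal abilities, noisy ability estimates (an assumed
    stand-in for CAT estimates), simulated responses, then calibration."""
    rng = random.Random(seed)
    theta = [rng.gauss(0.0, 1.0) for _ in range(n_examinees)]
    theta_hat = [t + rng.gauss(0.0, noise_sd) for t in theta]
    responses = [
        1 if rng.random() < sigmoid(a_true * (t - b_true)) else 0
        for t in theta
    ]
    return calibrate_new_item(theta_hat, responses)


# Usage: RMSE of the difficulty estimate over 10 replications per sample size.
for n in (200, 800, 3200):
    b_hats = [simulate(n, seed=s)[1] for s in range(10)]
    print(n, round(rmse(b_hats, 0.5), 3))
```

Because the ability estimates are treated as error-free, the fitted discrimination is typically pulled somewhat toward zero. Replacing the normal draw in `simulate` with a bimodal or skewed one would be one way to explore the distribution conditions the abstract describes.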

Subject classification

Basic and Applied Sciences > Statistics
Social Sciences > Education