Title

垂直等化連結特性之研究:六種連結方法的比較

Parallel title

A Comparison of the Differences in Linked Score Properties for Vertical Scaling Obtained Using Six Linking Methods

DOI

10.7108/PT.201112.0002

Author

陳煥文(Huan-Wen Chen)

Keywords

equity properties ; linking methods ; linking relationships ; same distribution property ; vertical scaling

Journal

測驗學刊 (Psychological Testing)

Volume and issue / publication date

Vol. 58, No. 4 (2011/12/01)

Pages

559 - 583

Language

Traditional Chinese

English abstract

Six linking methods were used to link a grade 4 math test to a grade 6 math test using a common-item design. The six methods comprised four observed score methods (frequency estimation, Tucker observed linear, Levine observed linear, and IRT observed) and two true score methods (IRT true and Levine true linear). The criteria used for comparison were the similarity of the linking relationships, the same distribution property, first order equity, and second order equity. The grade 4 and grade 6 tests are computerized math tests consisting of twenty-five and twenty-seven items, respectively, and five items are common to both tests. The results indicated a curvilinear linking relationship between the two tests. In general, first order equity tended to be more closely achieved by the true score methods than by the observed score methods. By contrast, the same distribution property and second order equity were more closely achieved by the observed score methods than by the true score methods.

Subject classification
Social Sciences > Psychology
Social Sciences > Education
Cited by
  1. 曾建銘(2020)。三~八年級資料與可能性能力測驗的發展及信效度分析。測驗學刊,67(4),301-331。
  2. 楊心怡,陳柏熹,吳昭容,吳宜玲(2021)。三至九年級學生數學運算能力等化測量與多向度分析。清華教育學報,38(2),111-150。
  3. (2018)。領域特定詞彙知識的測量:三至八年級學生數學詞彙能力。教育研究與發展期刊,14(4),1-40。