Title

題型對學生數學表現水準之影響-以相似形為例

Parallel title

Effects of Item Type on Student Mathematics Performance: Similar Figures as an Example

DOI

10.6209/JORIES.202109_66(3).0008

Authors

陳建亨(Chien-Heng Chen);楊凱琳(Kai-Lin Yang)

Keywords

相似形 ; 測驗等化 ; 試題反應理論 ; 題目類型 ; similar figure ; test equating ; item-response theory ; item type

Journal

教育科學研究期刊 (Journal of Research in Education Sciences)

Volume/Issue and publication date

Vol. 66, No. 3 (2021/09/01)

Pages

247-277

Language

Traditional Chinese

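Abstract (translated from the Chinese)

This study examined how item type affects item difficulty, item discrimination, and ninth graders' mathematics performance levels. Using the competence indicators for similar figures as the test content, the researchers wrote multiple-choice, fill-in-the-blank, and calculation-and-reasoning items that shared the same item stems, and assembled them into three test forms that included six common items for test equating. The three forms were taken by 361, 411, and 378 examinees, respectively, for a total of 1,150. The study used a common-item nonequivalent groups design, with equating carried out by the concurrent calibration method. A unidimensional two-parameter partial credit model and a multidimensional latent regression model were each used to estimate item difficulty, item discrimination, and the ability parameters students showed under each item type; the students' ability estimates for each item type were then ranked and converted to percentile ranks (PRs). The findings were: (1) multiple-choice items had the lowest mean difficulty and calculation-and-reasoning items the highest; (2) multiple-choice items had the lowest mean discrimination and calculation-and-reasoning items the highest; (3) in the unidimensional analysis, the PRs of the ability estimates for the three item types differed by 20 or more for 69.04% of the students; (4) in the multidimensional latent regression analysis, the PRs of the ability estimates for the three item types differed by 15 or more for 12.61% of the students. The results indicate that the type of test item is related to students' mathematics performance level, chiefly through two distinct patterns of influence. The article closes with a further discussion of why the two groups of students have an advantage on different item types.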

English abstract

This study investigated the influence of item type on item difficulty, item discrimination, and ninth graders' mathematics performance. The competence indicators for similar figures served as the test content. Multiple-choice items, completion items, and essay items were written from the same stems and assembled into three test forms, each of which included six common items for test equating. The tests were administered to 1,150 students, with 361, 411, and 378 students taking the three forms, respectively. The tests were equated using a common-item nonequivalent groups design and the concurrent calibration method. Item difficulty, item discrimination, and the ability parameters of the students under the different item types were estimated with a unidimensional two-parameter partial-credit model and a multidimensional latent regression model. The estimates of the students' ability parameters were converted to percentile ranks (PRs). The results were as follows: (1) The average difficulty of the multiple-choice items was the lowest, and that of the essay items was the highest. (2) The average discrimination of the multiple-choice items was the lowest, and that of the essay items was the highest. (3) The unidimensional model showed that, for 69.04% of the students, the PRs of the ability estimates for the three item types differed by 20 or more. (4) The multidimensional latent regression model showed that, for 12.61% of the students, the PRs of the ability estimates for the three item types differed by 15 or more. The results indicate that item type is related to student mathematics performance through two main types of effect. The article closes by discussing why some students performed better on the completion and essay items than on the multiple-choice items and why others performed better on the multiple-choice items than on the essay items.
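The abstract reports the share of students whose percentile ranks differ across item types by at least a given threshold. As a rough illustration only, the sketch below shows one way such a share could be computed from per-item-type ability estimates. The function names, the mid-rank tie convention, and the sample ability values are all assumptions made for this example; the abstract does not specify how the authors computed PRs, and the numbers here are not the study's data.

```python
from bisect import bisect_left, bisect_right


def percentile_ranks(scores):
    """Percentile rank of each score within its own list.

    Assumption: mid-rank convention, PR = 100 * (below + 0.5 * equal) / n.
    """
    ordered = sorted(scores)
    n = len(ordered)
    ranks = []
    for s in scores:
        below = bisect_left(ordered, s)            # scores strictly lower than s
        equal = bisect_right(ordered, s) - below   # scores tied with s (including s)
        ranks.append(100.0 * (below + 0.5 * equal) / n)
    return ranks


def share_with_pr_gap(theta_by_type, threshold):
    """Fraction of students whose max-min PR across item types is >= threshold.

    theta_by_type: one list of ability estimates per item type,
    all lists in the same student order.
    """
    pr_by_type = [percentile_ranks(column) for column in theta_by_type]
    n_students = len(theta_by_type[0])
    flagged = 0
    for i in range(n_students):
        prs = [column[i] for column in pr_by_type]
        if max(prs) - min(prs) >= threshold:
            flagged += 1
    return flagged / n_students


# Hypothetical ability estimates for four students (not from the study).
multiple_choice = [-1.0, 0.0, 1.0, 2.0]
completion = [-1.0, 0.0, 1.0, 2.0]
essay = [2.0, 0.0, 1.0, -1.0]

share = share_with_pr_gap([multiple_choice, completion, essay], threshold=20)
print(f"Share of students with a PR gap of 20 or more: {share:.2%}")
```

In this toy input, the first and last students rank very differently on the essay items than on the other two item types, so half of the sample is flagged. With the study's thresholds, the same calculation would be run with `threshold=20` for the unidimensional estimates and `threshold=15` for the multidimensional ones.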

Subject classification: Social Sciences > Education
Cited by
  1. 許婉儀,張惠環,何德華(2023)。對話者之語言能力與評分嚴苛度對印尼語口語評量成績之影響。教育心理學報,55(1),25-46。