Title

英文學科能力測驗選擇題之性別差異與差異試題功能分析

Parallel Title

Gender Differences and Differential Item Functioning on the English GSAT Multiple-Choice Questions

Author

廖彥棻(Yen-Fen Liao)

Keywords

英文科學科能力測驗 ; 性別差異 ; 差異試題功能分析 ; 測驗公平性 ; 結構效度 ; the English GSAT ; gender differences ; differential item functioning analysis ; test fairness ; construct validity

Journal

東吳外語學報

Issue / Publication Date

No. 41 (2015/09/01)

Pages

21 - 59

Language

Traditional Chinese

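Chinese Abstract

The General Scholastic Ability Test (GSAT) is an important and widely taken college entrance examination in Taiwan, so questions about its reliability, validity, and fairness merit further investigation. One factor that may affect the test's validity and fairness is the gender of the test takers: do test takers of different genders differ significantly in test performance? In particular, do individuals of different genders but equal ability differ in their probability of answering a given item correctly, thereby exhibiting differential item functioning (DIF)? In other words, might GSAT items particularly favor test takers of one gender? The existing literature, however, contains few studies of whether gender differences and DIF items exist in the English section of the GSAT. The main purpose of this study is therefore to compare the performance of male and female test takers on the multiple-choice items of the English GSAT for academic year 98, and then to apply Mantel-Haenszel (M-H) analysis and logistic regression to detect DIF, examining each item in turn for gender-related DIF. The results show that female test takers scored slightly higher than male test takers on the English GSAT and that a few items exhibited gender DIF, although not to a degree that constitutes clear item bias.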
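
For orientation, the logistic regression approach to DIF named in the abstract is commonly written in the form below, following Swaminathan and Rogers (1990), which appears in the reference list. The symbols are notation assumed here, not taken from the article: $u$ is the scored response to the studied item, $\theta$ is the matching ability measure (often the total test score), and $g$ is a group indicator for gender. The abstract does not say which matching variable or reference group the study used.

```latex
\ln\frac{P(u = 1)}{1 - P(u = 1)} = \beta_0 + \beta_1 \theta + \beta_2 g + \beta_3 (\theta g)
```

Under this model, $\beta_2 \neq 0$ with $\beta_3 = 0$ points to uniform DIF (one group is favored at every ability level), while $\beta_3 \neq 0$ points to nonuniform DIF (the group difference changes with ability).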

English Abstract

The General Scholastic Ability Test (GSAT) is a widely used college entrance exam in Taiwan. It is therefore critical that its scores be reliable and that the inferences drawn from them be valid and fair. One factor that can influence test validity and fairness is gender. The central questions are whether test scores differ significantly by gender and whether any items favor a particular group. Specifically, if female test takers with the same trait level as their male counterparts have different expected scores on the same item, differential item functioning (DIF) may be present in that item. To date, these issues have received little attention in the English GSAT context. The current study therefore investigates gender differences in multiple-choice scores on the English GSAT for academic year 98 and detects gender DIF items using the Mantel-Haenszel method and logistic regression analysis. The results showed that female test takers scored slightly higher than males on the English GSAT. A few items were flagged as potentially exhibiting gender DIF, but no systematic bias was detected.
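
The Mantel-Haenszel procedure mentioned in the abstract can be illustrated with a minimal Python sketch. This is not the article's code: the function names, the `"M"`/`"F"` group labels, the choice of males as the reference group, and the use of total score (including the studied item) as the matching variable are all assumptions made for illustration. Test takers are stratified by total score, a 2x2 table (group by correct/incorrect) is built per stratum, and the tables are pooled into a common odds ratio, which is then placed on the ETS delta scale.

```python
import math
from collections import defaultdict


def build_tables(responses, groups, item, reference="M"):
    """Build one 2x2 table per total-score stratum for the studied item.

    responses: list of 0/1 answer lists, one per test taker (hypothetical data)
    groups:    list of group labels, same length as responses
    Returns {total_score: [A, B, C, D]} where
      A/B = reference group correct/incorrect,
      C/D = focal group correct/incorrect.
    """
    tables = defaultdict(lambda: [0, 0, 0, 0])
    for answers, group in zip(responses, groups):
        total = sum(answers)                   # matching variable (assumed)
        row = 0 if group == reference else 2   # reference rows first
        col = 0 if answers[item] == 1 else 1   # correct before incorrect
        tables[total][row + col] += 1
    return dict(tables)


def mh_odds_ratio(tables):
    """Mantel-Haenszel common odds ratio across strata.

    tables: iterable of (A, B, C, D) counts. Returns None when the
    estimate is undefined (numerator or denominator is zero).
    """
    num = den = 0.0
    for a, b, c, d in tables:
        n = a + b + c + d
        if n == 0:
            continue
        num += a * d / n
        den += b * c / n
    if num == 0 or den == 0:
        return None
    return num / den


def mh_delta(alpha):
    """ETS delta-scale transform: MH D-DIF = -2.35 * ln(alpha)."""
    return -2.35 * math.log(alpha)
```

A typical call would be `alpha = mh_odds_ratio(build_tables(responses, groups, item=0).values())`. An odds ratio above 1 (a negative delta) means that, at matched ability, the reference group has higher odds of answering the item correctly; a value near 1 suggests no DIF on that item.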

Subject Classification Humanities > Linguistics
Humanities > Foreign Literature
References
  1. Gong, B.(2010).An Analysis of Problems of Taiwanese IELTS Test Candidates in their Speaking Sub-test.Hwa Kang English Journal,16,141-167.
  2. Lee, M.(2011).Gender Difference in EFL Learners' Spoken Discourse-A Case Study.外國語文研究,13,1-31.
  3. 余民寧、謝進昌(2006)。國中基本學力測驗之 DIF 實徵分析:以 91 年度兩次測驗為例。教育學刊,26,241-276。
  4. 邱秀惠(2008)。全民英檢圖片題聽力測驗是否有性別表現上的差異?。嘉南學報,34,406-422。
  5. 盧雪梅(2009)。國中基本學力測驗社會科之性別差異和差別試題功能(DIF)分析。臺東大學教育學報,20(2),31-61。
  6. 盧雪梅、毛國楠(2008)。國中基本學力測驗自然科之性別差異和差別試題功能(DIF)分析。測驗學刊,5(4),725-759。
  7. 蕭偉智、傅家珍(2012)。國中八年級自然科定期評量之性別差別試題功能(DIF)分析。新竹教育大學教育學報,29(2),35-64。
  8. Educational Testing Service. Test and Score Data Summary for TOEFL iBT Tests:January 2013-December 2013 Test Data. Princeton, NJ:Educational Testing Service, 2014
  9. IELTS. Researchers-Test Taker Performance 2012. Retrieved March 12, 2015, from http://www.ielts.org/researchers/analysis-of-test-data/test-taker-performance-2012.aspx
  10. Abbott, M. L.(2007).A Confirmatory Approach to Differential Item Functioning on an ESL Reading Assessment.Language Testing,24(1),7-36.
  11. Ahmadi, A.,Mansoordehghan, S.(2012).Comprehending a Non-text: A Study of Gender-based Differences in EFL Reading Comprehension.Journal of Language Teaching and Research,3(4),761-770.
  12. Ay, S.,Bartan, Özgür S.(2012).The Effect of Topic Interest and Gender on Reading Test Types in a Second Language.The Reading Matrix,12(1),62-79.
  13. Bachman, L. F.,Palmer, A. S.(1996).Language Testing in Practice.Oxford:Oxford University Press.
  14. Bacon, S. M.(1992).The Relationship between Gender, Comprehension, Processing Strategies, and Cognitive and Affective Response in Foreign Language Listening.The Modern Language Journal,76(2),160-178.
  15. Bailey, K. M.(1998).Learning about Language Assessment: Dilemmas, Decisions, and Directions.Boston:Heinle & Heinle.
  16. Bichi, A. A.,Embong, R.,Mamat, M.,Maiwada, D. A.(2015).Comparison of Classical Test Theory and Item Response Theory: A Review of Empirical Studies.Australian Journal of Basic and Applied Sciences,9(7),549-556.
  17. Bügel, K.,Buunk, B. P.(1996).Sex Differences in Foreign Language Text Comprehension: The Role of Interests and Prior Knowledge.The Modern Language Journal,80(1),15-31.
  18. Cheng, H.,Lee, F.,Liou, P.,Chung, W.(2010).Are Female Better Language Learners?-A Cases Study.WHAMPOA-An Interdisciplinary Journal,59,55-72.
  19. Chiu, P. C.(2008).Kansas, U.S.A.,University of Kansas.
  20. Cohen, J.(1988).Statistical Power Analysis for the Behavioral Sciences.Hillsdale, NJ:Lawrence Erlbaum Associates, Inc..
  21. Ebel, R. L.,Frisbie, D. A.(1991).Essentials of Educational Measurement.Englewood Cliffs, NJ:Prentice-Hall, Inc..
  22. Farhady, H.(1982).Measures of Language Proficiency from the Learner's Perspective.TESOL Quarterly,16(1),43-59.
  23. Ferne, T.,Rupp, A.(2007).A Synthesis of 15 Years of Research on DIF in Language Testing: Methodological Advances, Challenges, and Recommendations.Language Assessment Quarterly: An International Journal,4,113-148.
  24. Fidalgo, A. M.,Ferreres, D.,Muñiz, J.(2004).Utility of the Mantel-Haenszel Procedure for Detecting Differential Item Functioning in Small Samples.Educational and Psychological Measurement,64,925-936.
  25. Holland, P. W.(1985).On the Study of Differential Item Difficulty without IRT.Proceedings of the 27th Annual Conference of the Military Testing Association,San Diego, CA:
  26. Holland, P. W.(Ed.),Wainer, H.(Ed.)(1993).Differential Item Functioning.Hillsdale, NJ:Lawrence Erlbaum.
  27. Holland, P. W.,Thayer, D.(1986).Differential Item Performance and the Mantel-Haenszel Procedure.67th annual meeting of the American Educational Research Association,San Francisco, CA:
  28. Hopkins, K. D.(1998).Educational and Psychological Measurement and Evaluation.Needham Heights, MA:Allyn & Bacon.
  29. James, C. L.(2010).Do Language Proficiency Test Scores Differ by Gender?.TESOL Quarterly,44(2),387-398.
  30. Jodoin, M. G.,Gierl, M. J.(2001).Evaluating Type I Error and Power Rates Using an Effect Size Measure with the Logistic Regression Procedure for DIF Detection.Applied Measurement in Education,14(4),329-349.
  31. Kamata, A.,Vaughn, B. K.(2004).An Introduction to Differential Item Function Analysis.Learning Disabilities: A Contemporary Journal,2(2),49-69.
  32. Kondratek, B.,Grudniewska, M.(2014).Comparison of Mantel-Haenszel with IRT Procedures for DIF detection and Effect Size Estimation for Dichotomous Items.EDUKACJA: An Interdisciplinary Approach,5(130),92-111.
  33. Kunnan, A. J.(1992).An Investigation of a Criterion-referenced Test Using G-theory, and Factor and Cluster Analyses.Language Testing,9,30-49.
  34. Lin, J.,Wu, F.(2003).Differential Performance by Gender in Foreign Language Testing.Annual Meeting of the National Council on Measurement in Education,Chicago, IL,:
  35. Mantel, N.,Haenszel, W.(1959).Statistical Aspects of the Analysis of Data from Retrospective Studies of Disease.Journal of the National Cancer Institute,22,719-748.
  36. Pae, T.-I.(2012).Causes of Gender DIF on an EFL Language Test: A Multiple-data Analysis over Nine Years.Language Testing,29(4),533-554.
  37. Pae, T.-I.(2004).Gender Effect on Reading Comprehension with Korean EFL Learners.System,32,265-281.
  38. Pae, T.-I.,Park, G.-P.(2006).Examining the Relationship between Differential Item Functioning and Differential Test Functioning.Language Testing,23(4),475-496.
  39. Park, G.-P.(2008).Differential Item Functioning on an English Listening Test across Gender.TESOL Quarterly,42(1),115-123.
  40. Park, T.(2006).Detecting DIF across Differential Language and Gender Groups in the MELAB Essay Test Using the Logistic Regression Method.Spaan Fellow Working Papers in Second or Foreign Language Assessment,4,87-96.
  41. Pellegrino, J. W.(Ed.)(2001).Knowing What Students Know: The Science and Design of Educational Assessment.Washington, DC:National Academies Press.
  42. Penfield, R. D.(2007).An Approach for Categorizing DIF in Polytomous Items.Applied Measurement in Education,20(3),335-355.
  43. Polat, N.(2010).Gender Differences in Motivation and L2 Accent Attainment: An Investigation of Young Kurdish Learners of Turkish.The Language Learning Journal,39(1),19-41.
  44. Rao, Z.(2005).Gender, Academic Major, and Chinese Students' Use of Language Learning Strategies: Social and Educational Perspectives.The Journal of Asia TEFL,2(3),115-138.
  45. Reckase, M. D.(1979).Unifactor Latent Trait Models Applied to Multifactor Tests: Results and Implications.Journal of Educational Statistics,4,207-230.
  46. Ryan, K. E.,Bachman, L. F.(1992).Differential Item Functioning on Two Tests of EFL Proficiency.Language Testing,9(1),12-29.
  47. Sireci, S. G.,Allalouf, A.(2003).Appraising Item Equivalence across Multiple Languages and Cultures.Language Testing,20,148-166.
  48. Song, X.,Cheng, L.,Klinger, D.(2015).DIF Investigations across Groups of Gender and Academic Background in a Large-scale High-stakes Language Test.Papers in Language Testing and Assessment,4(1),97-124.
  49. Swaminathan, H.,Rogers, J.(1990).Detecting Differential Item Functioning Using Logistic Regression Procedures.Journal of Educational Measurement,27,361-370.
  50. Takala, S.,Kaftandjieva, F.(2000).Test Fairness: A DIF Analysis of an L2 Vocabulary Test.Language Testing,17,323-340.
  51. Tung, H. C.(2008).Kaohsiung, Taiwan, R.O.C.,National Kaohsiung Normal University.
  52. Uiterwijk, H.,Vallen, T.(2005).Linguistic Sources of Item Bias for Second Generation Immigrants in Dutch Tests.Language Testing,22,211-234.
  53. Wainer, H.,Lukhele, R.(1997).How Reliable are TOEFL Scores?.Educational and Psychological Measurement,57(5),741-759.
  54. Wang, S.(2006).Working Papers in Second or Foreign Language Assessment 4.Ann Arbor, MI:University of Michigan English Language Institute.
  55. Wu, R. W.(2009).Differential Item Functioning in Gender and Living Background Groups in the GEPT.Proceedings of the 13th International Conference on Language Education,Kaohsiung:
  56. Young, D. J.,Oxford, R.(1997).A Gender-related Analysis of Strategies Used to Process Input in the Native Language and a Foreign Language.Applied Language Learning,8,43-73.
  57. Yu, C. C.(2010).Meeting Challenges and Upgrading Tests for College Entrance.12th Academic Forum on English Language Testing in Asia
  58. Zumbo, B. D.(1999).A Handbook on the Theory and Methods of Differential Item Functioning (DIF): Logistic Regression Modeling as a Unitary Framework for Binary and Likert-type (Ordinal)Item Scores.Ottawa, Ontario, Canada:Directorate of Human Resources Research and Evaluation, Department of National Defense.
  59. 余民寧(2009)。試題反應理論(IRT)及其應用。臺北:心理出版社。
  60. 余光雄(2008)。探究台灣大專入學英文科試題與語言能力測試理論及高中/職的英文教學之關係。研習資訊雙月刊,25(2),11-26。
  61. 周思余(2009)。碩士論文(碩士論文)。中正大學。
  62. 林秀慧(2009)。九十八學年度學科能力測驗試題分析英文考科。臺北:大學入學考試中心。
  63. 林奕宏、林世華(2004)。國小高年級數學科成就測驗中與性別有關的DIF現象。臺東大學教育學報,15(1),67-96。
  64. 夏林清、蕭次融、劉澄桂(2005)。九十至九十三學年度學科能力測驗、指定科目考試試題差別功能檢核計畫。臺北:大學入學考試中心。
  65. 國立臺灣師範大學心理與教育測驗研究發展中心(2009)。九十五~九十七年國中基本學力測驗性別 DIF 分析。飛揚,55,20-23。
  66. 陳秀娟(2008)。碩士論文(碩士論文)。臺灣師範大學。
  67. 管美蓉(2010)。大學入學考試問題回顧—從輿論觀點分析(1962-2000)。考試學刊,8,147-173。
  68. 盧珍予(2002)。碩士論文(碩士論文)。政治大學。
  69. 盧雪梅(2007)。國民中學學生基本學力測驗國文科和英語科成就性別差異和性別差別試題功能(DIF)分析。教育研究與發展期刊,3(4),79-111。
  70. 盧雪梅(1999)。差別試題功能(DIF)的檢定方法。台北市立師範學院學報,30,149-166。
  71. 蕭次融、劉澄桂、連秋華(2001)。學科能力測驗試題差別功能分析。臺北:大學入學考試中心。
  72. 藍偉華(2006)。碩士論文(碩士論文)。臺灣師範大學。
Cited By
  1. 陳承德、孫國瑋、施慶麟(2018)。DIF成因之初探:試題特徵與差異試題功能之關聯。教育心理學報,50(2),167-188。
  2. 鄧鈞文,陳俊瑋,林仁傑(2019)。數學成就測驗的性別差異試題功能(DIF)現象:以臺灣學生學習成就評量資料為例。教育科學期刊,18(1),71-91。