English Abstract
|
This study examined how different ability estimation methods perform under a unidimensional three-parameter logistic model. Many studies have shown that incorporating students' background variables, such as gender, age, race, and grade level, into the estimation process can yield unbiased and more precise ability estimates. Ability estimation was compared across three estimation methods (the expected a posteriori (EAP) method, the EAP method with ancillary variables, and the plausible value method) and two test lengths (15 and 30 items). In addition, the usefulness of the estimation methods was examined by applying them to the Taiwan Assessment of Student Achievement 2010 eighth-grade mathematics test. The results showed that the EAP method with ancillary variables and the plausible value method outperformed the EAP method in estimating group means, and that the plausible value method outperformed the other methods in estimating group standard deviations. Ability estimates became more accurate as test length increased. In the real-data application, the EAP method with ancillary variables and the plausible value method produced similar estimates of group means.
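The three estimation methods compared above can be illustrated with a minimal sketch. It is not the study's implementation: it assumes known item parameters, a normal prior evaluated on a fixed quadrature grid, and a hypothetical group-specific prior mean standing in for the ancillary variable. All function names, the grid range, and the item parameters are illustrative assumptions.

```python
import numpy as np

def p_3pl(theta, a, b, c):
    """3PL probability of a correct response; returns shape (grid points, items)."""
    z = a[None, :] * (theta[:, None] - b[None, :])
    return c[None, :] + (1.0 - c[None, :]) / (1.0 + np.exp(-z))

def posterior_on_grid(resp, a, b, c, prior_mean=0.0, prior_sd=1.0, n_points=61):
    """Normalized posterior weights over an equally spaced ability grid (assumed range -4..4)."""
    theta = np.linspace(-4.0, 4.0, n_points)
    p = np.clip(p_3pl(theta, a, b, c), 1e-12, 1.0 - 1e-12)
    log_lik = (resp * np.log(p) + (1 - resp) * np.log(1.0 - p)).sum(axis=1)
    log_prior = -0.5 * ((theta - prior_mean) / prior_sd) ** 2
    log_post = log_lik + log_prior
    w = np.exp(log_post - log_post.max())  # subtract the max for numerical stability
    return theta, w / w.sum()

def eap(resp, a, b, c, **prior):
    """Posterior mean and posterior standard deviation of ability."""
    theta, w = posterior_on_grid(resp, a, b, c, **prior)
    mean = float((theta * w).sum())
    sd = float(np.sqrt((((theta - mean) ** 2) * w).sum()))
    return mean, sd

def plausible_values(resp, a, b, c, n_draws=5, rng=None, **prior):
    """Random draws from the grid approximation of the posterior."""
    rng = np.random.default_rng() if rng is None else rng
    theta, w = posterior_on_grid(resp, a, b, c, **prior)
    return rng.choice(theta, size=n_draws, p=w)

# Illustrative 15-item test with made-up item parameters.
rng = np.random.default_rng(0)
a = rng.uniform(0.8, 2.0, 15)   # discrimination
b = rng.normal(0.0, 1.0, 15)    # difficulty
c = np.full(15, 0.2)            # guessing
resp = (rng.random(15) < p_3pl(np.array([1.0]), a, b, c)[0]).astype(int)

print("EAP, standard normal prior:", eap(resp, a, b, c))
print("EAP, group prior mean 0.5 :", eap(resp, a, b, c, prior_mean=0.5))
print("Plausible values          :", plausible_values(resp, a, b, c, rng=rng, prior_mean=0.5))
```

In this sketch the ancillary information enters only through `prior_mean`. An operational analysis would typically estimate the prior from a regression of ability on the background variables rather than fix it by hand.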
|