TIMSS & PIRLS International Study Center. (2009). TIMSS 2007 international database and user guide. Retrieved June 12, 2010, from http://timss.bc.edu/TIMSS2007/idb_ug.html
American Educational Research Association,American Psychological Association,National Council on Measurement in Education(1999).Standards for educational and psychological testing.Washington, DC:American Educational Research Association.
Beaton, A. E.(Ed.)(1987).,Princeton, NJ:Educational Testing Service.
Binder, D. A.(1983).On the variances of asymptotically normal estimators from complex surveys.International Statistical Review,51(3),279-292.
Binici, S.(2008).Miami, FL,Florida State University.
Brennan, R. L.(ed.)(2006).Educational measurement (4th ed.).Westport, CT:Greenwood.
Camilli, G.,Shepard, L.(1994).MMSS volume 4: Methods for identifying biased test items.Thousand Oaks, CA:Sage.
Clauser, B.,Mazor, K.,Hambleton, R. K.(1993).The effects of purification of the matching criterion on the identification of DIF using the Mantel-Haenszel procedure.Applied Measurement in Education,6(4),269-279.
Cochran, W. G.(1977).Sampling techniques (3rd ed.).New York:John Wiley & Sons.
de Ayala, R.(2009).The theory and practice of item response theory.New York:Guilford Press.
De Boeck, P.(ed.),Wilson, M.(ed.)(2004).Explanatory item response models: A generalized linear and nonlinear approach.New York:Springer.
Embretson, S. E.,Reise, S. P.(2000).Item response theory for psychologists.Mahwah, NJ:Erlbaum.
Ferne, T.,Rupp, A. A.(2007).A synthesis of 15 years of research on DIF in language testing: Methodological advances, challenges, and recommendations.Language Assessment Quarterly,4(2),1-36.
Frey, A.,Hartig, J.,Rupp, A. A.(2009).An NCME instructional module on booklet designs in large-scale assessments of student achievement.Educational Measurement: Issues and Practice,28(3),39-53.
Goldstein, H.(2003).Multilevel statistical models.London:Arnold.
Hamilton, L. S.(1999).Detecting gender-based differential item functioning on a constructed-response science test.Applied Measurement in Education,12(3),211-235.
Hamilton, L. S.,Snow, R. E.(1998).,Los Angeles:National Center for Research on Evaluation, Standards, and Student Testing, University of California.
Hauger, J. B.,Sireci, S. G.(2008).Detecting differential item functioning across examinees tested in their dominant language and examinees tested in a second language.International Journal of Testing,8(3),237-250.
Holland, P. W.(ed.),Wainer, H.(ed.)(1993).Differential item functioning.Hillsdale, NJ:Lawrence Erlbaum Associates.
Kalton, G.(1983).Models in the practice of survey sampling.International Statistical Review,51(2),175-188.
Kamata, A.(2001).Item analysis by the hierarchical generalized linear model.Journal of Educational Measurement,38(1),79-93.
Kamata, A.,Binici, S.(2003).Random-effect DIF analysis via hierarchical generalized linear model.The International Meeting of the Psychometric Society (IMPS),Sardinia, Italy:
Kim, W.(2003).Pennsylvania, PA,Pennsylvania State University.
Lomax, R. G.(2007).Statistical concepts: A second course (3rd ed.).Mahwah, NJ:Erlbaum.
Mapuranga, R.,Dorans, N.,Middleton, K.(2008).,Princeton, NJ:ETS.
Martin, M. O.(ed.),Kelly, D. L.(ed.)(1996).,Chestnut Hill, MA:Boston College.
McLachlan, G. J.,Peel, D.(2000).Finite mixture models.New York:Wiley.
Mislevy, R. J.(1991).Randomization-based inference about latent variables from complex samples.Psychometrika,56(2),177-196.
Mislevy, R. J.,Beaton, A.,Kaplan, B.,Sheehan, K.(1992).Estimating population characteristics from sparse matrix samples of item responses.Journal of Educational Measurement,29(2),133-161.
Mislevy, R.,Johnson, E.,Muraki, E.(1992).Scaling procedures in NAEP.Journal of Educational Statistics,17(2),131-154.
Muthén, L. K.,Muthén, B. O.(2007).Mplus.Los Angeles:Muthen, L. K..
Osterlind, S.(2009).Differential item functioning (2nd ed.).Thousand Oaks:Sage.
Pan, T.(2008).Ann Arbor, MI,Michigan State University.
Pfeffermann, D.,Skinner, C. J.,Holmes, D. J.,Goldstein, H.,Rasbash, J.(1998).Weighting for unequal selection probabilities in multilevel models.Journal of the Royal Statistical Society Series B,60,23-40.
Prowker, A.,Camilli, G.(2007).Looking beyond the overall scores of NAEP assessments: Applications of generalized linear mixed modeling for exploring value-added item difficulty effects.Journal of Educational Measurement,44(1),69-87.
Rao, C. R.(ed.),Sinharay, S.(ed.)(2006).Handbook of statistics, Vol. 26: Psychometrics.North Holland:Elsevier.
Raudenbush, S. W.,Bryk, A. S.(2002).Hierarchical linear models: Applications and data analysis methods.Thousand Oaks:Sage.
Raudenbush, S. W.,Bryk, A. S.,Cheong, Y. F.,Congdon, R.,du Toit, M.(2004).HLM 6: Hierarchical linear and nonlinear modeling.Lincolnwood, IL:Scientific Software International.
Rubin, D. B.(1987).Multiple imputation for nonresponse in sample surveys.New York:John Wiley.
Rutkowski, L.,Gonzalez, E.,Joncas, M.,von Davier, M.(2010).Secondary analyses of large-scale assessment data.Educational Researcher,39(2),142-151.
Shealy, R.,Stout, W. F.(1993).A model-based standardization approach that separates true bias/DIF from group differences and detects test bias/DTF as well as item bias/DIF.Psychometrika,58(2),159-194.
Skrondal, A.,Rabe-Hesketh, S.(2004).Generalized latent variable modeling: Multilevel, longitudinal, and structural equation models.Boca Raton, FL:Chapman & Hall/CRC.
Swaminathan, H.,Rogers, H. J.(1990).Detecting differential item functioning using logistic regression procedures.Journal of Educational Measurement,27(4),361-370.
von Davier, M.,Gonzalez, E.,Mislevy, R. J.(2010).,Unpublished
Wainer, H.(ed.),Braun, H. I.(ed.)(1988).Test validity.Hillsdale, NJ:Lawrence Erlbaum Associates.
Wu, M. L.,Adams, R. J.,Wilson, M. R.,Haldane, S. A.(2007).ACER ConQuest version 2.0: generalised item response modelling software [Software program].Camberwell:ACER Press.
Zenisky, A.,Hambleton, R.,Robin, F.(2003).DIF detection and interpretation in large-scale science assessments: Informing item writing practices.Educational Assessment,9(1/2),61-78.
Zenisky, A.,Hambleton, R.,Robin, F.(2003).Detection of differential item functioning in large scale state tests: A study evaluating a two-stage approach.Educational and Psychological Measurement,63(1),51-64.
Zhang, Y.,Dorans, N.,Matthews-Lopez, J.(2005).,Princeton, NJ:ETS.
Zumbo, B. D.(1999).A handbook on the theory and methods of differential item functioning (DIF).Ottawa, Canada:Directorate of Human Resources Research and Evaluation, Department of National Defense.