


A Simulation Study on Using MIMIC Model to Assess the Accuracy of Differential Item Functioning




蔡良庭(Liang-Ting Tsai);楊志堅(Chih-Chien Yang);王文中(Wen-Chung Wang);施慶麟(Ching-Lin Shih)


DIF ; MIMIC ; WLSMV ; Robust chi-square difference test




Vol. 55, No. 2 (2008/08/01)


pp. 287-311




This study examines how error rates, correct detection rates, and sample size are related when the multiple-indicators, multiple-causes (MIMIC) model is used to detect differential item functioning (DIF) in groups of testlet-type items under a range of simulated conditions. The MIMIC model, proposed by Jöreskog and Goldberger (1975), can be used to evaluate direct and indirect relationships among dependent, independent, and latent variables. It can also be extended to compare measurement or structural differences between samples. This study evaluates the overall performance of the robust chi-square difference test when it is used to assess the fit of MIMIC models. The simulation results show that, when MIMIC models are used to detect DIF between groups, the detection accuracy of the robust chi-square difference test increases as the test length, the magnitude of the item difficulty difference, and the proportion of DIF items in the test increase.
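
For reference, the MIMIC approach to DIF detection is commonly written as shown below. This is a minimal sketch of the standard single-factor formulation for dichotomous items, not necessarily the exact specification used in this study. All symbols ($y_i^{*}$, $\lambda_i$, $\eta$, $z$, $\beta_i$, $\gamma$, $\tau_i$) are notational assumptions and do not appear in the abstract.

```latex
% Assumed notation (not taken from the abstract):
%   z         observed group indicator (e.g., 0 = reference, 1 = focal)
%   \eta      latent trait
%   y_i^{*}   latent response underlying the dichotomous item y_i
%   \tau_i    threshold of item i
\begin{align}
  y_i^{*} &= \lambda_i \eta + \beta_i z + \varepsilon_i, \\
  \eta    &= \gamma z + \zeta, \\
  y_i     &= \begin{cases}
               1 & \text{if } y_i^{*} > \tau_i, \\
               0 & \text{otherwise.}
             \end{cases}
\end{align}
% Uniform DIF in item i corresponds to \beta_i \neq 0.
```

In this sketch, $\gamma$ captures the group difference in the latent mean and $\beta_i$ is the direct effect of group membership on item $i$. Testing item $i$ for uniform DIF then amounts to comparing the nested model with $\beta_i = 0$ against the model in which $\beta_i$ is free. Under WLSMV estimation the simple difference of the two chi-square values does not follow a chi-square distribution, which is why a robust difference test is used for this comparison.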

Subject classification: Social Sciences > Psychology
Social Sciences > Education
  1. Asparouhov, T., & Muthén, B. (2006). Robust chi square difference testing with mean and variance adjusted test statistics. Mplus Web Notes, No. 10.
  2. Bollen, K. A. (1989). Structural equations with latent variables. New York: John Wiley & Sons.
  3. Hanson, B. A., & Béguin, A. A. (2002). Obtaining a common scale for item response theory item parameters using separate versus concurrent estimation in the common-item equating design. Applied Psychological Measurement, 26(1), 3-24.
  4. Byrne, B. M. (2001). Structural equation modeling with AMOS: Basic concepts, applications, and programming. Mahwah, NJ: Lawrence Erlbaum Associates.
  5. Doolittle, A. E., & Cleary, T. A. (1987). Gender-based differential item performance in mathematics achievement items. Journal of Educational Measurement, 24, 157-166.
  6. Embretson, S. E., & Reise, S. P. (2000). Item response theory for psychologists. Mahwah, NJ: Lawrence Erlbaum Associates.
  7. Finch, H. (2005). The MIMIC model as a method for detecting DIF: Comparison with Mantel-Haenszel, SIBTEST, and the IRT likelihood ratio. Applied Psychological Measurement, 29(4), 278-295.
  8. Glöckner-Rist, A., & Hoijtink, H. (2003). The best of both worlds: Factor analysis of dichotomous data using item response theory and structural equation modeling. Structural Equation Modeling, 10(4), 544-565.
  9. Hambleton, R. K., & Swaminathan, H. (1985). Item response theory: Principles and applications. Hingham, MA: Kluwer-Nijhoff.
  10. Harris, A. M., & Carlton, S. T. (1993). Patterns of gender differences on mathematics items on the SAT. Applied Measurement in Education, 6, 137-151.
  11. Holman, R., Glas, C. A. W., & de Haan, R. J. (2003). Power analysis in randomized clinical trials based on item response theory. Controlled Clinical Trials, 24, 390-410.
  12. Jöreskog, K. G., & Goldberger, A. S. (1975). Estimation of a model with multiple indicators and multiple causes of a single latent variable. Journal of the American Statistical Association, 70, 631-639.
  13. Lane, S., Wang, N., & Magone, M. (1996). Gender-related differential item functioning on a middle-school mathematics performance assessment. Educational Measurement: Issues and Practice, 15(4), 121-127.
  14. MacIntosh, R., & Hashim, S. (2003). Variance estimation for converting MIMIC model parameters to IRT parameters in DIF analysis. Applied Psychological Measurement, 27(5), 372-379.
  15. Muthén, B. O. (1985). A method for studying the homogeneity of test items with respect to other relevant variables. Journal of Educational Statistics, 10, 121-132.
  16. Muthén, B. O. (1988). In H. Wainer & H. Braun (Eds.), Test validity. Hillsdale, NJ: Lawrence Erlbaum Associates.
  17. Muthén, B. O., Kao, C., & Burstein, L. (1991). Instructionally sensitive psychometrics: An application of a new IRT-based detection technique to mathematics achievement test items. Journal of Educational Measurement, 28, 1-22.
  18. Muthén, B. O., & Lehman, J. (1985). Multiple group IRT modeling: Applications to item bias analysis. Journal of Educational Statistics, 10, 133-142.
  19. Muthén, B., du Toit, S. H. C., & Spisic, D. (1997). Psychometrika.
  20. Muthén, L. K., & Muthén, B. O. (2004). Mplus user's guide. Los Angeles: Muthén & Muthén.
  21. O'Neill, K. A., & McPeek, W. M. (1993). In P. W. Holland & H. Wainer (Eds.), Differential item functioning. Mahwah, NJ: Lawrence Erlbaum Associates.
  22. Stocking, M. L., & Lord, F. M. (1983). Developing a common metric in item response theory. Applied Psychological Measurement, 7, 201-210.
  23. Takane, Y., & de Leeuw, J. (1987). On the relationship between item response theory and factor analysis of discretized variables. Psychometrika, 52, 393-408.
  24. Wang, W. C., & Yeh, Y. L. (2003). Effects of anchor item methods on differential item functioning detection with the likelihood ratio test. Applied Psychological Measurement, 27, 479-498.
  25. 林奕宏, & 林世華 (2004). Gender-related DIF in a mathematics achievement test for upper elementary school grades [in Chinese]. 台東大學教育學報, 15(1), 67-96.