Hanson, B. A., Zeng, L., & Chien, Y. (2004). PIE: IRT true and observed scoring equating for dichotomously scored tests [Computer software]. Retrieved March 10, 2011, from http://www.education.uiowa.edu/casma
Hanson, B. A., Zeng, L., & Chien, Y. (2004). ST: A computer program for IRT scale transformation [Computer software]. Retrieved March 10, 2011, from http://www.education.uiowa.edu/casma
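Department of Statistics, Ministry of Education (2010). Statistics on junior high school students, teachers, and staff for academic year 99 (2010) [in Chinese]. Retrieved May 23, 2011, from http://www.edu.tw/statistics/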
Brennan, R. L., & Kolen, M. J. (1987). Some practical issues in equating. Applied Psychological Measurement, 11, 279-290.
Cook, L. L., & Petersen, N. S. (1987). Problems related to the use of conventional and item response theory equating methods in less than optimal circumstances. Applied Psychological Measurement, 11, 225-244.
Crocker, L., & Algina, J. (1986). Introduction to classical and modern test theory. New York, NY: Holt, Rinehart and Winston.
Dorans, N. J., & Holland, P. W. (2000). Population invariance and equatability of tests: Basic theory and the linear case. Journal of Educational Measurement, 37, 281-306.
Dorans, N. J., Holland, P. W., Thayer, D. T., & Tateneni, K. (2002). Invariance of score linking across gender groups for three Advanced Placement Program exams. Paper presented at the annual meeting of the National Council on Measurement in Education, New Orleans, LA.
Dorans, N. J., Liu, J., & Hammond, S. (2008). Anchor test type and population invariance: An exploration across subpopulations and test administrations. Applied Psychological Measurement, 32, 81-97.
Gulliksen, H. (1950). Theory of mental tests. New York, NY: John Wiley & Sons.
Haebara, T. (1980). Equating logistic ability scales by a weighted least squares method. Japanese Psychological Research, 22, 144-149.
Hambleton, R. K., & Swaminathan, H. (1985). Item response theory: Principles and applications. Boston, MA: Kluwer.
Hanson, B. A., & Béguin, A. A. (2002). Obtaining a common scale for item response theory item parameters using separate versus concurrent estimation in the common-item equating design. Applied Psychological Measurement, 26, 3-24.
Harris, D. J. (1993). Practical issues in equating. Paper presented at the annual meeting of the American Educational Research Association, Atlanta, GA.
Harris, D. J., & Crouse, J. D. (1993). A study of criteria used in equating. Applied Measurement in Education, 6, 195-240.
Holland, P. W., & Rubin, D. B. (Eds.). (1982). Test equating. New York, NY: Academic Press.
Kolen, M. J., & Brennan, R. L. (2004). Test equating, scaling, and linking: Methods and practices. New York, NY: Springer-Verlag.
Liu, M., & Holland, P. W. (2008). Exploring population sensitivity of linking functions across three law school admission test administrations. Applied Psychological Measurement, 32, 27-44.
Lord, F. M. (1980). Application of item response theory to practical testing problems. Hillsdale, NJ: Lawrence Erlbaum Associates.
Lord, F. M., & Wingersky, M. S. (1984). Comparing IRT true-score and equipercentile observed score "equatings". Applied Psychological Measurement, 8, 452-461.
Loyd, B. H., & Hoover, H. D. (1980). Vertical equating using the Rasch model. Journal of Educational Measurement, 17, 179-193.
Marco, G. L. (1977). Item characteristic curve solutions to three intractable testing problems. Journal of Educational Measurement, 14, 139-160.
Marco, G., Petersen, N., & Stewart, E. (1979). A test of the adequacy of curvilinear score equating models. Paper presented at the Computerized Adaptive Testing Conference, Minneapolis, MN.
Petersen, N. S., Cook, L. L., & Stocking, M. L. (1983). IRT versus conventional equating methods: A comparative study of scale stability. Journal of Educational Statistics, 8(2), 135-156.
Skaggs, G. (1990). Assessing the utility of item response theory models for test equating. Paper presented at the annual meeting of the National Council on Measurement in Education, Boston, MA.
Skaggs, G., & Lissitz, R. W. (1986). IRT test equating: Relevant issues and a review of recent research. Review of Educational Research, 56(4), 495-529.
Stocking, M. L., & Lord, F. M. (1983). Developing a common metric in item response theory. Applied Psychological Measurement, 7(2), 201-211.
von Davier, A. A., & Wilson, C. (2008). Investigating the population sensitivity assumption of item response theory true-score equating across two subgroups of examinees and two test formats. Applied Psychological Measurement, 32, 11-26.
Yang, W.-L. (2004). Sensitivity of linkings between AP multiple-choice scores and composite scores to geographical region: An illustration of checking for population invariance. Journal of Educational Measurement, 41, 33-41.
Yang, W.-L., Dorans, N. J., & Tateneni, K. (2002). Sample selection effect on AP multiple-choice score to composite score scaling. Paper presented at the annual meeting of the National Council on Measurement in Education, New Orleans, LA.
Yang, W.-L., & Gao, R. (2008). Invariance of score linkings across gender groups for forms of a testlet-based College-Level Examination Program examination. Applied Psychological Measurement, 32, 45-61.
Yi, Q., Harris, D. J., & Gao, X. (2008). Invariance of equating functions across different subgroups of examinees taking a science achievement test. Applied Psychological Measurement, 32, 62-80.
Zimowski, M. F., Muraki, E., Mislevy, R. J., & Bock, R. D. (2003). BILOG-MG [Computer software]. Chicago, IL: Scientific Software International.