Akaike, H.(1987).Factor analysis and AIC.Psychometrika,52,317-332.
American Educational Research Association, American Psychological Association, and National Council on Measurement in Education(1999).Standards for educational and psychological testing.Washington, DC:American Educational Research Association.
Angoff, W. H.,R. L. Thorndike (Ed.)(1971).Educational measurement.Washington, DC:American Council on Education.
Basford, K. E.,McLachlan, G. J.(1985).Likelihood estimation with normal mixture models.Applied Statistics,34,282-289.
Buckendahl, C. W.,Smith, R. W.,Impara, J. C.,Plake, B. S.(2002).A comparison of Angoff and bookmark standard setting methods.Journal of Educational Measurement,39(3),253-263.
Campbell, D. T.,Fiske, D. W.(1959).Convergent and discriminant validation by the multitrait-multimethod matrix.Psychological Bulletin,56,81-105.
Cizek, G. J.,G. J. Cizek (Ed.)(2001).Setting performance standards: Concepts, methods, and perspectives.Mahwah, NJ:Lawrence Erlbaum Associates.
Crocker, L.,Algina, J.(1986).Introduction to classical and modern test theory.NY:Holt, Rinehart and Winston.
Ebel, R. L.(1972).Essentials of educational measurement.NJ:Prentice-Hall.
Eckhout, T. J.,Plake, B. S.,Smith, D. L.,Larsen, A.(2007).Aligning a state's alternative standards to regular core content standards in reading and mathematics: A case study.Applied Measurement in Education,20(1),79-100.
Everitt, B. S.,Hand, D. J.(1981).Finite mixture distributions.London:Chapman and Hall.
Flanagan, J. C.,E. F. Lindquist (Ed.)(1951).Educational measurement.Washington, DC:American Council on Education.
Green, D. R.,Trimble, C. S.,Lewis, D. M.(2003).Interpreting the results of three different standard setting procedures.Educational Measurement: Issues and Practice,22(1),22-32.
Hambleton, R. K.,G. J. Cizek (Ed.)(2001).Setting performance standards: Concepts, methods, and perspectives.Mahwah, NJ:Lawrence Erlbaum Associates.
Hambleton, R. K.,Swaminathan, H.(1985).Item response theory: Principles and applications.Massachusetts:Kluwer Academic.
Huynh, H.(2006).A clarification on the response probability criterion RP67 for standard settings based on bookmark and item mapping.Educational Measurement: Issues and Practice,25(2),19-20.
Huynh, H.(1998).On score locations of binary and partial credit items and their applications to item mapping and criterion referenced interpretation.Journal of Educational and Behavioral Statistics,23,35-56.
Jaeger, R. M.(1982).An iterative structured judgment process for establishing standards on competency tests: Theory and application.Educational Evaluation and Policy Analysis,4,461-476.
Jaeger, R. M.,R. L. Linn (Ed.)(1989).Educational measurement.NY:American Council on Education/Macmillan.
Kaplan, D.(1995).The impact of BIB spiraling-induced missing data patterns on goodness-of-fit tests in factor analysis.Journal of Educational and Behavioral Statistics,20(1),69-82.
Karantonis, A.,Sireci, S. G.(2006).The bookmark standard-setting method: A literature review.Educational Measurement: Issues and Practice,25(1),4-12.
Koffler, S. L.(1980).A comparison of approaches for setting proficiency standards.Journal of Educational Measurement,17,167-178.
Koski, W. S.,Weis, H. A.(2004).What educational resources do students need to meet California's educational content standards? A textual analysis of California's educational content standards and their implications for basic educational conditions and resources.Teachers College Record,106(10),1907-1935.
Lewis, D. M.,Mitzel, H. C.,Green, D. R.(1996).Standard setting: A bookmark approach. Symposium presented at the Council of Chief State School Officers National Conference on Large Scale Assessment,Phoenix, AZ.
Lewis, D. M.,Mitzel, H. C.,Green, D. R.,Patz, R. J.(1999).The bookmark standard setting procedure.Monterey, CA:McGraw-Hill.
Lindsay, B. G.(1995).Mixture models: Theory, geometry, and applications.Hayward, CA:Institute of Mathematical Statistics.
Linn, R. L.(2000).Assessments and accountability.Educational Researcher,29(2),4-16.
Linn, R. L.(2003).The bookmark standard setting procedure: strength and weakness. Canada.Language Learning,52(3),537-564.
The NAEP writing achievement levels
Perie, M.(2005).Angoff and bookmark methods.Workshop presented at the annual meeting of the National Council on Measurement in Education,Montreal, Canada.
Reckase, M. D.(2006).A conceptual framework for a psychometric theory for standard setting with examples of its use for evaluating the functioning of two standard setting methods.Educational Measurement: Issues and Practice,25(2),4-18.
Shrout, P. E.(1998).Measurement reliability and agreement in psychiatry.Statistical Methods in Medical Research,7,301-317.
Sim, J.,Wright, C. C.(2005).The Kappa statistic in reliability studies: Use, interpretation, and sample size requirements.Physical Therapy,85(3),257-268.
Skaggs, G.,Tessema, A.(2001).Item disordinality with the bookmark standard setting procedure.Paper presented at the 2001 annual meeting of the National Council on Measurement in Education,Seattle, WA.
Swaminathan, H.,Hambleton, R. K.,Algina, J.(1974).Reliability of criterion referenced tests: A decision theoretic formulation.Journal of Educational Measurement,11,263-268.
U. S. Department of Education(1996).Goals 2000: A progress report.
Vermunt, J. K.,Magidson, J.(2005).Technical guide for Latent GOLD 4.0: Basic and advanced.Belmont, MA:Statistical Innovations.
Overseeing the nation's report card: The creation and evolution of the national assessment governing board (NAGB)
Yin, P.,Schulz, E. M.(2005).A comparison of cut scores and cut score variability from Angoff-based and Bookmark-based procedures in standard setting.Paper presented at the annual meeting of the National Council on Measurement in Education,Montreal, Canada.