Vol. 1, No. 2 - January 1997
Determining Alignment of Expectations and Assessments in Mathematics and Science Education 1
By Norman L. Webb
Many states and school districts are making concerted efforts to boost student achievement in mathematics and science. These are not simple face lifts, but attempts to develop deep, lasting changes in how students learn these critical subjects.
This Brief is intended for those who seek to improve student learning by creating coherent systems of expectations and assessments in states and districts. Other potential audiences are those who study reform, make decisions about reform, and are affected by reform. The intention of this Brief is to help people think more clearly about the concept of alignment, and to help them examine what is required for expectations and assessments to be in alignment.
Why Alignment Is Important
Educators, notably through efforts spearheaded by national professional associations, increasingly recognize the need for major reform in K-12 mathematics and science curricula, and are embracing a vision of ambitious content for all students. Making this vision a reality means encouraging "a far deeper and more dynamic level of instructional decision making" (Baker, Freeman, & Clayton, 1991), something that cannot be done simply by mandating new accountability measures. At the heart of these efforts to make deep changes in instruction is the concept of "alignment." The major elements of an education system must work together to help students achieve higher levels of mathematical and scientific understanding.
Educators increasingly recognize that, if policy elements are not aligned, the system will be fragmented, send mixed messages, and be less effective (CPRE, 1991; Newmann, 1993). 2 For example, the Systemic Initiatives program of the National Science Foundation seeks to help states, districts, and regions establish policies based, in part, on assessments aligned with those goals. Other examples: The U.S. Department of Education's explanation of the Goals 2000: Educate America Act, and the Improving America's Schools Act (which includes Title I), both say that alignment of curriculum, instruction, professional development, and assessments is a key performance indicator for states, districts, and schools that are striving to meet challenging standards.
As more and more attention is paid to the accountability of education systems, alignment between assessments and expectations for learning becomes not only critical, but also essential. Just as a schooner's speed increases when its sails are set properly, alignment among an education system's policy elements will strengthen that system, and improve what the system is able to attain. Alignment is critical to helping an education system articulate and maintain its desired course and intensity. An aligned system is better able to focus its resources and thereby strengthen its capacity for making deep, meaningful changes in instructional decision making and practice. Alignment also serves to keep local policy efforts in synch with larger-scale initiatives. 3
This Brief focuses on alignment between two major elements of education policy: expectations and assessments.
There are, of course, many other important elements in any education system, including professional development, instructional materials, college entrance requirements, teacher certification, resource allocations, and state mandates. But this Brief focuses on expectations and assessments because those elements are now of great concern among educators and policymakers, and because those are the elements at the center of most thinking about alignment to date.
Methods of Alignment
Determining alignment between expectations and assessments is difficult for several reasons. To begin with, both expectations and assessments frequently are expressed in several pieces or documents, making it difficult to assemble a complete picture. Also, it is difficult to establish a common language for describing different elements of policy. The same term may have very different meanings when used to define a goal and when used to describe something measured by assessment. Further, the policy environment in an education system can be constantly changing. New goals can be mandated, for example, while old forms of assessment are still in place. Ever-expanding content areas, expanding technology, and a growing body of research on learning also can contribute to the complexity of identifying expectations and assessments.
A review of current practice and relevant literature identifies three major approaches to assuring alignment. These are not the only approaches, however, nor should they be seen as items on a menu to be chosen and then applied in pure form. In most situations, some combination of these approaches is appropriate.
Sequential Development. Policy elements, such as expectations and assessments, are aligned by design. A set of standards, for instance, might be converted directly into specifications for developing an assessment. Once one policy element is established, it becomes the blueprint for subsequent elements. For example, the South Carolina Department of Education (1996) approved standards in a content area that are used to develop academic achievement standards (measurable outcomes), which are then used to develop assessment instruments.
One disadvantage to this approach is the amount of time needed to put a sequentially developed program in place. This approach also ignores a synergism among policy elements: The development of assessments, for example, can provide useful information for thinking about instruction and what students can be expected to learn. Another disadvantage to this approach is that it frequently does not reflect reality: In many states, the process for developing expectations and assessments is not linear or sequential, but more dynamic.
Expert Review. A panel of experts reviews the policy elements and makes some judgment on their alignment. For example, the Oregon Department of Education convened a national panel to look at various issues related to its standards (Roeber, 1996). A subpanel looked at the alignment of the planned assessments and the standards.
The format and formality of this process can vary. In many states the process is an open one, seeking input from committees and community forums of teachers, administrators, parents, and others. Whatever format is used, however, it must include input from content-area specialists, because the comparisons to be made are complex and require sophisticated knowledge about how students learn.
Document Analysis. Alignment can be measured by coding and analyzing the documents that convey the expectations and assessments. A coding system must be developed that specifies the distinctions to be made in describing each document. The documents are then divided into blocks, such as individual standards, which can be described separately using the coding system's categories. Coders must be trained to independently and validly describe the documents by using the coding categories on the blocks of document information (Schmidt & McKnight, 1995; Porter, 1995). For example, the Third International Mathematics and Science Study successfully trained national teams to perform document analyses comparing curriculum materials with assessments used in the study (McKnight, Britton, Valverde, & Schmidt, 1992).
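To make the idea of coding blocks more concrete, the following is a minimal sketch in Python, not a procedure taken from the studies cited above. It assumes each block (a standard or an assessment item) has already been given exactly one category label by a coder. The category names, the sample data, and the functions category_shares and coverage_gaps are all hypothetical and serve only as illustration.

```python
from collections import Counter


def category_shares(coded_blocks):
    """Return the share of blocks assigned to each coding category."""
    if not coded_blocks:
        raise ValueError("at least one coded block is required")
    counts = Counter(coded_blocks)
    total = sum(counts.values())
    return {category: n / total for category, n in counts.items()}


def coverage_gaps(expectation_codes, assessment_codes):
    """List categories found in the expectations but in no assessment item."""
    return sorted(set(expectation_codes) - set(assessment_codes))


# Hypothetical coding results: one category label per block.
standards = ["number", "geometry", "geometry", "data", "algebra"]
items = ["number", "number", "geometry", "data"]

shares = category_shares(items)        # emphasis of the assessment by category
gaps = coverage_gaps(standards, items)  # expected content that is never assessed
```

A tally of this kind only shows where emphasis differs between the two documents. Judging whether a difference matters still requires the kind of expert judgment described above.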
These approaches and their interactions raise questions about quality control. Sequential development, for example, frequently is controlled within an agency and therefore is less likely to include any external review. While such reviews add authority, they can't always be done within the short time lines required by legislative mandates or administrative pressures. The quality of expert review, on the other hand, depends on how qualified the reviewers are, and whether they have the opportunity to interact and build consensus. And the quality of document analysis depends on the validity of the scoring rubric being used, the quality of training, and the reliability of the coders.
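One common way to examine the reliability of coders is to have two coders label the same blocks independently and then compare their labels. The sketch below, in Python, computes simple percent agreement and Cohen's kappa, a standard statistic that corrects agreement for chance. Kappa is not named in this Brief and is offered here only as one plausible choice. The function names and the sample labels are hypothetical.

```python
from collections import Counter


def percent_agreement(coder_a, coder_b):
    """Fraction of blocks on which two coders chose the same category."""
    if not coder_a or len(coder_a) != len(coder_b):
        raise ValueError("both coders must label the same non-empty set of blocks")
    matches = sum(a == b for a, b in zip(coder_a, coder_b))
    return matches / len(coder_a)


def cohens_kappa(coder_a, coder_b):
    """Agreement between two coders, corrected for agreement expected by chance."""
    observed = percent_agreement(coder_a, coder_b)
    n = len(coder_a)
    freq_a = Counter(coder_a)
    freq_b = Counter(coder_b)
    # Chance agreement: probability both coders pick the same category at random,
    # given how often each coder used each category.
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both coders used a single identical category throughout
    return (observed - expected) / (1 - expected)


# Hypothetical labels assigned by two coders to the same four blocks.
coder_a = ["number", "geometry", "data", "geometry"]
coder_b = ["number", "geometry", "geometry", "geometry"]

agreement = percent_agreement(coder_a, coder_b)
kappa = cohens_kappa(coder_a, coder_b)
```

In this example the coders agree on three of four blocks, but kappa is noticeably lower than the raw agreement because both coders use the "geometry" category often, which makes some agreement likely by chance alone.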
Most likely, these approaches will be used in conjunction with each other. One approach will be used to verify another, or two or three approaches will be used together. An expert panel, for example, may use document analysis to judge alignment.
These approaches to judging alignment are strengthened by using specific criteria to assure agreement among expectations and assessments. The following criteria were identified through a review of national and state standards and alignment studies. They were adjusted after review by a panel of assessment experts from the National Institute for Science Education and the Council of Chief State School Officers, state curriculum supervisors, and others. It is expected that this set of criteria will evolve as it is used.
The five categories are intended to be a comprehensive set for judging the alignment between expectations and assessments. Each general category and all subcategories are important in ascertaining the coherence of a system, meaning the degree to which assessments and expectations converge to direct and measure student learning. In practice, reaching full agreement between expectations and assessments on all criteria is extremely difficult. Tradeoffs must be made because real constraints exist on any education system, including resources, finances, time, and legal authority. Decision makers must consider potential consequences when deciding what tradeoffs to make among these criteria, or what level of compliance will be acceptable.
Such decisions will hinge on a number of factors. Assessing the depth of content knowledge, for example, can conflict with assessing the breadth of knowledge (these concepts are explained in greater detail below). Given finite resources, it may be difficult to fully explore both. Decision makers will need to choose which criteria are considered more important within a particular context and why, and how those decisions affect the pursuit of alignment.
Because resources are finite, decision makers also will need to think broadly about expectations and assessments. It may be far more reasonable and cost-efficient, for example, to give teachers the responsibility of assessing students' abilities at reasoning and problem-solving, instead of trying to measure them through new systemwide tests. Whether assessments are carried out at the classroom level, locally, or systemwide, however, the focus must be the same: achieving a high degree of match between what students are expected to know and what information is gathered on their knowledge.
The following criteria 4 are ordered to consider content first, then students, instruction, and finally system concerns.
Above all else, when using these criteria to judge the alignment of expectations and assessments in a system, a sense of reality needs to be maintained. The available resources, the amount of time available, legislative mandates, and other factors will influence how well alignment can be determined and how practical it is to make such determinations.
The alignment of expectations and assessments is a key underlying principle of systemic and standards-based reform. Establishing alignment among policy elements is an early activity for improving the potential for realizing significant reform. Those working to build aligned systems should not think too narrowly about the task. The criteria presented here demonstrate that a number of factors can be considered in judging alignment among policy elements. These can be studied in several alternative and potentially complementary ways. 5
In approaching reform, the consideration of alignment cannot come too soon. And just as educators need to remain vigilant to assure that expectations, assessments, and instructional practices are current, they also will need to review the alignment among these major policy elements as new policies are instituted, new administrative rules are imposed, and system needs are changed.
1 This Brief is the result of a collaboration between the National Institute for Science Education and the Council of Chief State School Officers. The CCSSO effort is supported by Grant #9554462 from the National Science Foundation.
2 The type of alignment referred to here is "horizontal alignment," meaning the degree to which standards, frameworks, and assessments work together within an education system. This is different from another critical factor, "vertical alignment," which is the degree to which the elements of an education system are aligned with other forces, such as national standards, public opinion, work force needs, textbook content, classroom instruction, and student outcomes.
3 Alignment is intimately related to the "validity" of tests, but distinctions can be drawn between the two concepts. Alignment refers to how well all policy elements in a system work together to guide instruction and, ultimately, student learning. Validity, on the other hand, refers to the appropriateness of inferences made from information produced by an assessment. For example, the degree to which a test is aligned with a curriculum framework may affect the test's validity for a single purpose, such as making decisions on the curriculum's effectiveness. But a test and a curriculum framework that are in alignment will work together to communicate a common understanding of what students are to learn, to provide consistent implications for instruction, and to represent fairness for all students, and will be based on sound principles of cognitive development.
4 A more complete discussion of these criteria, and how they can be used, is available in Webb, N. L., Criteria for Alignment of Frameworks, Standards and Student Assessments for Mathematics and Science Education. This paper is a joint publication by the National Institute for Science Education and the Council of Chief State School Officers. For more information, contact NISE at (608)263-1028 or via the NISE World Wide Web site: http://www.wcer.wisc.edu/nise.
5 The complete paper includes a more detailed description of procedures and scales useful for judging attainment of these criteria.
FOR FURTHER READING
Baker, E. L., Freeman, M., & Clayton, S. (1991). Cognitive assessment of history for large-scale testing. In M. C. Wittrock & E. L. Baker (Eds.), Testing and cognition (pp. 131-153). Englewood Cliffs, NJ: Prentice-Hall.
Baxter, G. P., Shavelson, R. J., Herman, S. J., Brown, K. A., & Valadez, J. R. (1993). Mathematics performance assessment: Technical quality and diverse student impact. Journal for Research in Mathematics Education, 24(3), 190-216.
Blank, R. K., Pechman, E. M., & Goldstein, D. (1996). State mathematics and science standards, frameworks, and student assessments: What is the status of development in the 50 states? Washington, DC: Council of Chief State School Officers.
Cohen, D. K. (1990). A revolution in one classroom: The case of Mrs. Oublier. Educational Evaluation and Policy Analysis, 12(3), 327-345.
Consortium for Policy Research in Education. (1991). Putting the pieces together: Systemic school reform. (CPRE Policy Briefs). New Brunswick, NJ: Eagleton Institute of Politics, Rutgers, The State University of New Jersey.
Illinois Academic Standards Project. (1996). Preliminary draft: Illinois academic standards for public review and comment, English language arts and mathematics, Volume 1, State goals 1-10. Springfield, IL: Author.
McKnight, C., Britton, E. D., Valverde, G. A., & Schmidt, W. H. (1992). Survey of mathematics and science opportunities: Document analysis manual (Research report series No. 42). East Lansing, MI: Third International Mathematics and Science Study, Michigan State University.
National Council of Teachers of Mathematics. (1989). Curriculum and evaluation standards for school mathematics. Reston, VA: Author.
National Council of Teachers of Mathematics. (1991). Professional standards for teaching mathematics. Reston, VA: Author.
National Research Council. (1996). National science education standards. Washington, DC: National Academy Press.
Newmann, F. M. (1993). Beyond common sense in educational restructuring: The issues of content and linkage. Educational Researcher, 22(2), 4-13, 22.
Newmann, F. M., Secada, W. G., & Wehlage, G. G. (1995). A guide to authentic instruction and assessment: Vision, standards, and scoring. Madison, WI: Center on Organization and Restructuring of Schools.
Porter, A. C. (1995). Developing opportunity-to-learn indicators of the content of instruction: Progress report. Madison, WI: Wisconsin Center for Education Research.
Roeber, E. D. (1996). Review of the Oregon content and performance standards. A report of the National Standards Review Team prepared for the Oregon Department of Education. Salem, OR: Oregon Department of Education.
Romberg, T. A., & Carpenter, T. P. (1986). Research on teaching and learning mathematics: Two disciplines of scientific inquiry. In M. C. Wittrock (Ed.), Handbook of research on teaching (3rd ed., pp. 850-873). New York: Macmillan.
Romberg, T. A., Zarinnia, E. A., & Williams, S. (1990). Mandated school mathematics testing in the United States: A survey of state mathematics supervisors. Madison, WI: National Center for Research in Mathematical Sciences Education.
Schmidt, W. H., & McKnight, C. (1995, Fall). Surveying educational opportunity in mathematics and science: An international perspective. Educational Evaluation and Policy Analysis, 3, 337-353.
South Carolina Department of Education. (1996). South Carolina science academic achievement standards (draft). Columbia, SC: Author.
Stein, M. K., Grover, B. W., & Henningsen, M. (1996). Building student capacity for mathematical thinking and reasoning: An analysis of mathematical tasks used in reform classrooms. American Educational Research Journal, 33(2), 455-488.
Virginia Board of Education. (1995). Standards of learning for Virginia public schools. Richmond, VA: Author.