Making Classroom Assessments Reliable and Valid. Robert J. Marzano

is foundational to demonstrating proficiency. Additionally, it is important to articulate what a student needs to know and do to demonstrate competence beyond the target level of proficiency. To illustrate, consider the following topic that might be the target for third-grade science.

      Students will understand how magnetic forces can affect two objects not in contact with one another.

      To make this topic clear enough that teachers can design multiple assessments that are essentially the same in content and levels of difficulty, it is necessary to expand it to the level of detail depicted in table I.3, which provides three levels of content for the topic. The target level clearly describes what students must do to demonstrate proficiency. The basic level identifies important, directly taught vocabulary and basic processes. Finally, the advanced level describes a task that demonstrates students’ ability to apply the target content.

Level of Content | Content

Advanced | Students will design a device that uses magnets to solve a problem. For example, students will be asked to identify a problem that could be solved using the attracting and repelling qualities of magnets, and create a prototype of the design.

Target | Students will learn how magnetic forces can affect two objects not in contact with one another. For example, students will determine how magnets interact with other objects (including different and similar poles of other magnets), and experiment with variables that affect these interactions (such as orientation of magnets and distance between materials or objects).

Basic | Students will recognize or recall specific vocabulary, such as attraction, bar magnet, horseshoe magnet, magnetic field, magnetic, nonmagnetic, north pole, or south pole. Students will perform basic processes, such as:
      • Explain that magnets create areas of magnetic force around them
      • Explain that magnets always have north and south poles
      • Provide examples of magnetic and nonmagnetic materials
      • Explain how two opposite poles interact (attracting) and two like poles interact (repelling)
      • Identify variables that affect strength of magnetic force (for example, distance between objects, or size)

      Source: Adapted from Simms, 2016.

      The teacher now has three levels of content, all on the same topic, that provide specific directions on how to create classroom assessments that address the same topic at the same levels of difficulty. I discuss how classroom teachers can do this in chapter 2 (page 39).
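One way to picture how the three levels can guide assessment design is to treat them as a small data structure. The following Python sketch is an illustration only and is not taken from the book: the names `ScaleLevel`, `MAGNETISM_SCALE`, `missing_levels`, and `same_blueprint` are hypothetical, the level descriptions are abbreviated paraphrases of table I.3, and counting items per level is an assumed, simplified stand-in for the fuller treatment of parallel assessments in chapter 2.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class ScaleLevel:
    name: str     # "Basic", "Target", or "Advanced"
    content: str  # abbreviated paraphrase of the level's content


# Hypothetical encoding of the three levels of content for the magnetism topic.
MAGNETISM_SCALE = (
    ScaleLevel("Basic", "Recall vocabulary and perform basic processes about magnets"),
    ScaleLevel("Target", "Determine how magnetic forces affect objects not in contact"),
    ScaleLevel("Advanced", "Design a device that uses magnets to solve a problem"),
)


def missing_levels(item_levels, scale=MAGNETISM_SCALE):
    """Return the names of scale levels that no item on a draft assessment addresses."""
    covered = set(item_levels)
    return [level.name for level in scale if level.name not in covered]


def same_blueprint(form_a, form_b):
    """Assumed proxy for comparable drafts: same number of items at each level."""
    return Counter(form_a) == Counter(form_b)


# Each draft assessment is listed as the level each of its items addresses.
form_a = ["Basic", "Basic", "Target", "Advanced"]
form_b = ["Basic", "Target", "Basic", "Advanced"]

print(missing_levels(["Basic", "Target"]))  # levels a two-item draft leaves out
print(same_blueprint(form_a, form_b))       # whether the two drafts match per level
```

A check like this only confirms that two drafts draw on the same levels in the same proportions; judging whether individual items are truly comparable in difficulty still rests with the teacher.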

      Teachers and administrators for grades K–12 will learn how to revamp the concepts of validity and reliability so they match the technical advances made in CA, instead of matching large-scale assessment’s traditional paradigms for validity and reliability. This introduction lays the foundation. It introduces the new validity and reliability paradigms constructed for CAs. Chapters 1–5 describe these paradigms in detail. Chapter 1 covers the new CA paradigm for validity, noting the qualities of three major types of validity and two perspectives teachers can take regarding classroom assessments. Chapter 2 then conveys the variety of CAs that teachers can use to construct parallel assessments, which measure students’ individual growth. Chapter 3 addresses the new CA paradigm for reliability and how it shifts from the traditional conception of reliability; it presents three mathematical models of reliability. Then, chapter 4 expresses how to measure groups of students’ comparative growth and what purposes this serves. Finally, chapter 5 considers helpful changes to report cards and teacher evaluations based on the new paradigms for CAs. The appendix features formulas that teachers, schools, and districts can use to compute the reliability of CAs in a manner that is comparable to the level of precision offered by large-scale assessments.

      chapter 1

      Discussing the Classroom Assessment Paradigm for Validity

      Validity is certainly the first order of business when researchers or educators design CAs. The concept of validity has evolved over the years into a multifaceted construct. As mentioned previously, the initial conception of a test’s validity was that it measures what it purports to measure. As Henry E. Garrett (1937) notes, “the fidelity with which [a test] measures what it purports to measure” (p. 324) is the hallmark of its validity. By the 1950s, though, important distinctions emerged about the nature and function of validity. Samuel Messick (1993) explains that since the early 1950s, validity has been thought of as involving three major types: (1) criterion-related validity, (2) construct validity, and (3) content validity.

      While the three types of validity have unique qualities, these distinctions become more complex because one can examine validity from two perspectives. John D. Hathcoat (2013) explains that these perspectives are (1) the instrumental perspective and (2) the argument-based perspective. Validity in general, and the three different types in particular, looks quite different depending on the perspective. This is a central theme of this chapter, and I make a case for the argument-based perspective as superior, particularly as it relates to CAs. The chapter also covers the following topics.

      ■ Standards as the basis of CA validity

      ■ Dimensionality

      ■ Measurement topics and proficiency scales

      ■ The rise of learning progressions

      ■ The structure of proficiency scales

      ■ The school’s role in criterion-related validity

      ■ The nature of parallel assessments

      ■ The measurement process

      I begin by discussing the instrumental perspective and its treatment of the three types of validity.

      The instrumental perspective focuses on the test itself. According to Hathcoat (2013), this has been the traditional perspective in measurement theory: a specific test is deemed valid to one degree or another. Within the instrumental perspective, then, all three types of validity are considered aspects of a specific test that has been or is being developed. A test possesses certain degrees of the three types of validity.

      For quite some time, measurement experts have warned that the instrumental perspective invites misinterpretations of assessments. For example, in his article “Measurement 101: Some Fundamentals Revisited,” Frisbie (2005) provides concrete examples of the dangers of a literal adherence to an instrumental perspective. About validity, he notes, “Validity is not about instruments themselves, but it is about score interpretations and uses” (p. 22). In effect, Frisbie notes that it is technically inaccurate to refer to the validity of a particular test. Instead, discussion should focus on the valid use or interpretation of the scores from a particular test. To illustrate the
