“States have implemented teacher evaluations in their race to avoid the impossible demands of NCLB.” W. James Popham, Evaluating America’s Teachers: Mission Possible?
Getting teacher evaluation right should be a priority, and in my opinion, a higher priority than any other educational initiative on the table right now. How we evaluate teachers will affect more of what happens in our classrooms than any other reform measure we implement. For example, if we make test scores a large part of this evaluation, then we can expect much of what happens instructionally to be directed toward improving test scores. If we make technology a priority, we can expect to see more teachers using it in the classroom. And if getting students to work collaboratively is emphasized, then expect to see teaming as a big part of classroom practice. What gets evaluated is what gets done, period! As Popham points out, "A teacher-appraisal system that inclines teachers to make good instructional decisions is likely to do just that. Conversely, a state teacher-appraisal that points teachers in unsound instructional directions, will, unfortunately, also just do that."
Because so much depends on what we include in our teacher evaluations, getting them right should not be a hurried process to satisfy federal education policy, though that is what has happened in many states. In order to get waivers from the impossible and ludicrous demands of the federal No Child Left Behind legislation, states have been hurriedly putting together teacher evaluations that use test scores in some manner. This express route to using test scores to determine teacher quality should frighten any educator or parent, because how students do on tests is likely to become the center of what we do in our classrooms. In short, our schools, every single one of them, become massive "test-prep centers." Let's just hope those evaluations encourage sound instructional decisions and not unsound ones. Otherwise, we could end up with an education system much worse off than the one we have.
What, then, are some major mistakes states could make in this rush to implement federally mandated teacher evaluations? In his book Evaluating America's Teachers: Mission Possible?, W. James Popham describes what he calls "Four Teacher Evaluation Implementation Mistakes," which are, perhaps, a good starting point for critiquing these state teacher evaluation schemes.
Mistake 1: Using Inappropriate Evidence of a Teacher's Quality
According to Popham, implementation mistake one is simply using "poorly chosen evidence" to determine teacher effectiveness. Evidence is poorly chosen when it does not tell us anything accurate or valid about a teacher's quality. For example, an observation by a poorly trained or untrained classroom observer may not provide us with appropriate evidence to determine a teacher's quality. State achievement tests may be poorly chosen evidence as well. As Popham clearly points out, "There is almost zero evidence that these state accountability tests yield data permitting valid inferences about a teacher's instructional quality." So even standardized test scores could be inappropriate evidence if one can't make a valid inference about teacher quality from those tests. With this mistake in mind, it is vital that administrators and teachers carefully scrutinize what states choose as evidence in their teacher evaluations.
Mistake 2: Improperly Weighting Evidence of a Teacher's Quality
Improperly weighting evidence means assigning the wrong weights to the various forms of evidence used in teacher evaluations. The sources of evidence most often used include: 1) student test performances, 2) administrator ratings of teachers' skills, 3) classroom observations, and 4) parent or student ratings of teachers. As Popham points out, "a weighting mistake occurs when a given source of evidence is given either far greater, or far lesser, evaluative importance than it should be given." In other words, this error occurs when states place too much or too little emphasis on one source of evidence. For example, some states weight test scores as 50 percent of their evaluation schemes. Such heavy weighting of test scores is improper if those tests do not really allow one to make any valid inferences about teacher quality. Properly weighting the evidence means doing so in a reasonable and fair manner. Teachers and administrators should scrutinize the weighting scheme states apply to teacher evaluations just as carefully as they scrutinize the evidence itself.
Mistake 3: Failing to Adjust Evaluative Weights of Evidence for a Particular Teacher's Instructional Setting
As Popham points out, "To evaluate teachers as though they were operating in identical instructional settings is naive." Using a "cookie-cutter" evaluation system that fails to take into account the unique qualities of each teacher's instructional setting forces standardization when classrooms are far from being standardized. Doing so will not make them standardized either. When a teacher evaluation system weights all evidence without taking into account a teacher's instructional setting, it forces that teacher to do the "hoop-jumping" dance just to satisfy the evaluation. Classrooms are as diverse as the teachers teaching in them. Even in the factory-designed schools of the 20th century, instructional settings were not standardized, though some nostalgically think so. To evaluate teachers as if they were all teaching under the same conditions shows pure ignorance of what goes on in classrooms. They are highly complex and diverse environments, and to think one can evaluate them all the same way ignores this diversity. Teachers and administrators would do well to scrutinize the evaluative weights states place on various forms of evidence and demand that those weighting systems accommodate the diversity of instructional settings that exist.
Mistake 4: Confusing the Roles of Formative and Summative Teacher Evaluation
As Popham points out, "Formatively, we want to improve teachers' prowess so they can do their most effective job in helping students learn. Summatively, we want to identify the exceptional teachers who should be rewarded as well as those teachers who, if they cannot be helped, should be relieved of their teaching responsibilities." Mistake four involves combining these two functions of teacher evaluation. According to Popham, the two functions can conflict with each other, making neither effective. In addition, isn't it unrealistic to expect teachers to open up and be candid and reflective about their performance with the person who holds their future employment in his or her hands? Perhaps it is time for teachers and administrators to advocate for separating the formative and summative functions of teacher evaluation so that each works more effectively and both can do their job of improving teaching.
Getting teacher evaluation right should be a high priority, because how teachers are evaluated is going to directly affect how instruction is carried out in the classroom. Unfortunately, in my opinion, states, including my own state of North Carolina, have rushed to satisfy federal mandates for that evaluation system, and I fear the negative impact on education and the teaching profession in North Carolina, and around the country, is going to be felt for years.