As more and more schools implement various forms of value-added measurement (VAM) evaluation systems, we are learning some disturbing things about the reliability of these methods.
Education Week's Stephen Sawchuk, in "'Value-Added' Measures at Secondary Level Questioned," explains that value-added statistical modeling was once limited to analyzing large data sets. These models projected students' test score growth based on their past performance and thus estimated a growth target. But now 30 states require teacher evaluations to incorporate student performance, and that mandate has expanded the use of these algorithms for high-stakes purposes. Value-added estimates are now being applied in secondary schools, even though the vast majority of research on their use has been limited to elementary schools.
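For readers who want to see what that means in practice, here is a minimal sketch of the basic idea, using made-up numbers and a deliberately simple linear model (no state's actual specification is this simple, and every name and number below is invented for illustration): predict each student's score from past performance, then credit each teacher with the average amount by which his or her students beat the prediction.

```python
# Toy illustration of a value-added estimate (hypothetical data and model,
# not any state's actual specification).
import numpy as np

rng = np.random.default_rng(0)

# Simulated students: prior-year score, current-year score, teacher id.
n = 300
prior = rng.normal(50, 10, n)
teacher = rng.integers(0, 3, n)                    # three hypothetical teachers
true_effect = np.array([2.0, 0.0, -2.0])           # assumed "true" teacher effects
current = 5 + 0.9 * prior + true_effect[teacher] + rng.normal(0, 5, n)

# Step 1: fit a simple growth model (current score predicted from prior score).
slope, intercept = np.polyfit(prior, current, 1)
predicted = intercept + slope * prior

# Step 2: a teacher's value-added score is the average amount by which
# that teacher's students beat (or miss) their predicted scores.
residual = current - predicted
for t in range(3):
    print(f"Teacher {t}: value-added estimate = {residual[teacher == t].mean():+.2f}")
```

With enough students and data this clean, the estimates track the assumed teacher effects; the research Sawchuk describes asks what happens when the data are not this clean.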
Sawchuk reports on two major studies that should slow this rush to evaluate all teachers with experimental models. This month, Douglas Harris will present "Bias of Public Sector Worker Performance Monitoring," based on six years of Florida middle school data covering 1.3 million math students.
Harris divides classes into three types: remedial, midlevel, and advanced. After controlling for tracking, he finds that between 30 and 70% of teachers would be placed in the wrong category by normative value-added models. Moreover, Harris discovers that teachers who taught more remedial classes tended to have lower value-added scores than teachers who taught mainly higher-level classes. "That phenomenon was not due to the best teachers' disproportionately teaching the more-rigorous classes, as is often asserted. Instead, the paper shows, even those teachers who taught courses at more than one level of rigor did better when their performance teaching the upper-level classes was compared against that from the lower-level classes."
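A toy simulation makes it easy to see how that kind of bias can arise. In the hypothetical sketch below (the track-level growth gap is an assumption chosen for illustration, not a number from Harris's paper), one and the same teacher teaches a remedial section and an advanced section; because the advanced students gain more for reasons that have nothing to do with the teacher, a model that controls only for prior scores rates the remedial section lower.

```python
# Hypothetical simulation of tracking bias in a simple value-added model.
# The growth gap between tracks is assumed for illustration only.
import numpy as np

rng = np.random.default_rng(1)
n = 200

# One teacher, two sections. Advanced students start higher AND, by
# assumption, gain more for reasons unrelated to the teacher (peers,
# curriculum pacing), which a prior-score-only model cannot see.
prior_remedial = rng.normal(40, 8, n)
prior_advanced = rng.normal(60, 8, n)
gain_remedial = 3 + rng.normal(0, 4, n)            # assumed track-level gain
gain_advanced = 7 + rng.normal(0, 4, n)

prior = np.concatenate([prior_remedial, prior_advanced])
current = prior + np.concatenate([gain_remedial, gain_advanced])
section = np.array(["remedial"] * n + ["advanced"] * n)

# Value-added model: predict the current score from the prior score only.
slope, intercept = np.polyfit(prior, current, 1)
residual = current - (intercept + slope * prior)

# The identical teacher looks weaker in the remedial section.
for s in ("remedial", "advanced"):
    print(f"{s} section: value-added estimate = {residual[section == s].mean():+.2f}")
```

The teacher's actual contribution is identical in both sections, yet the remedial section's value-added estimate comes out negative and the advanced section's positive, which is consistent with the pattern Harris describes.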
[readon2 url="http://blogs.edweek.org/teachers/living-in-dialogue/2012/11/john_thompson_new_research_unc.html"]Continue reading...[/readon2]