A great research-based top 10 list of the problems with using value-added modelling (VAM) to sort and rank teachers, via Vamboozled.
- VAM estimates should not be used to assess teacher effectiveness. The standardized achievement tests on which VAM estimates are based have always been, and continue to be, developed to assess levels of student achievement, not levels of growth in student achievement, and not growth in achievement that can be attributed to teacher effectiveness. Among other issues, the tests on which VAM estimates are based were never designed to estimate teachers’ causal effects.
- VAM estimates are often unreliable. Teachers who should be (more or less) consistently effective are being classified in sometimes highly inconsistent ways over time. A teacher classified as “adding value” has a 25 to 50% chance of being classified as “subtracting value” the following year(s), and vice versa. This sometimes makes the probability of a teacher being identified as effective no different than the flip of a coin.
- VAM estimates are often invalid. Because reliability is a precondition for validity, VAM-based interpretations that lack adequate reliability are even more difficult to defend as valid. Likewise, very limited evidence exists to support that teachers who post high or low value-added scores are correspondingly more or less effective according to at least one other correlated criterion (e.g., teacher observational scores, teacher satisfaction surveys). The correlations being demonstrated across studies are not nearly high enough to support valid interpretation or use.
- VAM estimates can be biased. Students are almost never randomly assigned to classrooms, and teachers of certain students have more difficulty demonstrating value-added than their comparably effective peers. Estimates for teachers who teach inordinate proportions of English Language Learners (ELLs), special education students, students who receive free or reduced-price lunches, and students retained in grade are more adversely impacted by bias. Bias can also masquerade as reliability: when teachers post consistently high or low levels of value-added over time, the apparent consistency can sometimes be due instead to those teachers being consistently assigned more homogeneous sets of students.
- Relatedly, VAM estimates are fraught with measurement errors that undermine their reliability and validity and contribute to issues of bias. These errors are caused by inordinate amounts of inaccurate or missing data that cannot be easily replaced or disregarded; variables that cannot be statistically “controlled for”; differential summer learning gains and losses and prior teachers’ residual effects that also cannot be “controlled for”; the effects of teaching in non-traditional, non-isolated, and non-insular classrooms; and the like.
- VAM estimates are unfair. Issues of fairness arise when test-based indicators and their inference-based uses impact some more than others in consequential ways. With VAMs, only teachers of mathematics and reading/language arts with pre- and post-test data in certain grade levels (e.g., grades 3-8) are typically being held accountable. Across the nation, this leaves approximately 60-70% of teachers, including entire campuses of teachers (e.g., early elementary and high school teachers), VAM-ineligible.
- VAM estimates are non-transparent. Estimates must be made transparent in order to be understood, so that they can ultimately be used to “inform” change and progress in “[in]formative” ways. However, the teachers and administrators who are supposed to use VAM estimates in this way do not typically understand the VAMs or VAM estimates being used to evaluate them, and certainly not well enough to promote such change.
- Relatedly, VAM estimates are typically of no informative, formative, or instructional value. No research to date suggests that VAM use has improved teachers’ instruction or student learning and achievement.
- VAM estimates are being used inappropriately to make consequential decisions. VAM estimates do not have enough consistency, accuracy, or depth to serve the purposes with which VAMs are increasingly being tasked, for example, helping to make high-stakes decisions about whether teachers receive merit pay, are awarded or denied tenure, or are retained or, conversely, terminated. While proponents argue that, because of VAMs’ imperfections, VAM estimates should not be used in isolation from other indicators, the fact of the matter is that VAMs are so imperfect they should not be used for much of anything unless largely imperfect decisions are desired.
- The unintended consequences of VAM use continue to go unrecognized, although research suggests they continue to exist. For example, teachers are choosing not to teach certain students, including those whom teachers deem most likely to hinder their potential to demonstrate value-added. Principals are stacking classes to make sure certain teachers are more or less likely to demonstrate “value-added,” to protect or penalize those teachers, respectively. Teachers are leaving or refusing assignments to grades in which VAM-based estimates matter most, and some teachers are leaving teaching altogether out of discontent or in protest. About the seriousness of these and other unintended consequences, weighed against VAMs’ intended consequences or the lack thereof, proponents and others simply do not seem to give a VAM.
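
For readers who want to see how the "flip of a coin" point about reliability plays out numerically, here is a minimal sketch. It is not taken from the research behind the list. It assumes, purely for illustration, that "adding value" simply means scoring above average, and that a teacher's estimates in two years are normally distributed with some year-to-year correlation `r`. The function names and the noise model are hypothetical choices made for this example.

```python
import math
import random


def flip_probability(r):
    """Chance that a teacher lands on opposite sides of average in two years.

    Illustrative assumption: the two yearly estimates are bivariate normal,
    centred on the average, with year-to-year correlation r.
    """
    return math.acos(r) / math.pi


def simulated_flip_rate(r, n_teachers=100_000, seed=0):
    """Check the formula by simulation: stable true effect plus yearly noise."""
    if not 0 < r <= 1:
        raise ValueError("r must be in (0, 1]")
    rng = random.Random(seed)
    # With a unit-variance true effect, this noise level makes the
    # correlation between the two yearly estimates equal to r.
    noise_sd = math.sqrt((1 - r) / r)
    flips = 0
    for _ in range(n_teachers):
        true_effect = rng.gauss(0, 1)
        year1 = true_effect + rng.gauss(0, noise_sd)
        year2 = true_effect + rng.gauss(0, noise_sd)
        # Count teachers classified differently in the two years.
        flips += (year1 > 0) != (year2 > 0)
    return flips / n_teachers


if __name__ == "__main__":
    for r in (0.1, 0.3, 0.5, 0.7):
        print(f"r={r:.1f}  formula={flip_probability(r):.3f}  "
              f"simulated={simulated_flip_rate(r):.3f}")
```

Under these assumptions, a 25 to 50% chance of being reclassified corresponds to a year-to-year correlation of roughly 0.7 or lower, and a correlation of zero gives exactly the coin flip.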