How Stable are Value-Added Estimates?

Highlights:

  • A teacher’s value-added score in one year is partially but not fully predictive of her performance in the next.
  • Value-added is unstable because true teacher performance varies and because value-added measures are subject to error.
  • Two years of data does a meaningfully better job of predicting value added than just one.
  • A teacher’s value added in one subject is only partially predictive of her value added in another, and her value added for one group of students is only partially predictive of her value added for others.
  • The variation of a teacher’s value added across time, subject, and student population depends in part on the model with which it is measured and the source of the data that is used.
  • Year-to-year instability suggests caution when using value-added measures to make decisions for which there are no mechanisms for re-evaluation and no other sources of information.

Introduction

Value-added models measure teacher performance by the test score gains of their students, adjusted for a variety of factors, such as the performance of students when they enter the class. The measures are based on desired student outcomes such as math and reading scores, but they have a number of potential drawbacks. One of them is the inconsistency of estimates for the same teacher when value added is measured in a different year, for different subjects, or for different groups of students.
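In practice, a common way to estimate value added is a regression of students’ current scores on their prior scores plus teacher indicators. The sketch below illustrates that idea in Python; the data, column names, and bare-bones specification are hypothetical, not the particular models discussed in the brief.

```python
# A minimal value-added sketch: regress current scores on prior scores
# plus teacher fixed effects. Data and column names are hypothetical.
import pandas as pd
import statsmodels.formula.api as smf

# One row per student: current score, prior-year score, and teacher.
df = pd.DataFrame({
    "score":       [72, 65, 80, 58, 90, 77, 61, 84],
    "prior_score": [70, 60, 75, 55, 88, 70, 65, 80],
    "teacher":     ["A", "A", "B", "B", "C", "C", "D", "D"],
})

# Each teacher coefficient is that teacher's value-added estimate,
# relative to the omitted reference teacher, after adjusting for
# where students started.
model = smf.ols("score ~ prior_score + C(teacher)", data=df).fit()
print(model.params.filter(like="teacher"))
```

Real models typically also adjust for student demographics and classroom composition, and shrink noisy estimates toward the mean, which is part of why different models can rank the same teacher differently.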

Some of the differences in value added from year to year result from true differences in a teacher’s performance. Differences can also arise from classroom peer effects; the students themselves contribute to the quality of classroom life, and this contribution changes from year to year. Other differences come from the tests on which the value-added measures are based; because test scores are not perfectly accurate measures of student knowledge, it follows that they are not perfectly accurate gauges of teacher performance.
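A toy simulation makes the instability point concrete. The sketch below assumes a stylized world in which each year’s estimate is a stable true teacher effect plus independent noise; the equal signal and noise variances are arbitrary choices for illustration, not estimates from real data.

```python
# Why value-added estimates bounce around: each year's estimate mixes a
# stable true effect with yearly noise (stylized, made-up variances).
import numpy as np

rng = np.random.default_rng(0)
n = 1000
true_effect = rng.normal(0, 1, n)        # persistent teacher quality
year1 = true_effect + rng.normal(0, 1, n)
year2 = true_effect + rng.normal(0, 1, n)
year3 = true_effect + rng.normal(0, 1, n)

# With equal signal and noise variances, consecutive-year estimates
# correlate around 0.5 even though true quality never changed.
print(np.corrcoef(year1, year2)[0, 1])

# Averaging two years dampens the noise, so the average predicts a
# third year better than a single year does.
two_year_avg = (year1 + year2) / 2
print(np.corrcoef(year1, year3)[0, 1])
print(np.corrcoef(two_year_avg, year3)[0, 1])
```

This is the same logic behind the highlight above: two years of data predict future value added meaningfully better than one.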

In this brief, we describe how value-added measures for individual teachers vary across time, subject, and student populations. We discuss how additional research could help educators use these measures more effectively, and we pose new questions, the answers to which depend not on empirical investigation but on human judgment. Finally, we consider how the current body of knowledge, and the gaps in that knowledge, can guide decisions about how to use value-added measures in evaluations of teacher effectiveness.

[readon2 url="http://www.carnegieknowledgenetwork.org/briefs/value-added/value-added-stability/"]Continue reading...[/readon2]

Mapping State Proficiency Standards Onto NAEP Scales

The National Assessment of Educational Progress (NAEP) has just published its report, "Mapping State Proficiency Standards Onto NAEP Scales: Variation and Change in State Standards for Reading and Mathematics, 2005-2009."

This research looked at the following questions (a simple sketch of the mapping idea follows the findings):

How do states’ 2009 standards for proficient performance compare with one another when mapped onto the NAEP scale?

  • There is wide variation among state proficiency standards.
  • Most states’ proficiency standards are at or below NAEP’s definition of Basic performance.

How do the 2009 NAEP scale equivalents of state standards compare with those estimated for 2007 and 2005?

  • Most states that made substantive changes in their assessments between 2007 and 2009 moved toward more rigorous standards as measured by NAEP.
  • For states that made substantive changes between 2005 and 2009, changes in the rigor of their standards as measured by NAEP were mixed, but showed more decreases than increases.

Does NAEP corroborate a state’s changes in the proportion of students meeting the state’s standard for proficiency from 2007 to 2009? From 2005 to 2009?

  • Changes in the proportion of students meeting states’ standards for proficiency between 2007 and 2009 are not corroborated by the proportion meeting proficiency, as measured by NAEP, in at least half of the states in the comparison sample.
  • Comparisons of changes between 2005 and 2009 with proficiency as measured by NAEP were mixed.
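For the curious, the mapping rests on a simple percentile-matching idea: a state’s standard is placed at the NAEP score where the share of the state’s NAEP-tested students at or above that score equals the share proficient on the state’s own test. The numbers below are made up for illustration, and the report’s actual procedure handles sampling and measurement error that this sketch ignores.

```python
# Percentile-matching sketch of mapping a state standard onto the NAEP
# scale. All numbers are made up for illustration.
import numpy as np

rng = np.random.default_rng(1)
naep_scores = rng.normal(220, 35, 5000)  # hypothetical state NAEP sample

state_pct_proficient = 0.72  # share proficient on the state's own test

# The NAEP equivalent is the score with 72% of students at or above it,
# i.e., the (1 - 0.72) quantile of the state's NAEP distribution.
naep_equivalent = np.quantile(naep_scores, 1 - state_pct_proficient)
print(round(naep_equivalent, 1))
```

A lower NAEP equivalent means a less demanding state standard, which is how the report can call most states’ standards Basic or below on the NAEP scale.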

The full report can be found here (PDF). We've pulled out some of the graphs that show Ohio's performance versus the rest of the country for each of the 4th and 8th grade reading and math achievement levels.

4th grade reading

8th grade reading

4th grade math

8th grade math