Do Different Value-Added Models Tell Us the Same Things?

Highlights

  • Statistical models that evaluate teachers based on growth in student achievement differ in how they account for student backgrounds and for school and classroom resources. They also differ in whether they compare teachers across a district (or state) or only within schools.
  • Statistical models that do not account for student background factors produce estimates of teacher quality that are highly correlated with estimates from value-added models that do control for student backgrounds, as long as each includes a measure of prior student achievement.
  • Even when correlations between models are high, different models will categorize many teachers differently (a quick simulation after this list illustrates how).
  • Teachers of advantaged students benefit from models that do not control for student background factors, while teachers of disadvantaged students benefit from models that do.
  • Whether teachers are compared within schools or between them generally has a larger effect on teacher rankings than statistical adjustments for differences in student backgrounds across classrooms.
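
To see how a high correlation and divergent categorizations can coexist, here is a minimal sketch in Python on simulated scores. The 0.9 correlation and the use of quintiles are assumptions chosen for illustration, not figures from the brief.

```python
# A minimal sketch, on simulated scores: two value-added estimates can
# correlate at 0.9 and still place many teachers in different quintiles.
import numpy as np

rng = np.random.default_rng(1)
n = 1000
model_a = rng.normal(size=n)                      # estimates from model A
model_b = 0.9 * model_a + np.sqrt(1 - 0.9**2) * rng.normal(size=n)

def quintile(x):
    """Assign each teacher to a quintile (0-4) of the estimate distribution."""
    return np.digitize(x, np.quantile(x, [0.2, 0.4, 0.6, 0.8]))

disagree = np.mean(quintile(model_a) != quintile(model_b))
print(f"correlation: {np.corrcoef(model_a, model_b)[0, 1]:.2f}")
print(f"share placed in different quintiles: {disagree:.0%}")
```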

Introduction

There are good reasons for re-thinking teacher evaluation. Evaluation systems in most school districts appear to be far from rigorous. A recent study showed that more than 99 percent of teachers in a number of districts were rated “satisfactory,” which does not comport with empirical evidence that teachers differ substantially in effectiveness. Likewise, the ratings do not reflect how administrators, other teachers, or students assess the teacher workforce.

Evaluation systems that fail to recognize the true differences that we know exist among teachers greatly hamper the ability of school leaders and policymakers to make informed decisions about such matters as which teachers to hire, which to help, which to promote, and which to dismiss. Thus it is encouraging that policymakers are developing more rigorous evaluation systems, many of which are partly based on student test scores.

Yet while the idea of using student test scores for teacher evaluations may be conceptually appealing, there is no universally accepted methodology for translating student growth into a measure of teacher performance. In this brief, we review what is known about how measures that use student growth align with one another, and what that agreement or disagreement might mean for policy.

[readon2 url="http://www.carnegieknowledgenetwork.org/briefs/value-added/different-growth-models/"]Continue reading...[/readon2]

How Stable Are Value-Added Estimates?

Highlights

  • A teacher’s value-added score in one year is partially but not fully predictive of her performance in the next.
  • Value-added is unstable because true teacher performance varies and because value-added measures are subject to error.
  • Two years of data do a meaningfully better job of predicting value added than one year does (see the sketch after this list). A teacher’s value added in one subject is only partially predictive of her value added in another, and a teacher’s value added for one group of students is only partially predictive of her value added for others.
  • The variation of a teacher’s value added across time, subject, and student population depends in part on the model with which it is measured and the source of the data that is used.
  • Year-to-year instability suggests caution when using value-added measures to make decisions for which there are no mechanisms for re-evaluation and no other sources of information.
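
The third bullet can be made concrete with a minimal sketch on simulated data: averaging two years of noisy value-added estimates predicts a third year better than a single year does. The noise level here is an assumption chosen for illustration.

```python
# A minimal sketch, on simulated data: a two-year average of noisy
# value-added estimates predicts a third year better than one year alone.
import numpy as np

rng = np.random.default_rng(2)
n = 5000
true = rng.normal(size=n)          # persistent component of teacher quality

def one_year_estimate():
    """True quality plus independent yearly noise (assumed equal variance)."""
    return true + rng.normal(size=n)

y1, y2, y3 = (one_year_estimate() for _ in range(3))
print(f"year 1 alone predicting year 3:     {np.corrcoef(y1, y3)[0, 1]:.2f}")
print(f"two-year average predicting year 3: {np.corrcoef((y1 + y2) / 2, y3)[0, 1]:.2f}")
```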

Introduction

Value-added models measure teacher performance by the test score gains of their students, adjusted for a variety of factors, such as the performance of students when they enter the class. The measures are based on desired student outcomes such as math and reading scores, but they have a number of potential drawbacks. One of them is the inconsistency of estimates for the same teacher when value added is measured in a different year, for a different subject, or for a different group of students.
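
To make the basic mechanics concrete, here is a minimal sketch of one simple value-added model on simulated data (not any district's actual model): students' current scores are regressed on prior scores, and a teacher's value-added estimate is her classroom's average residual.

```python
# A minimal sketch of a simple value-added calculation on simulated data.
import numpy as np

rng = np.random.default_rng(3)
n_teachers, n_students = 100, 25
teacher = np.repeat(np.arange(n_teachers), n_students)

true_effect = rng.normal(0, 0.2, n_teachers)      # hypothetical true teacher effects
prior = rng.normal(size=n_teachers * n_students)  # score on entering the class
current = 0.7 * prior + true_effect[teacher] + rng.normal(0, 0.5, size=prior.size)

# Least-squares fit of current score on prior score (with intercept).
X = np.column_stack([np.ones_like(prior), prior])
beta, *_ = np.linalg.lstsq(X, current, rcond=None)
residual = current - X @ beta

# A teacher's value-added estimate is her classroom's mean residual.
value_added = np.array([residual[teacher == t].mean() for t in range(n_teachers)])
print(np.corrcoef(value_added, true_effect)[0, 1])  # strong, but well below 1
```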

Some of the differences in value added from year to year result from true differences in a teacher’s performance. Differences can also arise from classroom peer effects; the students themselves contribute to the quality of classroom life, and this contribution changes from year to year. Other differences come from the tests on which the value-added measures are based; because test scores are not perfectly accurate measures of student knowledge, it follows that they are not perfectly accurate gauges of teacher performance.

In this brief, we describe how value-added measures for individual teachers vary across time, subject, and student populations. We discuss how additional research could help educators use these measures more effectively, and we pose new questions, the answers to which depend not on empirical investigation but on human judgment. Finally, we consider how the current body of knowledge, and the gaps in that knowledge, can guide decisions about how to use value-added measures in evaluations of teacher effectiveness.

[readon2 url="http://www.carnegieknowledgenetwork.org/briefs/value-added/value-added-stability/"]Continue reading...[/readon2]

Do Value-Added Methods Level the Playing Field for Teachers?

Highlights

  • Value-added measures partially level the playing field by controlling for many student characteristics. But if they don't fully adjust for all the factors that influence achievement and that consistently differ among classrooms, they may be distorted, or confounded. (An estimate of a teacher’s effect is said to be confounded when her contribution cannot be separated from factors outside her control, namely the students in her classroom.)
  • Simple value-added models that control for just a few test scores (or only one score) and no other variables produce measures that underestimate teachers with low-achieving students and overestimate teachers with high-achieving students (the simulation after this list illustrates the effect).
  • The evidence, while inconclusive, generally suggests that confounding is weak. But it would not be prudent to conclude that confounding is not a problem for all teachers. In particular, the evidence on comparing teachers across schools is limited.
  • Studies assess general patterns of confounding. They do not examine confounding for individual teachers, and they can't rule out the possibility that some teachers consistently teach students who are distinct enough to cause confounding.
  • Value-added models often control for variables such as average prior achievement for a classroom or school, but this practice could introduce errors into value-added estimates.
  • Confounding might lead school systems to draw erroneous conclusions about their teachers – conclusions that carry heavy costs to both teachers and society.
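
The confounding described in the second bullet can be sketched on simulated data. Here a student background factor ("advantage") raises both prior scores and growth and is sorted across classrooms; a model controlling only for prior scores then credits that factor to the teacher. All quantities are illustrative assumptions, not estimates from the brief.

```python
# A minimal sketch of confounding from an omitted background factor.
import numpy as np

rng = np.random.default_rng(4)
n_teachers, n_students = 200, 25
teacher = np.repeat(np.arange(n_teachers), n_students)

advantage = np.repeat(rng.normal(size=n_teachers), n_students)  # one level per classroom
true_effect = rng.normal(0, 0.2, n_teachers)                    # unrelated to advantage
prior = advantage + rng.normal(size=advantage.size)
current = (0.7 * prior + 0.3 * advantage + true_effect[teacher]
           + rng.normal(0, 0.5, size=prior.size))

X = np.column_stack([np.ones_like(prior), prior])   # controls for prior score only
beta, *_ = np.linalg.lstsq(X, current, rcond=None)
residual = current - X @ beta
value_added = np.array([residual[teacher == t].mean() for t in range(n_teachers)])

classroom_advantage = advantage[::n_students]
# Positive correlation: teachers of advantaged classrooms are overestimated.
print(np.corrcoef(value_added, classroom_advantage)[0, 1])
```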

Introduction

Value-added models have caught the interest of policymakers because, unlike other ways of using student test scores for accountability, they purport to "level the playing field." That is, they supposedly reflect only a teacher's effectiveness, not whether she teaches high- or low-income students, for instance, or students in accelerated or standard classes. Yet many people are concerned that a teacher's value-added score will be sensitive to the characteristics of her students. More specifically, they believe that teachers of low-income, minority, or special education students will have lower value-added scores than equally effective teachers who teach students outside these populations. Others worry that the opposite might be true: that some value-added models might give teachers of low-income, minority, or special education students higher value-added scores than equally effective teachers who work with higher-achieving, lower-risk populations.

In this brief, we discuss what is and is not known about how well value-added measures level the playing field for teachers by controlling for student characteristics. We first discuss the results of empirical explorations. We then address outstanding questions and the challenges to answering them with empirical data. Finally, we discuss the implications of these findings for teacher evaluations and the actions that may be based on them.

[readon2 url="http://www.carnegieknowledgenetwork.org/briefs/value-added/level-playing-field/"]Continue reading...[/readon2]

Stop blaming teachers

The Merriam-Webster dictionary defines “scapegoat” as “one that bears the blame for others” or “one that is the object of irrational hostility.” Those of us in the education profession would define scapegoat this way: teacher.

Scapegoating teachers has become so popular with policymakers and politicians, the media, and even members of the public that it has blurred the reality of what’s really happening in education. What’s more, it’s eroding a noble profession and wreaking havoc on student learning, says Kevin Kumashiro, author of Bad Teacher!: How Blaming Teachers Distorts the Bigger Picture.

In his book, Kumashiro, president of the National Association for Multicultural Education and professor of Asian American Studies and Education at the University of Illinois at Chicago, explains how scapegoating public-school teachers, teacher unions, and teacher education masks the real, systemic problems in education. He also demonstrates how trends like market-based reforms and fast-track teacher certification programs create obstacles to an equitable education for all children.

[readon2 url="http://neatoday.org/2012/11/26/stop-blaming-teachers/"]Continue reading...[/readon2]

How Should Educators Interpret Value-Added Scores?

Highlights

  • Each teacher, in principle, possesses one true value-added score each year, but we never see that "true" score. Instead, we see a single estimate within a range of plausible scores.
  • The range of plausible value-added scores (the confidence interval) can overlap considerably for many teachers. Consequently, many teachers cannot be readily distinguished from one another with respect to their true value-added scores.
  • Value-added estimates would achieve high reliability under two conditions: if teachers’ value-added measurements were more precise, and if teachers’ true value-added scores varied more dramatically than they do.
  • Two kinds of errors of interpretation are possible when classifying teachers based on value-added: a) “false identifications” of teachers who are actually above a certain percentile but who are mistakenly classified as below it; and b) “false non-identifications” of teachers who are actually below a certain percentile but who are classified as above it. Falsely identifying teachers as being below a threshold poses risk to teachers, but failing to identify teachers who are truly ineffective poses risks to students.
  • Districts can conduct a procedure to gauge how uncertainty about true value-added scores contributes to potential errors of classification. First, specify the group of teachers you wish to identify. Then, specify the fraction of false identifications you are willing to tolerate. Finally, specify the likely correlation between value-added scores this year and next. (A sketch of this calculation follows the list.) In most real-world settings, the degree of uncertainty will lead to considerable rates of misclassification of teachers.
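
A minimal sketch of that procedure follows, under a simplifying assumption the brief does not spell out: scores are normal, and the year-to-year correlation r equals the reliability of a single year's score, so the estimate correlates with the true score at sqrt(r). The numbers are illustrative, not the authors'.

```python
# A minimal sketch: how year-to-year correlation translates into a
# false-identification rate when flagging the bottom 20 percent of teachers.
import numpy as np

rng = np.random.default_rng(5)
n = 100_000
r = 0.4                                   # assumed year-to-year correlation
true = rng.normal(size=n)
estimate = np.sqrt(r) * true + np.sqrt(1 - r) * rng.normal(size=n)

# Step 1: the group to identify -- teachers truly in the bottom 20 percent.
cutoff = np.quantile(true, 0.20)
# Flag teachers whose *estimates* fall in the bottom 20 percent.
flagged = estimate <= np.quantile(estimate, 0.20)
# Step 2: compare the realized false-identification rate with your tolerance.
false_id_rate = np.mean(true[flagged] > cutoff)   # flagged, but truly above cutoff
print(f"false identification rate: {false_id_rate:.0%}")
```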

Introduction

A teacher's value-added score is intended to convey how much that teacher has contributed to student learning in a particular subject in a particular year. Different school districts define and compute value-added scores in different ways. But all of them share the idea that teachers who are particularly successful will help their students make large learning gains, that these gains can be measured by students' performance on achievement tests, and that the value-added score isolates the teacher's contribution to these gains.

A variety of people may see value-added estimates, and each group may use them for different purposes. Teachers themselves may want to compare their scores with those of others and use them to improve their work. Administrators may use them to make decisions about teaching assignments, professional development, pay, or promotion. Parents, if they see the scores, may use them to request particular teachers for their children. And, finally, researchers may use the estimates for studies on improving instruction.

Using value-added scores in any of these ways can be controversial. Some people doubt the validity of the achievement tests on which the scores are based, some question the emphasis on test scores to begin with, and others challenge the very idea that student learning gains reflect how well teachers do their jobs.

Our purpose is not to settle these controversies but rather to answer a more limited, essential question: How might educators reasonably interpret value-added scores? Social science has yet to come up with a perfect measure of teacher effectiveness, so anyone who makes decisions on the basis of value-added estimates will be doing so in the midst of uncertainty. Making choices in the face of doubt is hardly unusual – we routinely contend with weather forecasts, financial predictions, medical diagnoses, and election polls. But as in these other areas, in order to sensibly interpret value-added scores, it is important to do two things: understand the sources of uncertainty and quantify its extent. Our aim is to identify possible errors of interpretation, to consider how likely these errors are to arise, and to help educators assess how consequential they are for different decisions.

We'll begin by asking how value-added scores are defined and computed. Next, we'll consider two sources of error: statistical bias and statistical imprecision.

[readon2 url="http://www.carnegieknowledgenetwork.org/briefs/value-added/interpreting-value-added/"]Continue reading...[/readon2]

What I’ve learned so far

A guest post by Robert Barkley, Jr.

What I’ve learned so far – as of November 19, 2012

In February of 1958 I began student teaching in a small rural Pennsylvania town. Approximately one month into that experience my master teacher was drafted into the military. And since there were no other teachers in my field in that small district, I was simply asked to complete the school year as the regular teacher.

From that day on I have been immersed in public education at many levels, in several states – even in Canada and with some international contacts, as well as from many vantage points. So some 54 and a half years later, here’s what I have learned so far.

  1. There will be no significant change in education until and unless our society truly and deeply adopts a sense of community. And a sense of community is first and foremost based upon an acceptance that we all belong together – regardless of wealth, race, gender, etc.
  2. The views of amateurs, otherwise known as politicians and private sector moneyed interests, while they may be genuine and well intentioned, are, at best, less than helpful if unrestrained by the views of the professionals working at ground level. Put another way, the view from 30,000 feet may give a broad sense of how the system looks, but the view from street level gives a sense of how the system actually works. Neither is wrong, but both are inadequate by themselves.
  3. Moneyed interests such as test and textbook manufacturers and charter school enthusiasts will destroy general education, for they have little commitment to the general welfare and common good.
  4. No institution or organization will excel until and unless it adopts at all levels a shared sense of purpose – a central aim if you will, and agrees upon how progress toward that purpose will be measured over time. Education is no different.
  5. At the basic levels all education must begin with the recognition and nurturing of the natural curiosity and the current reality of each student.
  6. Teaching is a team sport. In other words, the structure and general practice in schools of teachers operating as independent sources of instruction is flawed. Anything that exacerbates this flawed structure, such as test score ratings of individual teachers and/or individual performance pay schemes, will be harmful and counterproductive.
  7. The separation of knowledge into separate disciplines may be convenient for organizing instruction, but it runs counter to how learning is constructed. Therefore, integrated curriculum strategies are essential if neuroscience is to be appreciated and taken into account.
  8. School employee unions can be useful or problematic to educational progress. Which they become is dependent upon their full inclusion in determining the structure and purpose of education. The more they are pushed to the sidelines, the more their focus will be narrow and self-serving.

Robert Barkley, Jr., is retired Executive Director of the Ohio Education Association, a thirty-five year veteran of NEA and NEA affiliate staff work. He is the author of Quality in Education: A Primer for Collaborative Visionary Educational Leaders, Leadership In Education: A Handbook for School Superintendents and Teacher Union Presidents, and Lessons for a New Reality: Guidance for Superintendent/Teacher Organization Collaboration. He may be reached at rbarkle@columbus.rr.com.