West Coast Stat Views (on Observational Epidemiology and more)

Monday, October 11, 2010

"Who would have thought that Erik Estrada would have the more dignified career?"

Just to start your Monday off on a sufficiently weird note.

Sunday, October 10, 2010

Education Reform

I want to very quickly return to first principles. When Mark and I began discussing tenure reform, it was in the context of a "crisis" in education. This terminology continues to this day.

However, the real impact of recent news is that proposed reforms don’t have the potential to make immediate and dramatic improvements in education outcomes. Why does this matter?

Because if there is an incipient crisis and known strategies can directly address it, then it would be grossly unethical not to try to address it in the fastest way possible. However, if there is not an immediate crisis, the correct way forward is one that addresses all of the stakeholders and not radical, top-down reform. In other words, Baltimore and not Washington, DC.

In the long run educational reform may be inevitable and positive. One of our well-versed commenters (Stuart Buck) opined about the evidence:

It's consistent with any number of stories, including increased quality of teaching, better curriculum, finding a better fit for each individual students (some do better in a smaller school, for example), and the factors that you mention.

In my view, this suggests that we are going to experiment with new modes of education. After all, many people whom I respect are strongly advocating for experimenting further with education reform (Jon Chait, Megan McArdle, Matt Yglesias, Alex Tabarrok come immediately to mind).

So why are there concerns about the process by which educational reform is occurring? Because the discussion began with a question of where to allocate resources. Seyward Darby was arguing that we needed to accept teacher layoffs as part of the price of educational reform:

The president's beef is with a provision to prevent teacher layoffs, which Democrats tacked onto the bill along with several other domestic priorities. To pay for the measure, the House agreed to cut money from some of the president's key education reform initiatives. Obama isn't happy about it. Nor should he be.

Now, if there is a real and immediate crisis in education then, of course, dramatic measures can make sense. But is this really the time to spark a round of teacher layoffs in order to make slow improvements in who decides to apply for teaching jobs? Maybe, but it seems naive to think that we should fuel the testing of educational reform with layoffs at this precise moment. Readers of Felix Salmon may remember this week's jobs report:

Meanwhile, as the school year begins, we have this:

Employment in local government decreased by 76,000 in September with job losses in both education and noneducation.

As states and municipalities around the nation start running out of money, they’re going to fire people; this is only the beginning. And if October is any indication, the job losses in the local government sector are going to be at least as big as the job gains in the private sector.

So the real issue is whether this is the time for radical teacher employment restructuring -- should we lay off teachers to test educational reform? We do have a duty to the future but we also have a duty to the current students as well. The conversation would be different if the net resources for education were increasing but claiming that education is a priority in the midst of layoffs due to lack of funding seems disingenuous.

My interest in this subject grew from two arguments in the blogosphere. One, that the crisis in education was so bad that the state should massively break contracts without cause. Notice that in cases like AIG and TARP, we were willing to spend a lot of money as a society to preserve financial contracts. Two, that reform was likely to be so important that teacher lay-offs in the midst of a recession were an acceptable sacrifice as the students would be better off.

If we don't accept that there is an immediate crisis then we can still move forward. But then it becomes an American-style bottom-up reform and not a Soviet-style top-down reform. I like the Baltimore example -- specific communities negotiating ways to respond to the crisis and continuing to try ways to create a better future for their children. A thousand experiments with engaged communities could very well result in a far better educational system in the long run.

And I think that is a good outcome.

Saturday, October 9, 2010

"The Secret of Doublets"

I've got a new (OK, newish) site up called "Education and Statistics." The idea is to have a place that's more focused on education and less scary to the general reader (I've noticed people tend to shy away when I tell them I blog at "Observational Epidemiology").

There will be lots of dual posts and reprints, but you'll also find quite a few E and S exclusives like this introduction to Lewis Carroll's addictive word game, doublets (a.k.a. word links, word ladders and word golf). Come by and see if you too can evolve APE into MAN.

Friday, October 8, 2010

I don't like to go to the well too often, but...

(I know this is two in a row, but how could I let this one go?)

When mortgage bankers engage in strategic default, the cost in PR damage, public backlash and potential regulation far exceeds the savings on mortgage payments. When a wealthy lawyer writes self-pitying articles about the difficulty of scraping by on three or four hundred K, he builds support for the very progressive tax policies he opposes. When Wall Street millionaires publicly and angrily insist they were entitled to every penny of support they were given, they make it much more likely that the next time they need assistance it will come at a cost to them.

These people are not stupid nor are they irrational. They do these things because of their worldview.

It is not just that we have a group of people who believe they are entitled to a special set of rules; it is that they have internalized this belief so completely that they no longer see it as a belief. The concept has become as intuitive and self-evident to them as Euclidean geometry. The thought that others might see things differently doesn't occur to them.

This can lead to some embarrassing spectacles, but it does make life easy for Daily Show writers.

The Daily Show With Jon Stewart: "Mortgage Bankers Association Strategic Default" (www.thedailyshow.com)

Red Half-state, Blue Half-state

The Daily Show With Jon Stewart: "Indecision 2010 - Divided Delaware" (www.thedailyshow.com)

Thursday, October 7, 2010

Apparently we've reached the goalpost-moving stage of the game

We all knew the case for incentive pay for teachers. The argument was simple and it made a lot of sense: teachers' unions and tenure limited the consequences of poor performance and bad behavior while the lack of bonuses limited the incentives for excellence. You could hardly blame the reformers for pushing it. Who could disagree with the statement that people respond to incentives?

Unfortunately, like so many appealing theories, it didn't do that well in the messiness of the real world. First researchers concluded that the data was too volatile and confounded to identify poor teachers, and now a major study by Vanderbilt and Rand has failed to show anything more than trivial results from incentive pay. Faced with these unpalatable facts, reform supporters have stayed true to their conviction that education (or at least education reform) should be run like a business and have done what so many project managers before them have done: they've moved the goalposts.

Eric A. Hanushek, from the Hoover Institution, assures us that he knew it all along:

"The biggest role of incentives has to do with selection of who enters and who stays in teaching - i.e., how incentives change the teaching corps through entrance and exits," Hanushek said. "I have always thought that the effort effects were small relative to the potential for getting different teachers. Their study has nothing to say about this more important issue."

Jonathan Chait echoed the Hoover line (bet you never thought you'd hear that one):

Of course, the point of performance pay isn't to wring better results out of the same teaching pool. It's to change the composition of the teaching pool. Teachers tend to come from the lower ranks of college graduates. That's natural, because the profession pays poorly compared with other jobs requiring college degree and does not offer financial rewards for success. The idea of merit pay is that you lure into the profession people who want to be treated like professionals -- they run the risk of being fired if they're incompetent, but they can also earn recognition and higher pay for exceptional performance.

It's true that there are important secondary selection benefits from a well-designed incentive system, but how credible is the claim that all the reformers were interested in from the beginning were the selection effects, that the Vanderbilt results were unimportant, even, according to Hanushek, expected?

Why did these reform supporters push ahead with high profile research that they believed would prove nothing and would make the movement look bad? This was not a cheap study. In addition to conventional funding, a private donor, presumably a reform movement supporter, put up 1.3 million dollars of his own money to test the hypothesis that incentive pay for teachers would improve student test performance. If "the point of performance pay [wasn't] to wring better results out of the same teaching pool," why waste over a million dollars to see how well performance pay did just that?

For a fraction of that money, you could have funded research that would have directly addressed the question of teacher self-selection by conducting a quick and cheap survey-based study that would look at the correlation between attitudes toward incentive pay and factors like GPA.

And even if we accept the I-meant-to-do-that response, the Vanderbilt study still presents advocates of incentive pay with a huge problem. The assumption behind their theory is that competent, hard-working people will go where competence and hard work are rewarded. Unfortunately, the study indicates that either the incentive metrics are largely out of teachers' control or teachers were generally doing what they could to maximize student performance before bonuses were on the table.

Keep in mind, we're talking about individual bonuses, not the kind you get for an organization meeting some goal. Do we have any evidence, even anecdotal, to show that more desirable employees are attracted to compensation plans with large individual incentive components even when the employees have been shown to have little if any influence over the value of those incentives?

If we were talking about a good, well-designed compensation scheme you could make a strong case for positive selection effects, but we're not even close. We are talking about incentive pay based on hopelessly confounded and volatile data. We are talking about incentive pay based on easily manipulated metrics. We are talking about incentive pay that does not incent.

You really need to read that last one out loud to get the full effect:

We are talking about incentive pay that does not incent.

Just so we're clear, the reform advocates are saying that we take money from things like salary and training and divert it to bonuses that are based on poor-quality data and have not been shown to provide incentive value. We should do this because this poorly-designed compensation scheme will attract a better class of applicant.

And on top of all that, they're asking us to believe this was their plan from the very beginning.

More thoughts on education reform

Dana Goldstein has a nice piece about how education reform can proceed without mass teacher firings. It seems that Baltimore has agreed to a new teacher contract (with the union) that will experiment with many of the reforms that are being proposed. In particular, I like the idea of defining "lead teacher" as being one per school. This nicely side-steps inter-school competition (where the incentive is to try and shift weak students to new schools) and puts the focus on doing one's best with the students they have. Plus, on an individual school basis it is likely that information will be more complete than would be the case when you focus entirely on standardized test scores.

Mark has noted the odd anti-union stance of even liberal education reform advocates. Curiously, I would hypothesize that if reforms are worthwhile and carefully thought out then it is possible to get teachers to agree to them (even if they are not entirely in the best interests of the teachers). After all, it's not a ridiculous idea that many teachers went into teaching in hopes of helping children to succeed (those focused entirely on financial rewards may well have chosen other lines of work). But perhaps this is a good example of positive reform. I may not like every element of it but it is at least a reasoned attempt to experiment with modern reforms.

Matthew Yglesias buries the lede... deep

Matthew Yglesias again steps up to defend the honor of charter schools with a post on Anders Böhlmark and Mikael Lindahl's paper “Does School Privatization Improve Educational Achievement? Evidence from Sweden’s Voucher Reform” (PDF) from which he concludes:

In effect, Swedish practice is like what exists in American states (Arizona, for example) with lots of charter schools and it’s quite similar to what the Obama administration (and I) are pushing. The big difference is that for-profit operators are allowed to run schools in Sweden, which I’d be for allowing.

There is, however, an asterisk next to the name of the paper. The footnote is easy to miss (you have to click on the 'More>>' button to find it), but it's worth the effort. It reads:

* Their answer? It does in the short-term, but the gains fade. All else being equal I favor more choice, so I’d regard the reform as a good thing but I assume the architects of the reform were hoping for something more.

Wednesday, October 6, 2010

"New 'venture accelerator' coming soon to Michigan"

From Marketplace:

When the pharmaceutical giant Pfizer left Ann Arbor, Mich., more than three years ago, it left behind the equivalent of a small town: A collection of 30 buildings, including science labs, a water plant and drug factory, scattered across nearly 200 acres.
The former research park is on land that had been owned by the University of Michigan, so the university decided to buy it back and turn the facility into a kind of business incubator on steroids.

Listen to the rest of the story here.

Perhaps this is the time for a counter-reformation

Just to review where we stand.

Charter schools

But for all their support and cultural cachet, the majority of the 5,000 or so charter schools nationwide appear to be no better, and in many cases worse, than local public schools when measured by achievement on standardized tests, according to experts citing years of research. Last year one of the most comprehensive studies, by researchers from Stanford University, found that fewer than one-fifth of charter schools nationally offered a better education than comparable local schools, almost half offered an equivalent education and more than a third, 37 percent, were “significantly worse.”

Although “charter schools have become a rallying cry for education reformers,” the report, by the Center for Research on Education Outcomes, warned, “this study reveals in unmistakable terms that, in the aggregate, charter students are not faring as well” as students in traditional schools.

(As I mentioned before, there is reason to believe that this research is biased in favor of charter schools.)

Test-based metrics

For a variety of reasons, analyses of VAM [Value Added Modeling] results have led researchers to doubt whether the methodology can accurately identify more and less effective teachers. VAM estimates have proven to be unstable across statistical models, years, and classes that teachers teach. One study found that across five large urban districts, among teachers who were ranked in the top 20% of effectiveness in the first year, fewer than a third were in that top group the next year, and another third moved all the way down to the bottom 40%. Another found that teachers’ effectiveness ratings in one year could only predict from 4% to 16% of the variation in such ratings in the following year. Thus, a teacher who appears to be very ineffective in one year might have a dramatically different result the following year. The same dramatic fluctuations were found for teachers ranked at the bottom in the first year of analysis. This runs counter to most people’s notions that the true quality of a teacher is likely to change very little over time and raises questions about whether what is measured is largely a “teacher effect” or the effect of a wide variety of other factors.

A study designed to test this question used VAM methods to assign effects to teachers after controlling for other factors, but applied the model backwards to see if credible results were obtained. Surprisingly, it found that students’ fifth grade teachers were good predictors of their fourth grade test scores. Inasmuch as a student’s later fifth grade teacher cannot possibly have influenced that student’s fourth grade performance, this curious result can only mean that VAM results are based on factors other than teachers’ actual effectiveness.

Firing under-performing teachers

If new laws or policies specifically require that teachers be fired if their students’ test scores do not rise by a certain amount, then more teachers might well be terminated than is now the case. But there is not strong evidence to indicate either that the departing teachers would actually be the weakest teachers, or that the departing teachers would be replaced by more effective ones.

Performance pay

The study was conducted by the National Center on Performance Incentives at Vanderbilt. The center, which takes no advocacy position on the issue, was created at the university's highly regarded Peabody College of Education and Human Development in 2006 with a $10 million federal research grant.

In a three-year experiment funded by the federal grant and aided by the Rand Corp., researchers tracked what happened in Nashville schools when math teachers in grades 5 through 8 were offered bonuses of $5,000, $10,000 and $15,000 for hitting annual test-score targets. About 300 teachers volunteered. Researchers randomly assigned half of the participants to a control group ineligible for the bonuses and the other half to an experimental group that could receive bonuses if their students reached certain benchmarks.

Researchers designed the bonuses to be large enough to function as a legitimate incentive for teachers whose average salary, according to a union official, is between $40,000 and $50,000. There were no additional variables in the experiment: no professional development, mentoring or other elements meant to affect test scores. The bonuses, totaling nearly $1.3 million, were funded by businessman Orrin Ingram, according to news reports. A university spokeswoman said Tuesday evening that she could not confirm those reports, and Ingram could not be reached for comment.

On the whole, researchers found no significant difference between the test results from classes led by teachers eligible for bonuses and those led by teachers who were ineligible. Bonuses appeared to have some positive effect in the fifth grade, researchers said, but they discounted that finding in part because the difference faded out when students moved to the sixth grade.

Just for the record, I believe that charter schools, increased use of metrics, merit pay and a streamlined process for dismissing bad teachers do have a place in education, but all of these things can do more harm than good if badly implemented and, given the current state of the reform movement, badly implemented is pretty much the upper bound.

Incentive Pay

In a recent opinion piece on performance pay by Robin Chait and Ulrich Boser, the following claim is made:

The problem with our nation's educational system is not that teachers don't care about students or money. Rather the issue is that too many educators don't have the support, tools or proper incentive structure to succeed. In fact, the teachers in the study told the researchers that the prospect of bonuses didn't change their behavior because they were already trying as hard as they could.

Indeed, previous, smaller-scale studies of performance pay for teachers have shown that the reforms do work, but only if teachers receive support and targeted incentives to improve their skills.

If training is the key, then why is performance pay so important? Why not put these large sums (up to $15,000, if I read correctly) into improved training programs?

Reforming the way teachers are paid signals to teachers that their performance matters -- that educators should be treated like other professionals.

I have two quick comments. One, saying that you should accept new terms of employment because it will make you more professional is empty language. It's like arguing that smoking makes you cool (because the other cool kids -- here professionals -- are doing it).

Two, professionals all get performance pay? Professional is a large category and there is a diversity of compensation schemes. But I would be very surprised to hear that medical doctors got performance pay (perhaps they do -- it might be worth looking at these schemes). What about your dentist in private practice? These professionals get their wages from "fee for service," which is a very different matter.

But the most important thing here is that, when the empirical results are negative, people are falling back on ideological assumptions (teachers should have a performance pay structure). That should frame the debate properly.

Hey, Rocky! Watch me pull an activist ruling out of this hat...

Friends and long-time readers of OE know I'm a sucker for the bizarre extended analogy and Barry Friedman and Dahlia Lithwick have pulled off a stunner. It's good enough to make me link to Slate.

Tuesday, October 5, 2010

Buzzwords! Buzzwords! Buzzwords!-- Coping with the Vanderbilt study

From the Washington Post:

The study was conducted by the National Center on Performance Incentives at Vanderbilt. The center, which takes no advocacy position on the issue, was created at the university's highly regarded Peabody College of Education and Human Development in 2006 with a $10 million federal research grant.

In a three-year experiment funded by the federal grant and aided by the Rand Corp., researchers tracked what happened in Nashville schools when math teachers in grades 5 through 8 were offered bonuses of $5,000, $10,000 and $15,000 for hitting annual test-score targets. About 300 teachers volunteered. Researchers randomly assigned half of the participants to a control group ineligible for the bonuses and the other half to an experimental group that could receive bonuses if their students reached certain benchmarks.

Researchers designed the bonuses to be large enough to function as a legitimate incentive for teachers whose average salary, according to a union official, is between $40,000 and $50,000. There were no additional variables in the experiment: no professional development, mentoring or other elements meant to affect test scores. The bonuses, totaling nearly $1.3 million, were funded by businessman Orrin Ingram, according to news reports. A university spokeswoman said Tuesday evening that she could not confirm those reports, and Ingram could not be reached for comment.

On the whole, researchers found no significant difference between the test results from classes led by teachers eligible for bonuses and those led by teachers who were ineligible. Bonuses appeared to have some positive effect in the fifth grade, researchers said, but they discounted that finding in part because the difference faded out when students moved to the sixth grade.

This prompted the following memorable bit of edu-speak from the administration:

"While this is a good study, it only looked at the narrow question of whether more pay motivates teachers to try harder," said Peter Cunningham, assistant U.S. education secretary for communications and outreach. "What we are trying to do is change the culture of teaching by giving all educators the feedback they need to get better while rewarding and incentivizing the best to teach in high-need schools, hard to staff subjects. This study doesn't address that objective."

Definitely a strong showing by Cunningham but he didn't manage to work in the word 'excellence.' That's going to cost him some points.

Brief quote from EPI

I have a feeling I'm going to be referring to this a lot:

For a variety of reasons, analyses of VAM [Value Added Modeling] results have led researchers to doubt whether the methodology can accurately identify more and less effective teachers. VAM estimates have proven to be unstable across statistical models, years, and classes that teachers teach. One study found that across five large urban districts, among teachers who were ranked in the top 20% of effectiveness in the first year, fewer than a third were in that top group the next year, and another third moved all the way down to the bottom 40%. Another found that teachers’ effectiveness ratings in one year could only predict from 4% to 16% of the variation in such ratings in the following year. Thus, a teacher who appears to be very ineffective in one year might have a dramatically different result the following year. The same dramatic fluctuations were found for teachers ranked at the bottom in the first year of analysis. This runs counter to most people’s notions that the true quality of a teacher is likely to change very little over time and raises questions about whether what is measured is largely a “teacher effect” or the effect of a wide variety of other factors.

A study designed to test this question used VAM methods to assign effects to teachers after controlling for other factors, but applied the model backwards to see if credible results were obtained. Surprisingly, it found that students’ fifth grade teachers were good predictors of their fourth grade test scores. Inasmuch as a student’s later fifth grade teacher cannot possibly have influenced that student’s fourth grade performance, this curious result can only mean that VAM results are based on factors other than teachers’ actual effectiveness.
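
As a rough illustration of what those numbers imply, here is a minimal simulation sketch. It is not the method used in any of the studies quoted above. It assumes that ratings in the two years are standard normal with a fixed correlation, so that the share of variance predicted (the 4% to 16% in the quote) is the square of that correlation. The function name and its parameters are made up for this example. It reports what share of one year's top 20% would land in the top 20% again the next year.

```python
import math
import random


def top_group_persistence(r_squared, n_teachers=20000, top_share=0.20, seed=0):
    """Share of year-1 top-group teachers who are also in the year-2 top group.

    Assumption (not from the quoted studies): ratings are standard normal and
    year 2 = rho * year 1 + independent noise, with rho = sqrt(r_squared).
    """
    rng = random.Random(seed)
    rho = math.sqrt(r_squared)
    noise_scale = math.sqrt(1.0 - r_squared)  # keeps year-2 variance at 1

    year1, year2 = [], []
    for _ in range(n_teachers):
        signal = rng.gauss(0, 1)
        noise = rng.gauss(0, 1)
        year1.append(signal)
        year2.append(rho * signal + noise_scale * noise)

    k = int(n_teachers * top_share)  # size of the top group
    order1 = sorted(range(n_teachers), key=year1.__getitem__, reverse=True)
    order2 = sorted(range(n_teachers), key=year2.__getitem__, reverse=True)
    top1, top2 = set(order1[:k]), set(order2[:k])
    return len(top1 & top2) / k


# The quoted range: one year's rating predicts 4% to 16% of next year's variation.
for r2 in (0.04, 0.16):
    share = top_group_persistence(r2)
    print(f"R^2 = {r2:.2f}: {share:.0%} of the top 20% stay in the top 20%")
```

Under these assumptions, ratings that share only 4% to 16% of their variance from one year to the next leave most of the first year's top group outside the top group the following year, which is the kind of churn the quoted passage describes.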

An experiment in blogging -- the conclusion

When assessing a statement, sometimes it's useful to rephrase it in a more general way and see how well it holds up. I tried that with a passage I found in a popular blog (one of the very few I read every day). Where the author had referred to members of a specific profession I substituted in the word 'employees' except when talking about unions ('employees unions' seemed redundant). I also changed a couple of words for consistency, but other than that the passage was exactly the same.

The resulting paragraph (seen below) was much more extreme than I had expected and it got me to thinking, how would people react to this passage if they encountered it without all the baggage? I decided to post the generalized version with a brief explanatory note then give people a couple of days to think about it before filling in the details.

Here's the generalized passage:

If you concede that employers need to be able to fire bad employees, then you can't fully defend the role of the unions. You can defend the concept of unions, and you can believe that some of the things unions do, like bargain for higher aggregate wages, help society. But most unions demonstrably make it very difficult to fire bad employees. That is currently a core function of unions, and something that must change. You're also going to need higher salaries to attract a better caliber employee into the workforce, and that's something unions could potentially help. But being "treated like professionals" has to mean both the opportunity to earn a good living if you do well and the potential to be fired if you fail.

And here is the passage Jonathan Chait (that's right, Jonathan Chait) originally posted in his blog:

If you concede that principals need to be able to fire bad teachers, then you can't fully defend the role of the unions. You can defend the concept of unions, and you can believe that some of the things unions do, like bargain for higher aggregate wages, help education. But most teachers unions demonstrably make it very difficult to fire bad teachers. That is currently a core function of teachers unions, and something that must change. You're also going to need higher salaries to attract a better caliber teacher into the profession, and that's something unions could potentially help. But being "treated like professionals" has to mean both the opportunity to earn a good living if you do well and the potential to be fired if you fail.

There are obviously two possible responses Chait could make here (three if you count ignoring it entirely). He could say he agrees with the general statement or he could argue that teachers are a special case and should be granted less union protection than, say, policemen.*

Ironically, the more defensible position Chait can take here is the extreme one, namely that unions should not do anything to discourage employers from firing their members. It's not a position that most readers of the New Republic would embrace but, as a statement of personal belief, it is extraordinarily difficult to rebut.

If he tries to explain why teachers constitute a special case, he will have to deal with the data and in this particular debate, the numbers are not his friends. (It's worth remembering that Diane Ravitch started out on Chait's side. Her road to Damascus came when she realized she could no longer reconcile those views with what she was seeing in the research findings.)

Jonathan Chait can be a formidable debater but he has shown himself to be largely ignorant of the research behind these issues (no one at TNR even knew enough about PISA to catch the bait and switch in the intro to Waiting for Superman and in the education debate that's about as slow as the pitches get).

He'll be trying to punch holes in the findings of institutions like EPI and Rand and big guns in the field like Donald Rubin. He'll have to show precipitous educational decline without resorting to the aforementioned PISA (good test but absolutely meaningless in this context). He'll have to explain why schools that use his policies are more likely to underperform than to outperform unionized schools. He'll have to justify firing people based on metrics so volatile that a third of teachers in the top 20% could find themselves in firing range the next year, metrics based on data so confounded that "students’ fifth grade teachers were good predictors of their fourth grade test scores."

This is one time the smart money is on the other guys.

* Yes, we fire policemen. What we don't do is fire policemen based on unreliable metrics that are largely outside of the officers' control and are easily manipulated by their superiors.