West Coast Stat Views (on Observational Epidemiology and more)

Tuesday, September 28, 2010

Encouraging/Discouraging words from President Obama

(Time for another trip into N-Space -- test/retention axis)

From the President's NBC interview:

What is your message to the leadership of unions and to teacher union members?
"We want to work with you; we're not interested in imposing changes on you...you can't defend the status quo in which a third of our kids are dropping out...when you've got 2,000 schools across the country that are drop-out factories, in those schools you have to have radical change. ... The vast majority of teachers want to do a good job, they're not in it for the money. ... Ultimately if some teachers are not doing a good job they've got to go."

This is encouraging because, unlike the test-scores scare, the drop-out crisis hasn't received nearly enough attention. After gangs and school violence, it is the problem that worries me the most.

This is discouraging because many of the schools being held up as models by the reformers have built much of their reputation by systematically excluding and/or chasing away the very students who drop out of traditional schools.

The wonders of shifting rationales

(What a great week to be an education blogger)

The old case for sweeping reforms: The changes we're suggesting are drastic and costly with bad track records and the potential to severely damage the system they are meant to save, but the current state of education is so bad that the future of our country depends on implementing radical reforms.

The new case for sweeping reform: it turns out the problem is not that big or wide-reaching but it's good that people mistakenly thought it was so bad because that encourages us to make all these changes.

(h/t Matthew Yglesias)

Monday, September 27, 2010

Does Seyward Darby think Seyward Darby should be fired?

I don't. I have always believed that firing is a last resort, but Darby has certainly gone on the record as being for firing incompetent performers even when the metrics for measuring competence are unreliable and the firings would cause severe damage to the economy. Given the quality of her reporting on education, it's difficult to believe she's not currently lobbying TNR to get herself fired for things like this:

But the real star of the show was Waiting for Superman, the much-hyped documentary about school reform that opens nationwide this week. Gregory started the program with a clip from the movie that shows how poorly we rank, education-wise, against other developed countries.

Does the clip really show how poorly we're doing? Let's roll the tape:

Since the 1970s, U.S. schools have failed to keep pace with the rest of the world. Among 30 developed countries, we ranked 25th in math and 21st in science. The top 5 percent of our students, our very best, ranked 23rd out of 29 developed countries. In almost every category, we've fallen behind.

If you've been following OE, you recognize these oft-quoted numbers as coming from the PISA test, which can't possibly support the first sentence since PISA was first administered in 2000.

It would seem that Seyward Darby doesn't know that.

She also doesn't seem to know that the older, better-established TIMSS test has us doing fairly well internationally. Nor is she apparently aware that using PISA to argue for the standard slate of reforms is problematic since at least one of the highest scoring countries (Canada) has adopted pretty much the opposite approach.

We can let David Gregory off with a warning -- this isn't his beat -- but Darby is the education specialist for one of America's best and most respected publications. There's no way for the New Republic to justify keeping an education reporter who can't spot obvious distortions involving one of the two best known measures of international academic performance.

I would suggest immediate reassignment. If Darby would like to argue for her own dismissal, I'd be happy to debate the issue.

[update: you can find some more thoughts on TNR's education reporting here.]

Perceived vs. Actual Income Distribution

James Fallows points out an interesting study (h/t to Jonathan Chait):

The context is the previous discussion, here and here, about the capacity for feeling short-changed and ill-treated, even among some of the most materially-fortunate people ever to live on Earth. No doubt it's a primal human trait, but for various reasons (as explained here) the ever-polarizing distribution of wealth and income in America has allowed more people to feel bad about their own situation by looking at the handful who are stratospherically better off.

To some extent this is an "information" problem: people don't know where they really stand. A creative way to demonstrate that is with a forthcoming paper by Michael Norton of Harvard Business School and Daniel Ariely of Duke, which compares: (a) how wealth actually is distributed in America; (b) how people think it's distributed; and (c) how they think it should be distributed. The paper is available in PDF here.

The chart below conveys the central point: people think the distribution of wealth is more equal than it actually is; and they think it should be much more equal than their already unrealistically-equal notion of its current state. Eg: the top 20% of the US wealth distribution actually controls nearly 85% of total wealth; people think the top 20% controls under 60%; and they think it should control just over 30%

Similarly: people feel that the bottom 20% of the economic pyramid "should" have about 10% of the total pie; they think it actually has about 3% or 4%; in fact, its share appears to be too small to show up on the chart.

Today's must read on education

Nicholas Lemann writing for the New Yorker:

There have been attempts in the past to make the system more rational and less redundant, and to shrink the portion of it that undertakes scholarly research, but they have not met with much success, and not just because of bureaucratic resistance by the interested parties. Large-scale, decentralized democratic societies are not very adept at generating neat, rational solutions to messy situations. The story line on education, at this ill-tempered moment in American life, expresses what might be called the Noah’s Ark view of life: a vast territory looks so impossibly corrupted that it must be washed away, so that we can begin its activities anew, on finer, higher, firmer principles. One should treat any perception that something so large is so completely awry with suspicion, and consider that it might not be true—especially before acting on it.
We have a lot of recent experience with breaking apart large, old, unlovely systems in the confidence of gaining great benefits at low cost. We deregulated the banking system. We tried to remake Iraq. In education, we would do well to appreciate what our country has built, and to try to fix what is undeniably wrong without declaring the entire system to be broken. We have a moral obligation to be precise about what the problems in American education are—like subpar schools for poor and minority children—and to resist heroic ideas about what would solve them, if those ideas don’t demonstrably do that.

Obama's education interview


I'll have to think about this one for a little while.

Propensity Score Matching

The latest from Peter Austin (University of Toronto):

Propensity-score matching is increasingly being used to estimate the effects of treatments using observational data. In many-to-one (M:1) matching on the propensity score, M untreated subjects are matched to each treated subject using the propensity score. The authors used Monte Carlo simulations to examine the effect of the choice of M on the statistical performance of matched estimators. They considered matching 1–5 untreated subjects to each treated subject using both nearest-neighbor matching and caliper matching in 96 different scenarios. Increasing the number of untreated subjects matched to each treated subject tended to increase the bias in the estimated treatment effect; conversely, increasing the number of untreated subjects matched to each treated subject decreased the sampling variability of the estimated treatment effect. Using nearest-neighbor matching, the mean squared error of the estimated treatment effect was minimized in 67.7% of the scenarios when 1:1 matching was used. Using nearest-neighbor matching or caliper matching, the mean squared error was minimized in approximately 84% of the scenarios when, at most, 2 untreated subjects were matched to each treated subject. The authors recommend that, in most settings, researchers match either 1 or 2 untreated subjects to each treated subject when using propensity-score matching.

This result is quite interesting. It's intuitive if you think about it for a bit (the closest matches will be the best possible controls), but it departs sharply from the conventional wisdom for case-control studies (always use between 4 and 20 controls per case, if possible, so that the width of the confidence intervals depends mainly on the number of cases).

I think that there are two things that need to be considered. First, Peter Austin works with ICES, which uses prescription claims from the province of Ontario. So the types of study that he works with are typically large (and even his small samples were 500 cases). So variance is low anyway, and a focus on bias makes perfect sense.

Second, complex propensity scores (based on many variables) are rarely the same for any two participants whereas the matching in case control studies is often on factors (age, sex) that can be perfectly matched.
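To make the bias side of that trade-off concrete, here is a minimal sketch of M:1 nearest-neighbor matching. It is not the simulation code from the paper: the function names, the toy propensity scores, the greedy matching order, and the caliper applied to the raw score are all assumptions made for illustration. The only point is that each additional control per treated subject has to come from further away on the propensity score.

```python
from typing import Dict, List, Optional


def match_m_to_1(
    treated: Dict[str, float],
    untreated: Dict[str, float],
    m: int = 1,
    caliper: Optional[float] = None,
) -> Dict[str, List[str]]:
    """Greedy M:1 nearest-neighbor matching without replacement.

    `treated` and `untreated` map subject ids to propensity scores.
    The matching order (one pass per match slot) is an illustrative
    assumption, not the algorithm used in the paper.
    """
    available = dict(untreated)  # controls not yet used
    matches: Dict[str, List[str]] = {t_id: [] for t_id in treated}
    # Every treated subject gets a first match before anyone gets a second.
    for _ in range(m):
        for t_id, t_score in treated.items():
            if not available:
                break
            best = min(available, key=lambda u: abs(available[u] - t_score))
            # Optional caliper: reject matches that are too far away.
            if caliper is not None and abs(available[best] - t_score) > caliper:
                continue
            matches[t_id].append(best)
            del available[best]
    return matches


def mean_match_distance(
    treated: Dict[str, float],
    untreated: Dict[str, float],
    matches: Dict[str, List[str]],
) -> float:
    """Average absolute propensity-score gap across all matched pairs."""
    gaps = [
        abs(treated[t_id] - untreated[u_id])
        for t_id, u_ids in matches.items()
        for u_id in u_ids
    ]
    return sum(gaps) / len(gaps) if gaps else 0.0


# Hypothetical propensity scores, invented for this example.
TREATED = {"t1": 0.30, "t2": 0.60}
UNTREATED = {"u1": 0.31, "u2": 0.35, "u3": 0.58, "u4": 0.70, "u5": 0.10}

for m in (1, 2):
    matched = match_m_to_1(TREATED, UNTREATED, m=m)
    gap = mean_match_distance(TREATED, UNTREATED, matched)
    print(f"M={m}: matches={matched}, mean score gap={gap:.3f}")
```

With these toy numbers the mean gap between matched scores grows when M goes from 1 to 2, which is the mechanism behind the extra bias. The caliper argument shows the other option: drop the poor second match and accept a smaller matched sample.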

So it is a useful and interesting result. What I really want to know, having never managed to get AJE to accept a paper from me at all, is how he managed this feat:

Received April 21, 2010
Accepted June 18, 2010

Impressive!

This American Life on HCZ's Baby College

Joseph and I have spilled a lot of pixels recently trying to debunk some of the bad statistics coming out of the educational reform movement. We have, perhaps, gotten so caught up in that task that we have neglected some of the bright spots in the movement.

This is one of those bright spots and it's definitely worth paying for the download:

Act One. Harlem Renaissance.
Paul Tough reports on the Harlem Children’s Zone, and its CEO and president, Geoffrey Canada. Among the project’s many facets is Baby College, an 8-week program where young parents and parents-to-be learn how to help their children get the education they need to be successful. Tough’s just-published book about Geoffrey Canada and the Harlem's Children Zone is called Whatever It Takes. You can see a slideshow of more photographs from the project here. (30 and 1⁄2 minutes)

p.s. I'm still not entirely comfortable with some of the research around HCZ, but, as I said earlier, it's "an impressive, even inspiring initiative to improve the lives of poor inner-city children through charter schools and community programs."

Sunday, September 26, 2010

"Crybabies"

With all due respect to Professors DeLong, Krugman and company, probably the best piece of reporting you'll find on the recent wave of self-pity among the wealthy is this revealing and appalling segment from the This American Life episode, "Crybabies":

Act One. Wall Street: Money Never Weeps.
Ira with Planet Money economics correspondent Adam Davidson on why—even after everything President Obama has done to save Wall Street, actions which have led to record profits and bonuses—Wall Street seems ungrateful. Adam and producer Jane Feltes head out to a Wall Street bar where they're told by three finance guys that there's no reason to thank the President for saving their jobs. Planet Money is a co-production of This American Life and NPR News. (14 minutes)

The episode is available for free download this week (though you might feel a little better about yourself if you donate a buck or two).

Principal Agents

I was talking to Mark about his post on retention policies in large corporations. One item came up that I think is quite interesting. Both schools and publicly traded companies have a serious principal agent problem. In the case of the schools, the principals and school board act on behalf of the taxpayer. In the case of the publicly traded company, the CEO and board act on behalf of shareholders.

One thing that we see in large companies is that it is impossible to completely eradicate the conflicts of interest that this arrangement poses. Management will try to optimize their outcomes (see CEO pay) even when it might not be in the best interest of the shareholders (see Mark's post).

In the same sense, it would be naive to assume that you won't have some of these principal agent problems happening in any publicly funded educational system. I suspect that this is the price we have to pay for having a universal education system. Shifting our education system to a more corporate model isn't going to remove these issues -- it is only going to change who the winners and losers in the system are.

Should we tolerate these issues? Why not private schools (with a much more direct link between the education producer and consumer)? I think the real reason is that a broadly educated population is a public good and that there are always going to be inefficiencies in providing public goods. We all know of cases where road construction was less than ideal (in terms of contractors extracting extra value). But that doesn't mean we should either abandon roads entirely or go to subscription-based roads.

The trick here seems, to me, to be to develop an education system that provides high quality outcomes. Mark keeps asking why the Canadian model isn't more widely studied given that they have issues with a multi-cultural population, geographical distance, and English-as-a-second-language students. I think the conversation would benefit from seeing more about how they handle this principal agent issue.

"Ignore the parts about crystal meth and pancakes"

The education reform movement relies heavily on anecdotes of remarkable, odds-beating schools, but when you take a close look at those schools that had significantly superior performance, some if not most of the difference in scores could be explained by selection and peer effects.

This isn't to say these schools weren't benefiting their students. Regardless of the reason, these kids were better off. Nor is this to say that these schools weren't doing something right. I can tell you that many are well-run and highly innovative.

But even taking all of that into account, selection and peer effects are huge and can swamp almost any other factor you can think of. These effects are seldom if ever adequately accounted for (and before anyone says the word 'lottery,' please take a look at this). This makes it all but impossible to accurately measure the impact of these schools, but people like Jonathan Chait continue to cite them without any caveats.

I came across a segment of This American Life that beautifully captured my feelings on the subject. Just play the clip below and every time you hear 'heroin,' substitute in 'selection and peer effects' (you can just ignore the parts about crystal meth and pancakes).

From Kumail Nanjiani:

So remember, selection and peer effects are doing the heavy lifting.

Saturday, September 25, 2010

The trouble with clever theories

This issue is one of the more serious ones in modern academic research. Frances Woolley of Worthwhile Canadian Initiative details the case of Hepatitis B and missing women. The theory (that the imbalanced sex ratio seen in India and China could be explained by rates of viral infection) advanced by Emily Oster was both compelling and incorrect:

Someone arguing in Levitt's defence might say "well, no one could have known that Oster's hypothesis would turn out to be wrong." Could they? In 2005, the year that Oster's paper appeared in the JPE, Monica Das Gupta published a rebuttal in the Population and Development Review. She describes the results of a 1993 paper by Zeng et al, one cited by Oster:

...the sex ratio at birth varies sharply by the sex composition of the living children the woman already has.... Zeng et al. show that the sex ratio at birth was normal (1.056) for first births. For second births, it was strikingly different depending on whether the first child was male or female: women whose first child was a son had a low sex ratio (1.014) for the second child, while those whose first child was a daughter had a very high sex ratio (1.494) for the second child.

To produce a pattern like that, Hep B has to be one heck of a smart virus. So the first point is: anyone with even a passing familiarity with the literature would know there was something suspicious about the Oster results.

This is actually a really good point and one that deserves more thought. The hypothesis being put forward had very little chance of being true given the actual literature cited and yet it was widely accepted as an important theory (being given wide publicity). Why is this?

I think that modern academics love the counter-intuitive theory that turns conventional thinking on its head. These are compelling stories because they seem to show how careful observation and being clever can reveal important secrets. But the very fact that these theories rely on clever stories and unexpected twists makes them more likely (and not less likely) to be incorrect.

In a sense, this feature is what I dislike about instrumental variables. One needs to tell a story about why an instrument actually has the correct statistical properties. But this relies on strong and unverifiable assumptions that cannot be directly tested. So one ends up telling an interesting story . . . but it is one that could well be wrong.

Dr Oster is a very good scientist and I don't want to generalize to the rest of her work. But it is a trap we should all look out for!

One more thought on seeing McKinsey's “Closing the Talent Gap”

(And then I really have to be going)

I'm not sure they want to go with TIMSS here.

From the McKinsey report:

While Singapore does not participate in PISA, it ranked in the top three on math and science on the quadrennial Trends in International Mathematics and Science Studies assessments in 2007, after having come in first place in 1995, 1999 and 2003.

from the National Center for Education Statistics:

In 2007, the average mathematics scores of both U.S. fourth-graders (529) and eighth-graders (508) were higher than the TIMSS scale average (500 at both grades). The average U.S. fourth-grade mathematics score was higher than those of students in 23 of the 35 other countries, lower than those in 8 countries (all located in Asia or Europe), and not measurably different from those in the remaining 4 countries. At eighth grade, the average U.S. mathematics score was higher than those of students in 37 of the 47 other countries, lower than those in 5 countries (all of them located in Asia), and not measurably different from those in the other 5 countries.

My first thought on seeing McKinsey's “Closing the Talent Gap”

Where's Canada?

Here's the PDF of the report via this post from Matthew Yglesias. It just crossed my desktop and I have to be on the road in about five minutes. I've just had time to skim the report so I may be missing the obvious but the absence of our northern neighbor strikes me as strange, particularly given the report's use of PISA data.

Forget teachers-- hell, forget employees, what does it take to fire a CEO?

One of the fundamental tenets of the modern educational reform movement is faith in the private sector. In the last post, I discussed the contradictions in using that faith to justify attrition policies that are pretty much unheard of in the corporate world.

There's a second potential danger in looking to the private sector for answers. Companies are not very transparent. Most go to great lengths to hide incompetence and depict every effort as a success. There's nothing illegal or even unethical about this. If anything, the people who run a company have an obligation to present it in the best possible light.

Though you can't blame businesses for spinning their results, you can get into a great deal of trouble by imitating them. For example, a school system might adopt an innovative system of project management and never know that it was responsible for hundreds of millions in cost overruns.

Occasionally, however, you will run into a corporate screw-up so massive that no degree of opacity, no amount of spin can obscure it. When you encounter one of these, you should take a moment to remind yourself that the snafus that break the surface represent a minute share of the general population.

Which brings us to Jeff Zucker.

Zucker was brought in as president of NBC Entertainment in 2000 after a stint at the Today Show where his most notable accomplishments were moving the studio and introducing the Today Show's outdoor rock concert series.*

His tenure on the Today Show represented one of Zucker's two specialities: making tiny tweaks to a hit then claiming credit for its success. The other speciality was screwing up on an almost biblical scale. Under Zucker, NBC was the first network to ever go from first to fourth place and he came very close to destroying their lucrative late night slate. According to an executive for another network (quoted by Maureen Dowd), "Zucker is a case study in the most destructive media executive ever to exist... You’d have to tell me who else has taken a once-great network and literally destroyed it."

Zucker was grossly incompetent. The cost to shareholders is difficult to estimate but it's probably in the hundreds of millions (possibly billions**). His poor performance was widely discussed in the industry.

And yet it took a change of ownership to force him out and he still gets terms like these:

Zucker's contract had been renewed last year to run through January 2013 with an annual salary of $6.3 million and a guaranteed annual bonus*** of $1.5 million. If he leaves by January, he can expect at least a $15.6 million check.

The moral of this story is: next time people tell you that schools should be run like a business, make sure to ask them which business they have in mind.

* Apparently the Today Show has an outdoor rock concert series.

** Here are some numbers from Wikipedia to put things in context:

On December 1, 2009, CNBC reported that a tentative agreement had been reached between Comcast and GE.^[26] The deal was formally announced on December 3, 2009.^[7] Under the agreement, NBC Universal would be 51% owned by Comcast and 49% by GE. Comcast is to pay $6.5 billion cash to GE. Comcast will also contribute $7.5 billion in programming including regional sports networks and cable channels such as Golf Channel and E! Entertainment Television. GE plans to use some of the funds, $5.8 billion, to buy out Vivendi's 20% minority stake in NBC Universal.^[7] After the transaction completes, Comcast will reserve the right to buy out GE's share at certain times. GE will also reserve the right to force the sale of their stake within the first seven years. The deal is subject to regulatory approval.^[7]

Vivendi will sell 7.66% of NBC Universal to GE for US$2 billion if the GE/Comcast deal is not completed by September 2010 and then sell the remaining 12.34% stake of NBC Universal to GE for US$3.8 billion when the deal is completed or to the public via an IPO if the deal is not completed.^[27]^[28]

*** I just love the idea of a "guaranteed annual bonus."