West Coast Stat Views (on Observational Epidemiology and more)

Thursday, September 30, 2010

"Phase one: change incentive system"

Prepare for a rare intra-blog dispute. Joseph found something to like in Megan McArdle's reply to Felix Salmon's post on airports. Here's the relevant section from the Atlantic:

Still, I think there's quite a lot about American airports that is important, and inadequate. Given the ubiquity of electronic devices, and the importance of airports to business travelers, we could probably enhance national productivity quite a bit if so many airports didn't force travelers to spend their wait times fighting each other for the one electrical socket located behind an out-of-order ATM machine. The ridiculous security theater procedures which have queues stretching out towards the long-term parking lot could be streamlined. And whatever engineer designs monstrosities like Heathrow's 40-minute walk-time from security line to gate should be tracked down and . . . um . . . reeducated, or something.

We might also give serious thought to whether something can be done about the incentives system--and local authorities--who fix things so that the only important customers of airports are the airlines. In many places, a combination of zoning, and the local authorities who often run the airports, means that there's no meaningful competition. The result is that they don't have to do anything to please passengers, and boy, they sure don't. If Felix's point is that improving the airports is probably not going to be a matter of huge government expenditures--or that this is not the best use of said expenditures--I'm pretty sure I agree. But we might think about regulatory changes that would give them reasons to improve.

For starters, I can't award any points here for criticizing the TSA. Everybody hates the TSA. Joseph does. I do. Felix Salmon does. Pretty much anybody who's been in an airplane in the past nine years does. You can't give a writer credit for voicing a universal opinion.

What's left?

A couple of complaints about airport amenities and some wonderfully McArdlesque notions about how market forces work.

It's certainly true that airports could provide a more pleasant experience for passengers but, as Felix Salmon points out, "[T]he airlines are the customers, and the passengers are the goods being transported." This isn't a case of misaligned incentives; this is the business model.

As far as I can tell, the Atlantic post doesn't propose an alternate business plan or a slate of restrictive airport regulations (which are far more difficult to justify than additional regulations for flying). Instead you get a classic McArdle action plan:

Step One -- Relax zoning and other restrictions.

Step Two --

Step Three -- Traveler's paradise.

Just how far apart are steps one and three?

Even with the laxest possible restrictions, there are practical limits on where you can put airports. For one thing it's kind of important to space them out. Would a change in zoning rules add enough airports to create meaningful competition? Almost certainly not.

And even if it did, that would do nothing, absolutely nothing, to reform the more absurd TSA policies. An improvement in airports that doesn't address the security procedures is a damned small improvement.

What's worse, there's reason to believe that even those small improvements wouldn't come to pass. Under the current business model, the only leverage passengers have is the option of voting with their feet, avoiding an airport in such large numbers that the airlines pressure that airport to improve.

Unfortunately, this runs into the same ugly business reality that airlines have encountered when trying to attract customers with all-class amenities. When you go online to book a flight you are normally presented with a handful of choices that vary widely in price* and convenience. The variation in these factors tends to swamp everything else. Factors like a more comfortable plane or better customer service only come into play in the case of a very close tie. This is one reason why most airlines moved away from pushing amenities and focused on loyalty rewards programs.**

Airlines are far more competitive than airports will ever be and most people value a comfortable flight much more than a comfortable airport. If market forces haven't made flights pleasant, they aren't going to do any better with the places we wait for them.

* As Dave Barry explains it:

Q. Airline fares are very confusing. How, exactly, does the airline determine the price of my ticket?
A. Many cost factors are involved in flying an airplane from Point A to Point B, including distance, passenger load, whether each pilot will get his own pilot hat or they're going to share, and whether Point B has a runway.

Q. So the airlines use these cost factors to calculate a rational price for my ticket?
A. No. That is determined by Rudy the Fare Chicken, who decides the price of each ticket individually by pecking on a computer keyboard sprinkled with corn. If an airline agent tells you that they're having "computer problems," this means that Rudy is sick, and technicians are trying to activate the backup system, Conrad the Fare Hamster.

** Because most people tend to go to the closest airport, an airport loyalty program would spend most of its budget on customers it would have gotten anyway. Bad idea.

Airports

It is not every day that I agree with Megan McArdle over Felix Salmon. But this is an extremely good point:

Still, I think there's quite a lot about American airports that is important, and inadequate. Given the ubiquity of electronic devices, and the importance of airports to business travelers, we could probably enhance national productivity quite a bit if so many airports didn't force travelers to spend their wait times fighting each other for the one electrical socket located behind an out-of-order ATM machine. The ridiculous security theater procedures which have queues stretching out towards the long-term parking lot could be streamlined.

To be blunt, the modern American airport seems designed to make a basically unpleasant activity (flying around in a packed airplane) as unpleasant as possible. Your humble narrator has been doing a lot of travel lately and I remember airports as being more pleasant once. For example, when you did not have to go through a lengthy screening process, you did not have to arrive as early at the airport. As a result, the airport had fewer bored people sitting around competing for limited seating and eating facilities.

So I would also be in favor of finding ways to make it easier for airports to make flying a pleasant experience.

But I can definitely see the GOP as Lucy

From the Boston Globe by way of Mippyville, here's some mildly warped satire to end your day. Not as funny as it might have been but lovingly executed.

(depending on your browser, you might need to click on the image to see the fourth panel.)

Wednesday, September 29, 2010

The heroin's still doing the heavy lifting -- why Ivy League legacies work

From Christopher Shea's Boston Globe column:

Richard D. Kahlenberg, editor of the forthcoming book "Affirmative Action for the Rich: Legacy Preferences in College Admissions," points out that universities in other countries don't give so-called legacy preferences to sons and daughters of their alumni. (Even Oxbridge colleges don't, despite the class-bound history of British education.) So, he asks, why on earth do we do it in America?

Broadly speaking, students go to college in search of four things: certification; instruction; reputation; and connections.

In terms of certification, any well-accredited school would do. In terms of undergraduate instruction, the best deal for the money (and perhaps the best deal period) is the small four-year school. (I'm leaving this as an assertion but I'm fairly confident I can argue the point if anyone wants to debate.)

In the next two categories, however, the Ivy League cannot be surpassed, in part because of the legacy system.

Without loss of generality, look at Harvard. The student population of the school consists entirely of two overlapping groups: people who can get into Harvard; people whose parents can get them into Harvard.

The first group is hard-working, ambitious and academically gifted. Assuming the number of need-based legacies is trivial, the second group comes from families that are wealthy (they're paying for a Harvard education) and well-connected (at least one parent went to Harvard).

Putting aside luck, you can put the drivers of success into three general categories: attitude, drive and work habits; talent, intelligence and creativity; reputation and connections. It is possible to succeed with just one of these (hell, I can think of people who made it with none), but there is a strong synergistic effect. A moderate talent who works hard and has connections will generally go farther than a spectacular talent who's lazy and isolated.

Connections are governed by the laws of graph theory. I'm not going to delve too deeply into the subject (since that would require research and possibly actual work on my part), but as anyone who has read even the cover blurbs on Linked or Small Worlds can tell you, adding a few highly connected nodes (let's call them senator's sons) can greatly increase the connectivity of a system.
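For anyone who wants to poke at the hub claim without reading the books, here is a minimal sketch. Everything in it is an assumption made for illustration: the ring-shaped "everyone knows two neighbors" network, the 100 nodes, and the choice to wire node 0 to every tenth node are all hypothetical, and average shortest path length is just one possible measure of connectivity.

```python
from collections import deque


def ring_graph(n):
    """Each of n nodes knows only its two neighbors on a ring."""
    return {i: {(i - 1) % n, (i + 1) % n} for i in range(n)}


def add_hub(graph, hub, step):
    """Return a copy of graph in which `hub` also knows every `step`-th node."""
    new = {node: set(nbrs) for node, nbrs in graph.items()}
    for node in range(0, len(new), step):
        if node != hub:
            new[hub].add(node)
            new[node].add(hub)
    return new


def average_path_length(graph):
    """Mean number of hops between pairs of nodes, via breadth-first search."""
    total = 0
    pairs = 0
    for source in graph:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for nbr in graph[node]:
                if nbr not in dist:
                    dist[nbr] = dist[node] + 1
                    queue.append(nbr)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs


if __name__ == "__main__":
    plain = ring_graph(100)
    with_hub = add_hub(plain, hub=0, step=10)  # one hypothetical "senator's son"
    print("no hub:  ", round(average_path_length(plain), 2))
    print("one hub: ", round(average_path_length(with_hub), 2))
```

Because the hub only adds edges to an existing node, no pair of nodes can end up farther apart, so the average path length can only shrink. How much it shrinks in a real social network is an empirical question this toy does not answer.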

It would be interesting to model the trade off between picking a well connected legacy over a smarter, harder-working applicant given the objective of producing the greatest aggregate success. Because of the network properties mentioned above, it wouldn't be surprising if the optimal number of legacies turned out to be the 10% to 15% we generally see.

Optimized or not, this mixture is almost guaranteed to churn out fantastically successful graduates regardless of what the schools do after the students are admitted. I'm certain the quality of instruction at the Ivy League schools is very good, but, like most education success stories, the secret here is mostly selection and peer effects.

Update: For a different interpretation (this time with actual data), check out this post at Gene Expression.

Updated update: Why doesn't spell check work in the title field?

Ecological Fallacy

The recent debate on education has been pretty broad and I wonder if a key point has been overlooked. In the cross country comparisons, Mark has been arguing that "even if we grant that cross country comparisons are useful, they don't necessarily say what you think they say".

This has led to overlooking the key problem in these comparisons: it is nearly impossible to relate the performance of a specific country to that of individual students because the make-up of the student populations differs. The real question we want to ask is: if we adopted the educational system of country X, would our students do better? But the fact that the students in country X do better than Americans is no proof -- maybe they are handicapped by an inferior educational system and would do even better in an American-style educational environment.

This problem is a key limitation in epidemiological studies of all kinds. If we observe that the Japanese have fewer myocardial infarcts then we still don't know why. There are many exposures that could explain this finding. It is easy to go wrong. For example, in 2002-2003, 47% of Japanese men and 20% of American men smoked.

Yet in 2002, the age-standardized death rate for American men from CHD was 216 per 100,000 whereas it was 54 per 100,000 in Japan. (Now, this association is cross-sectional, but most cross-country educational debates are as well.) Does this mean we should promote smoking to reduce heart disease?

If the answer is "no" then we should apply similar caution to other international comparisons.
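To see how the country-level pattern can coexist with smoking being harmful for individuals, here is a small arithmetic sketch. The smoking prevalences and CHD death rates are the figures quoted above; the relative risk of 2.0 for smokers is purely an assumption chosen for illustration (not an estimate from any study), and `stratum_rates` is a hypothetical helper name.

```python
def stratum_rates(overall_rate, smoking_prevalence, relative_risk):
    """Split an overall rate into nonsmoker and smoker rates.

    Solves overall = nonsmoker * (1 - p) + nonsmoker * RR * p for the
    nonsmoker rate, where p is smoking prevalence and RR is the (assumed)
    relative risk among smokers.
    """
    nonsmoker = overall_rate / (1 + smoking_prevalence * (relative_risk - 1))
    return nonsmoker, nonsmoker * relative_risk


# CHD deaths per 100,000 men and male smoking prevalence, as quoted in the post.
COUNTRIES = {"Japan": (54.0, 0.47), "United States": (216.0, 0.20)}

# ASSUMPTION: smokers have twice the CHD death rate of nonsmokers in both countries.
ASSUMED_RELATIVE_RISK = 2.0

if __name__ == "__main__":
    for name, (rate, prevalence) in COUNTRIES.items():
        nonsmoker, smoker = stratum_rates(rate, prevalence, ASSUMED_RELATIVE_RISK)
        print(f"{name}: nonsmokers {nonsmoker:.0f}, smokers {smoker:.0f}, "
              f"overall {rate:.0f} per 100,000")
```

Under that assumed relative risk, smokers do worse than nonsmokers inside each country (roughly 73 versus 37 in Japan, 360 versus 180 in the United States), yet the country with far more smokers still has the much lower overall rate. The aggregate comparison tells you nothing about the individual-level effect, which is the whole point of the ecological fallacy.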

"The war situation has developed not necessarily to Japan's advantage"

You gotta love the headline writer for the New York Times:

"Republicans’ Deficit-Cut Pledge Lacks Specifics"

Over on the op-ed page* Paul Krugman is a little more blunt:

"‘Pledge to America’ is at war with arithmetic"

Here are some of Krugman's specifics:

True, the document talks about the need to cut spending. But as far as I can see, there’s only one specific cut proposed — canceling the rest of the Troubled Asset Relief Program, which Republicans claim (implausibly) would save $16 billion. That’s less than half of 1 percent of the budget cost of those tax cuts. As for the rest, everything must be cut, in ways not specified — “except for common-sense exceptions for seniors, veterans, and our troops.” In other words, Social Security, Medicare and the defense budget are off-limits.

So what’s left? Howard Gleckman of the nonpartisan Tax Policy Center has done the math. As he points out, the only way to balance the budget by 2020, while simultaneously (a) making the Bush tax cuts permanent and (b) protecting all the programs Republicans say they won’t cut, is to completely abolish the rest of the federal government: “No more national parks, no more Small Business Administration loans, no more export subsidies, no more National Institutes of Health. No more Medicaid (one-third of its budget pays for long-term care for our parents and others with disabilities). No more child health or child nutrition programs. No more highway construction. No more homeland security. Oh, and no more Congress.”

* I could have linked this directly to the New York Times, but frankly they annoy me.

p.s. Keep an eye on TNR and Economist's View for reactions to the rest of David Leonhardt's article. When Chait, Krugman, DeLong and company get to Leonhardt's defense of Paul Ryan and Rand Paul, things are going to get bloody.

Tuesday, September 28, 2010

Doesn't Microsoft have a template like this on Word?

Check out the ultimate in science reporting here, then watch Andrew Gelman experiment with meta-posting.

Was I too tough on Seyward Darby?

We've recently had a huge uptick in viewers. This is great (and I'm tremendously appreciative), but I do have to remind myself most of the people in the room walked in on the middle of the conversation.

If you do a keyword search, you would see we've been hammering away at Ms. Darby for a long time and you might get the impression that we are trying to depict her as the mad relative chained up in the New Republic's attic. We're not and she isn't. (TNR's attic position was filled long before she got there.)

Ms. Darby, a talented and well-qualified journalist, takes the brunt of the criticism because she writes the bulk of TNR's articles on education, not because those articles deviate from the magazine's position on, well, anything.

The indispensable Jonathan Cohn has handed his blog over to Darby's education posts while Jonathan Chait, arguably the best political blogger out there, has taken an almost identical position on the subject.

I addressed Chait's uncharacteristic education writing a couple of weeks ago (seems like longer) in "Strange Bedfellows":

There's too much here to cover in one post (I could do a page just on Chait's weird reaction to Samuelson's looks, a topic that I had never given any thought to up until now). I may take another pass at another section later but for now I'm going to limit myself to this particularly egregious bit:
How does Samuelson explain the existence of new charter schools that produce dramatically higher results among these lazy, no-good teenagers? He insists, "no one has yet discovered transformative changes in curriculum or pedagogy, especially for inner-city schools, that are (in business lingo) 'scalable.'" This is utterly false. The most prominent example is the Kipp schools, which have shown revolutionary improvements among poor, inner-city students and have rapidly expanded.
It is strange to see Chait take the pro-privatization side of the debate, stranger still to see him accuse critics of charter schools of having an anti-government bias*, but what pushes this into Rod Serling territory is the spectacle of having Chait, one of the most gifted bullshit detectors of the Twenty-first Century, rolling out the same sort of flawed argument that he has made a career out of dismantling.

In order to be viable, a reform has to improve on the existing system by a large enough margin to justify its implementation costs, but if you accept the metrics used by the reform movement, then you will have to conclude that charter schools do worse than public schools more often than they do better.**

So we have a major push to privatize government services which, after about two decades of testing, have been shown to under-perform their traditional government-run alternatives. Rather than show why this statistic is misleading, Chait pulls out vague, anecdotal evidence of a single outlier. Now, given the variability of the data, we would expect the top schools (or even chains) to do pretty well. That alone rebuts Chait's point, but it gets worse. Self-selection, peer effects and selective attrition*** all artificially inflate KIPP's results. When you take these factors into account, it's hard to make a compelling statistical case that even the best charter schools are outperforming public schools (though the second footnote still applies).
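Both halves of that argument can be sketched in a few lines. This is a toy under loudly stated assumptions, not a model of KIPP or of any real school: every simulated school has identical true quality, scores are drawn from a made-up normal distribution (mean 50, standard deviation 10), and attrition is modeled in the crudest possible way, with the lowest scorers leaving. The function names are hypothetical.

```python
import random
import statistics


def school_means(rng, n_schools=100, n_students=80):
    """Mean score per school when every school has the SAME true quality."""
    return [
        statistics.fmean(rng.gauss(50, 10) for _ in range(n_students))
        for _ in range(n_schools)
    ]


def mean_after_attrition(scores, leave_fraction):
    """Mean score of the students who stay if the lowest scorers leave."""
    ranked = sorted(scores)
    n_leave = int(len(ranked) * leave_fraction)
    return statistics.fmean(ranked[n_leave:])


if __name__ == "__main__":
    rng = random.Random(0)

    # 1. Variability alone: the "best" of many identical schools looks special.
    means = school_means(rng)
    print("average school:", round(statistics.fmean(means), 1))
    print("top school:    ", round(max(means), 1))

    # 2. Selective attrition: no one learns anything extra, the mean still rises.
    cohort = [rng.gauss(50, 10) for _ in range(500)]
    print("entering cohort mean:", round(statistics.fmean(cohort), 1))
    print("mean of those left:  ", round(mean_after_attrition(cohort, 0.6), 1))
```

In this toy the top school beats the average school and the surviving students beat the entering cohort even though nothing about instruction differs anywhere. Real attrition is messier than "the bottom 60 percent leave," so the size of the effect here should not be read as an estimate.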

At the risk of over-emphasizing, this is Jonathan -- freaking -- Chait we're talking about, a writer known for his truly exceptional gift for constructing logical arguments and, more importantly, spotting the fallacies in the arguments of others. Under normal conditions, Chait would never fall for a badly presented argument-by-anomaly, let alone make one, just as, under normal circumstances, a confrontation between Samuelson and Chait would result in little pieces of the former being scraped off of the walls of the Washington Post.

But Chait loses this confrontation decisively. From his ad hominem opening to his factually challenged close he fails to score a single point. And this is far from the only example of this odd reform-specific impairment affecting otherwise accomplished writers. OE has spilled endless pixels on the reform-related lapses, both statistical and rhetorical, of smart, serious, dedicated people like Chait, Seyward Darby and, of course, Ray Fisman (just do a keyword search). None of these people would normally produce the kind of work we've cataloged here. None of them would normally ignore the defection of one of the founding members of the reform movement. None of these people would normally feel comfortable dismissing without comment contradictory findings from EPI, Donald Rubin and the Rand Institute.

David Warsh has aptly made the following comparison:
Remember the recipe for a policy disaster? Start with a handful of policy intellectuals confronting a stubborn problem, in love with a Big Idea. Fold in a bunch of ambitious Ivy League kids who don’t speak the local language. Churn up enthusiasm for the program in the gullible national press – and get ready for a decade of really bad news. Take a look at David Halberstam’s Vietnam classic The Best and the Brightest, if you need to refresh your memory. Or just think back on the run-up to the war in Iraq.
but along with Halberstam, it might be time to brush off our copies of Cialdini's Influence.

From a data standpoint, the past few years have been rough on the reform movement. Charter schools have been shown to be more likely to under-perform than to outperform. Joel Klein's spectacular record turned out to be the product of creative accounting (New York City schools have actually done much worse than the rest of the state). Findings contradicting the fundamental tenets of the movement accumulated. Major figures in research (Rubin) and education (Ravitch) have publicly questioned the viability of proposed reforms.

As Cialdini lays out in great detail, when you challenge people's deeply held beliefs with convincing evidence, you usually get one of two responses. Sometimes you will actually manage to win them over. More often, though, they will dig in, embrace their beliefs more firmly and find new ways to justify them.

I think it's safe to say we don't have response number one.

* Almost all of the major tenets of the modern reform can be traced back to the Reagan era and were closely associated with the initiatives described in Franks' The Wrecking Crew.

** Ironically, if you consider the intellectual framework of the reform movement to be flawed and overly simplistic, you can actually make a much better case for charter schools.

*** From Wikipedia: "In addition, some KIPP schools show high attrition, especially for those students entering the schools with the lowest test scores. A 2008 study by SRI International found that although KIPP fifth-grade students who enter with below-average scores significantly outperform peers in public schools by the end of year one, "... 60 percent of students who entered fifth grade at four Bay Area KIPP schools in 2003-04 left before completing eighth grade."[7] The report also discusses student mobility due to changing economic situations for student's families, but does not directly link this factor into student attrition. Six of California's nine KIPP schools, researched in 2007, showed similar attrition patterns.[citation needed] Figures for schools in other states are not always as readily available."

[You can find more on KIPP here]

Encouraging/Discouraging words from President Obama

(Time for another trip into N-Space -- test/retention axis)

From the President's NBC interview:

What is your message to the leadership of unions and to teacher union members?
"We want to work with you; we're not interested in imposing changes on you...you can't defend the status quo in which a third of our kids are dropping out...when you've got 2,000 schools across the country that are drop-out factories, in those schools you have to have radical change. ... The vast majority of teachers want to do a good job, they're not in it for the money. ... Ultimately if some teachers are not doing a good job they've got to go."

This is encouraging because, unlike the test-scores scare, the drop-out crisis hasn't received nearly enough attention. After gangs and school violence, it is the problem that worries me the most.

This is discouraging because many of the schools being held up as models by the reformers have built much of their reputation by systematically excluding and/or chasing away the very students who drop out of traditional schools.

The wonders of shifting rationales

(What a great week to be an education blogger)

The old case for sweeping reforms: The changes we're suggesting are drastic and costly with bad track records and the potential to severely damage the system they are meant to save, but the current state of education is so bad that the future of our country depends on implementing radical reforms.

The new case for sweeping reform: it turns out the problem is not that big or wide-reaching but it's good that people mistakenly thought it was so bad because that encourages us to make all these changes.

(h/t Matthew Yglesias)

Monday, September 27, 2010

Does Seyward Darby think Seyward Darby should be fired?

I don't. I have always believed that firing is a last resort, but Darby has certainly gone on the record as being for firing incompetent performers even when the metrics for measuring competence are unreliable and the firings would cause severe damage to the economy. Given the quality of her reporting on education, it's difficult to believe she's not currently lobbying TNR to get herself fired for things like this:

But the real star of the show was Waiting for Superman, the much-hyped documentary about school reform that opens nationwide this week. Gregory started the program with a clip from the movie that shows how poorly we rank, education-wise, against other developed countries.

Does the clip really show how poorly we're doing? Let's roll the tape:

Since the 1970s, U.S. schools have failed to keep pace with the rest of the world. Among 30 developed countries, we ranked 25th in math and 21st in science. The top 5 percent of our students, our very best, ranked 23rd out of 29 developed countries. In almost every category, we've fallen behind.

If you've been following OE, you recognize these oft-quoted numbers as coming from the PISA test, which can't possibly support the first sentence since it was first administered in 2000.

It would seem that Seyward Darby doesn't know that.

She also doesn't seem to know that the older, better-established TIMSS test has us doing fairly well internationally. Nor is she apparently aware that using PISA to argue for the standard slate of reforms is problematic since at least one of the highest scoring countries (Canada) has adopted pretty much the opposite approach.

We can let David Gregory off with a warning -- this isn't his beat -- but Darby is the education specialist for one of America's best and most respected publications. There's no way for the New Republic to justify keeping an education reporter who can't spot obvious distortions involving one of the two best known measures of international academic performance.

I would suggest immediate reassignment. If Darby would like to argue for her own dismissal, I'd be happy to debate the issue.

[update: you can find some more thoughts on TNR's education reporting here.]

Perceived vs. Actual Income Distribution

James Fallows points out an interesting study (h/t to Jonathan Chait):

The context is the previous discussion, here and here, about the capacity for feeling short-changed and ill-treated, even among some of the most materially-fortunate people ever to live on Earth. No doubt it's a primal human trait, but for various reasons (as explained here) the ever-polarizing distribution of wealth and income in America has allowed more people to feel bad about their own situation by looking at the handful who are stratospherically better off.

To some extent this is an "information" problem: people don't know where they really stand. A creative way to demonstrate that is with a forthcoming paper by Michael Norton of Harvard Business School and Daniel Ariely of Duke, which compares: (a) how wealth actually is distributed in America; (b) how people think it's distributed; and (c) how they think it should be distributed. The paper is available in PDF here.

The chart below conveys the central point: people think the distribution of wealth is more equal than it actually is; and they think it should be much more equal than their already unrealistically-equal notion of its current state. Eg: the top 20% of the US wealth distribution actually controls nearly 85% of total wealth; people think the top 20% controls under 60%; and they think it should control just over 30%

Similarly: people feel that the bottom 20% of the economic pyramid "should" have about 10% of the total pie; they think it actually has about 3% or 4%; in fact, its share appears to be too small to show up on the chart.

Today's must read on education

Nicholas Lemann writing for the New Yorker:

There have been attempts in the past to make the system more rational and less redundant, and to shrink the portion of it that undertakes scholarly research, but they have not met with much success, and not just because of bureaucratic resistance by the interested parties. Large-scale, decentralized democratic societies are not very adept at generating neat, rational solutions to messy situations. The story line on education, at this ill-tempered moment in American life, expresses what might be called the Noah’s Ark view of life: a vast territory looks so impossibly corrupted that it must be washed away, so that we can begin its activities anew, on finer, higher, firmer principles. One should treat any perception that something so large is so completely awry with suspicion, and consider that it might not be true—especially before acting on it.
We have a lot of recent experience with breaking apart large, old, unlovely systems in the confidence of gaining great benefits at low cost. We deregulated the banking system. We tried to remake Iraq. In education, we would do well to appreciate what our country has built, and to try to fix what is undeniably wrong without declaring the entire system to be broken. We have a moral obligation to be precise about what the problems in American education are—like subpar schools for poor and minority children—and to resist heroic ideas about what would solve them, if those ideas don’t demonstrably do that.

Obama's education interview

I'll have to think about this one for a little while.

Propensity Score Matching

The latest from Peter Austin (University of Toronto):

Propensity-score matching is increasingly being used to estimate the effects of treatments using observational data. In many-to-one (M:1) matching on the propensity score, M untreated subjects are matched to each treated subject using the propensity score. The authors used Monte Carlo simulations to examine the effect of the choice of M on the statistical performance of matched estimators. They considered matching 1–5 untreated subjects to each treated subject using both nearest-neighbor matching and caliper matching in 96 different scenarios. Increasing the number of untreated subjects matched to each treated subject tended to increase the bias in the estimated treatment effect; conversely, increasing the number of untreated subjects matched to each treated subject decreased the sampling variability of the estimated treatment effect. Using nearest-neighbor matching, the mean squared error of the estimated treatment effect was minimized in 67.7% of the scenarios when 1:1 matching was used. Using nearest-neighbor matching or caliper matching, the mean squared error was minimized in approximately 84% of the scenarios when, at most, 2 untreated subjects were matched to each treated subject. The authors recommend that, in most settings, researchers match either 1 or 2 untreated subjects to each treated subject when using propensity-score matching.

This result is quite interesting. It's intuitive if you think about it for a bit (the closest matches will be the best possible controls) but it departs quite a lot from the conventional wisdom of case control studies (always use between 4 and 20 controls per case, if possible, so that the size of the confidence intervals is dependent on the cases).

I think that there are two things that need to be considered. First, Peter Austin works with ICES, which uses prescription claims from the province of Ontario. So the types of study that he works with are typically large (and even his small samples were 500 cases). So variance is low, anyway, and a focus on bias makes perfect sense.

Second, complex propensity scores (based on many variables) are rarely the same for any two participants whereas the matching in case control studies is often on factors (age, sex) that can be perfectly matched.
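The bias/variance trade-off in the abstract can be sketched with a toy Monte Carlo. To be clear, this is not Austin's simulation design: the data-generating model, the sample sizes, the true effect of 1.0, and the use of a known score in place of an estimated propensity score are all assumptions made up for illustration, and matching here is nearest-neighbor with replacement.

```python
import random
import statistics

TRUE_EFFECT = 1.0  # ASSUMPTION: treatment effect used to generate the toy data


def simulate(rng, n_treated=50, n_control=300):
    """Return (score, outcome) pairs for treated and control subjects.

    The score stands in for a propensity score; outcomes rise with the score,
    so poorly matched controls bias the estimate.
    """
    def outcome(score, is_treated):
        return 2.0 * score + (TRUE_EFFECT if is_treated else 0.0) + rng.gauss(0, 1)

    treated = [(s, outcome(s, True))
               for s in (rng.betavariate(3, 2) for _ in range(n_treated))]
    controls = [(s, outcome(s, False))
                for s in (rng.betavariate(2, 3) for _ in range(n_control))]
    return treated, controls


def nearest_controls(score, controls, m):
    """The m controls whose scores are closest to `score` (with replacement)."""
    return sorted(controls, key=lambda c: abs(c[0] - score))[:m]


def matched_estimate(treated, controls, m):
    """Mean difference between each treated outcome and its m matched controls."""
    diffs = [
        y - statistics.fmean(c[1] for c in nearest_controls(score, controls, m))
        for score, y in treated
    ]
    return statistics.fmean(diffs)


def mean_match_distance(treated, controls, m):
    """Average score gap between treated subjects and their matched controls."""
    return statistics.fmean(
        abs(c[0] - score)
        for score, _ in treated
        for c in nearest_controls(score, controls, m)
    )


def monte_carlo(m, reps=100, seed=0):
    """Bias and standard deviation of the M:1 matched estimate over `reps` runs."""
    rng = random.Random(seed)
    estimates = [matched_estimate(*simulate(rng), m) for _ in range(reps)]
    return statistics.fmean(estimates) - TRUE_EFFECT, statistics.stdev(estimates)


if __name__ == "__main__":
    for m in range(1, 6):
        bias, sd = monte_carlo(m)
        print(f"M={m}: bias={bias:+.3f}  sd={sd:.3f}")
```

The one thing that holds by construction is that the average gap between a treated subject and its matched controls cannot shrink as M grows, since each extra control is a worse match than the ones already chosen. Whether that shows up as more bias and less variance in a given run depends on the assumed setup, which is why the paper needed 96 scenarios rather than one.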

So it is a useful and interesting result. What I really want to know, having never managed to get AJE to accept a paper from me at all, is how he managed this feat:

Received April 21, 2010
Accepted June 18, 2010

Impressive!