West Coast Stat Views (on Observational Epidemiology and more)

Tuesday, February 12, 2013

Thinking about failure and collective amnesia.

No. Not in the sad my-life-adds-up-to-nothing way, but more in the case study sense. I've been noticing how often optimistic analyses of proposed products and business models echo the same arguments used over the years for various underperforming enterprises and catastrophic failures, invariably without a flicker of recognition.

No doubt, this is partly due to a general lack of long-term memory in the pundit class, but the problem seems particularly acute when it comes to failure. There are exceptions like this well-thought-out analogy by Josh Marshall or this piece of historical context for Zucker's Leno debacle from Kliph Nesteroff, but as a rule, most journalists don't pay nearly enough attention to these counterexamples (which makes it all the more difficult to avoid repeating mistakes).

I'll try to add some more entries and drill down into some of the specifics, but in the meantime, here's a short list of some potentially useful examples: ideas that seemed (and in some cases, actually were) good at the time.

The aforementioned attempt to make Jerry Lewis king of the talk shows.

A late Eighties format that doubled the resolution of video tapes while being completely compatible with standard VHS.

An attempt to break the Seventies DC/Marvel duopoly.

An attempt to break the Coke/Pepsi duopoly (perhaps breaking duopolies deserves a subcategory).

Adios, Amiga.

Friday, February 8, 2013

Fun with charts

This post by Daniel Kuehn is worth reading, although all of the action is in the comments.

I think he is right on about the denominator problem in interpreting her graphs. It's also a very good example of when a point can be correct and yet not explain all of the differences (her comments about rounding). However, the labeled bulge seems to be a lesser sin than variable bracket sizes on a density plot.

As for the change argument, it is fine to use a chart to explain something and then talk about the expected changes to the distribution. Where I am less happy is that there are changes going on in the United States all of the time (aging of the population, propensity to form a new household) that are going to influence the shape of this curve. It is possible to imagine the curve shifting exactly as Jon Evans suggested, and the reasons having to do with factors that have nothing to do with inequality.

But standardized curves have their own issues . . . So even Megan's use of the curve to show the shifts over time doesn't address the null conditional on the changes in the underlying population. This may even help her argument, I am not sure, but certainly I would rather graph density plots in equal-sized segments just for reader clarity.
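To make the bracket-size point concrete, here is a minimal sketch of my own (not taken from the chart under discussion). The incomes and bracket edges are invented purely for illustration, and the helper names `bin_shares` and `bin_densities` are hypothetical. It compares the raw share of observations in each bracket with the share divided by bracket width, which is what a density plot should show.

```python
def bin_counts(values, edges):
    """Count observations per bracket; the last bracket includes its right edge."""
    counts = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        is_last = hi == edges[-1]
        counts.append(sum(1 for v in values if lo <= v < hi or (is_last and v == hi)))
    return counts


def bin_shares(values, edges):
    """Raw share of observations in each bracket (ignores bracket width)."""
    return [c / len(values) for c in bin_counts(values, edges)]


def bin_densities(values, edges):
    """Share per unit of width, so wide brackets are not visually inflated."""
    widths = [hi - lo for lo, hi in zip(edges[:-1], edges[1:])]
    return [s / w for s, w in zip(bin_shares(values, edges), widths)]


# Hypothetical incomes in thousands of dollars (made up for this example).
incomes = [5, 12, 18, 22, 27, 33, 38, 44, 52, 61, 75, 90, 110, 150, 190]
# Brackets widen at the top, as they often do in published income tables.
edges = [0, 10, 20, 30, 40, 50, 100, 200]

for lo, hi, share, dens in zip(edges[:-1], edges[1:],
                               bin_shares(incomes, edges),
                               bin_densities(incomes, edges)):
    print(f"{lo:>3}-{hi:<3}  share={share:.3f}  density={dens:.4f}")
```

In this toy data the widest brackets hold the largest raw shares, which would draw as a bulge, while the per-unit density there is lower than in the narrow brackets. That is the sense in which unequal brackets can mislead a reader.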

Still, a worthwhile argument to follow and it is useful insofar as it improves understanding of what the plots do and do not mean.

Wednesday, February 6, 2013

Annals of bad analogies

From here:

Think about it this way. Say your elderly mother had to be hospitalized for life-threatening cancer. The best doctor in the region is at Sacred Heart, a Catholic, private hospital. Could you ever imagine saying this? “Well, I don’t think our taxpayer dollars should subsidize this private institution that has religious roots, so we’re going to take her to County General, where she’ll get inferior care. ’Cause that’s just the right thing to do!”
No. You’d want to make sure that your tax dollars got your mom the best care. Period. Our approach should be no different for our children. Their lives are at stake when we’re talking about the quality of education they are receiving. The quality of care standard should certainly be no lower.

An analogy is the weakest form of argument, because it presumes similarities between cases. In this case we are equating a one-time event (cancer treatment) with a long-term process (educating people). There is also a difference in that cancer outcomes are much easier to measure (due to the short time between diagnosis and resolution) than an educational process. So "better" is much easier to evaluate. Finally, it ignores magnitudes. What is "better" and by how much? Is it a matter of preference (Starbucks coffee is better than McDonald's coffee) or an objective metric?

But this whole thing dodges the main question: why is the County General hospital not competitive with the Sacred Heart hospital? Is that not the more interesting question? Is it because the County General can't turn patients away and so gets the sickest of the sick?

These points matter.

Monday, February 4, 2013

Imbalance in the infrastructure debate

Joseph's previous post builds on this thread from Mark Thoma. Each is worth reading but I think both understate the extraordinary asymmetry between the pro and con in the should-we-spend-on-infrastructure debate (distinct from the where-to-spend debate). Consider the following statements:

1. We need to repair and upgrade the country's infrastructure in the relatively near future (let's say a decade)

So far as I can tell, almost no one is willing to stand up and argue against this point, which is strange because, though I don't happen to agree with them, there are reasonable arguments to be made here and, once this point has been conceded, the remaining ground is extraordinarily difficult to defend.

2. The economy is not operating at full capacity

We've already stipulated that we need to build these things which means we've also agreed to tolerate at least some crowding out at some point in the future. You simply can't have one without the other; you can only seek to minimize the effect. A crowding-out argument for delaying pretty much has to assume that there will be more slack in the economy far enough in the future to make waiting worth it (but not so far as to extend past our decade window). I've heard lots of people making crowding-out arguments but none making the necessary corollary. (Even if you believe that crowding out is unaffected by economic conditions, you still don't have any argument for waiting.)

3. Borrowing costs for the federal government are historically low.

As a general rule, repairs don't get cheaper the longer you put them off. This tends to put the burden of proof on those arguing for a delay. If we were living in a period of historically high borrowing costs, you could argue that rates were likely to head back down if we waited. There are reasonable cost-based arguments against infrastructure spending, but only in the spend/don't spend context, not spend now/spend later.

The infrastructure debate is another example of how the public discourse has entered a phase reminiscent of Carroll's Tortoise/Achilles tale, where showing the premise is true and showing the premise leads to a conclusion is not sufficient to make the other side accept that conclusion. Of course, it's not an exact analogy. Carroll was making a point about the limits of logical systems. What we're seeing here is more probably a demonstration of people's willingness to ignore the rules of argument when those rules lead to an uncomfortable policy position.

Friday, February 1, 2013

Sometimes you do it because it is a good idea anyway

Mark Thoma looks back five years to reports of skepticism about infrastructure development. Part of this is that maybe we just need to relax our rules on project timelines a wee bit. But another piece of it is that the worst mistake we can make is to end up competing with the private sector for labor to make cool things that we will end up needing. As disasters go, this one is rather mild.

Thursday, January 31, 2013

Shamisen heroes and free TV

I had been working on another piece about over the air television when I happened to surf across this video on one of the many Asian-themed channels you can get with an antenna in LA and it struck me as an appropriate accompaniment for a quick note about over the air TV.

With a pair of rabbit ears I pick up programming in at least a half dozen languages. That's an indication of the diversity of the medium and its importance to some underserved segments of the population, but it also represents a real competitive weakness. Terrestrial television suffers from a crippling lack of attention. Journalists who cover both media and personal finance are almost completely oblivious to this innovative, totally free source of programming.

If you're a medium trying to get the attention of the mainstream media, having a large part of your viewership consist of recent immigrants is not going to help.

I'll leave you with something a bit more traditional from the Yoshida brothers, though still, well... Hell, just watch it.

Wednesday, January 30, 2013

Unintended consequences

I normally agree with pro-immigration stands, but this one strikes me as likely to do something different than expected:

While I think the region-based visa would be a positive step by itself, there is an additional twist I would recommend adding to the policy: require the purchase of a home from the visa recipient. This would be similar to the EB-5 program, which gives green cards to rich foreigners who invest in the U.S. This would allow non-rich immigrants to make an investment in the region sponsoring their visa. Not only does this increase the political popularity of the program and provide a way to transfer some of the gains of immigration to the native born population, but it also serves as an enforcement mechanism. Workers are less likely to leave the region their visa ties them too if they have made a large investment in that area which they cannot sell for the length of their visa.

I love the idea of a regional (i.e. state-level) work visa. But the house thing is a terrible idea. First of all, how do you get a loan? If the requirement is "cash on the barrelhead" then we are only opening the market up to wealthy immigrants. Highly skilled people just starting out are squeezed out of the market. Or you get an asset bought with little money down by people with a weak understanding of the local market who have the ability to flee overseas if things collapse.

Nor am I sure we want even more price support for housing in the United States (mortgage deductions already support prices). And I doubt that, in a world with sublets, this would really help keep people local over and above what we already have in place.

Plus, the rich enforcement mechanisms we have for work visas are already pretty scary, without adding this new level.

Friday, January 25, 2013

Paul Krugman on progress

Paul Krugman gets skeptical:

By and large, I’m in the camp of those disillusioned about technology — mainly, I think, because the future isn’t what it used to be. A case in point is Herman Kahn’s The Year 2000, a 1967 exercise in forecasting that offered a convenient list of “very likely” technological developments. When 2000 actually did roll around, the striking thing was how over-optimistic the list was: Kahn foresaw most things that actually did happen, but also many things that didn’t (and still haven’t). And economic growth fell far short of his expectations.

It is often the case that Krugman has a relatively unique take on things. Still, he is slowly coming around to having some upside views on one innovation:

But driverless cars break the pattern: even Kahn’s list of “less likely” possibilities only mentioned automated highways, not city streets, which is where we will apparently be in the quite near future.

I have been seeing a lot of comments on automated cars lately. Of the West Coast Stat bloggers, I think I can be fairly described as the technological optimist. But even I worry that there could be some rather unexpected consequences to such a change, especially if you have both humans and automated cars operating at the same time on the same roads.

Thursday, January 24, 2013

A different take on grades and money from home -- part 1 Hypothesis Shopping

New study claims more money from home is associated with lower grades.

I came across this in a post from Andrew Gelman who had some sharp criticisms for the analysis that led to this conclusion (criticisms seconded by Joseph in these previous posts). I clicked through the links with a slightly bloodthirsty attitude, already toying with ideas for posts mocking Dr. Hamilton, the hapless researcher. As I followed through, though, I felt less and less like mocking and more inclined to a shaded, even positive view, particularly given my previously noted feelings about looking at research in context.

Not that I disagree with Andrew and Joseph's criticisms -- the survivor bias is really difficult to get past -- but on some big how-we-do-science questions, I see a lot to like here, starting with where Hamilton started (and where she didn't).

Back in 2004, Hamilton and a group of other researchers spent a year studying a group of mostly first year students at a large university, then continued to track their progress and interview them for the next five years. One of the things they observed was that the students who didn't have to work or struggle financially due to checks from home tended to take school less seriously, study less and get worse grades. Hamilton decided to follow this up by seeing if this pattern held nationally.

The resulting analysis was not well done, but I like the general approach: use case studies, participant observation, interviews and similar techniques to study your subjects extensively, form your hypotheses based on those observations, see how they hold up when tested against more general data. (or, put more broadly, actually devote time and effort into coming up with your hypotheses.)

That may seem like damning with faint praise, but I think a lot about where hypotheses come from and I give quite a bit of credit to researchers who put the work in to do it right, particularly in an age of hypothesis shopping.

Hypothesis shopping is one of our two leading sources of bad studies. The other is epicyclism. The first entails running through endless unlikely relationships until one turns up significant (sometimes mistaken for data mining by people who know nothing about data mining). The second entails coming up with increasingly convoluted hypotheses to fit the data, often to preserve an ideological position. Sadly there's a nontrivial overlap between the two.

Hypothesis shopping has always been around but only recently has it become what you might call "cost effective" due to huge advances in the availability of data and the power of computers. Today, anyone with a large data set and a late-model laptop can crank through thousands of possible relationships looking for something with a good p-value. Just start with a couple dozen potential dependent variables, a few hundred independent variables, and some reasonable sounding transformations and interactions. You are pretty much certain to find something at least as impressive and "significant" as the findings you'll routinely see in Slate or the New York Times.
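For anyone who wants to see how cheap this has become, here is a rough sketch of my own, under loudly stated assumptions: every variable is pure random noise, there are 100 observations, one outcome and 500 candidate predictors, and "significant" means a two-sided p-value below 0.05 on a simple correlation test. The function names and the numbers are hypothetical choices for illustration, not a description of any published analysis.

```python
import math
import random

N_OBS = 100
# Two-sided 5% critical value of Student's t with N_OBS - 2 = 98 degrees of
# freedom. This constant is only appropriate for N_OBS = 100.
T_CRIT_98_DF = 1.984


def pearson_r(x, y):
    """Plain Pearson correlation coefficient."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def shop_for_hypotheses(n_predictors=500, seed=0):
    """Test many pure-noise predictors against one pure-noise outcome.

    Returns (predictor index, correlation) for every "significant" result.
    No predictor has any real relationship with the outcome.
    """
    rng = random.Random(seed)
    outcome = [rng.gauss(0, 1) for _ in range(N_OBS)]
    hits = []
    for j in range(n_predictors):
        predictor = [rng.gauss(0, 1) for _ in range(N_OBS)]
        r = pearson_r(predictor, outcome)
        t_stat = r * math.sqrt((N_OBS - 2) / (1 - r * r))
        if abs(t_stat) > T_CRIT_98_DF:
            hits.append((j, r))
    return hits


hits = shop_for_hypotheses()
print(f"{len(hits)} of 500 noise predictors came up 'significant'")
```

With a 5% threshold you would expect something on the order of one in twenty of these noise variables to clear the bar, and the shopper only has to report the best-looking one. Adding transformations and interactions multiplies the number of chances further.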

The results of these processes are often absurd enough to be obvious to everyone (with the exception of the aforementioned publications), but for every watching-sports-before-conception-makes-you-more-likely-to-have-a-boy (I really hope that's a made up example), there's a provocative but not easily dismissed story with huge policy implications. It would be useful to know if we're talking about an effect that has been observed in other contexts or if it's just the green jelly bean.

Knowing the provenance of a hypothesis doesn't protect us from bad research but it does, to a large degree, inoculate us against certain kinds of bad research.

Wednesday, January 23, 2013

Survivor bias

There has been some discussion about money from parents leading to a lower GPA from both Andrew Gelman and me. One of the comments at Andrew Gelman's site got me thinking:

“The higher graduation rate of students whose parents paid their way is not surprising, she said, since many students leave college for financial reasons. (…)
Oddly, a lot of the parents who contributed the most money didn’t get the best returns on their investment (…) Their students were more likely to stay and graduate, but their G.P.A.’s were mediocre at best, and some I didn’t see study even once.”

What is the actual target of inference here? Is it GPA or is it graduation?

When I was in the corporate world I never was asked for my transcripts (my degree, all of the time but my transcripts never). Having 2 years of college and then dropping out leads to worse life outcomes than having a degree, so far as I can tell.

Or put it another way, what would you prefer:

A child who got high marks but did not complete their program?
A child who got low marks but earned a degree?

Insofar as university education is a credentialing and signaling system, the second outcome would be far better.

But even more interesting, the author's comments support my intuition precisely -- parental funding keeps marginal students in school. From a causal perspective this is way more interesting than the headline effect of parental giving lowering grades and is way more intuitive as well.
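A toy simulation shows how this mechanism alone could produce the headline result. This is my own sketch with invented numbers, not a reanalysis of the study. The assumptions are that parental funding is assigned by coin flip, that it has zero effect on grades, and that its only effect is to let weaker students stay enrolled. The cutoffs and the grade formula are hypothetical.

```python
import random
from statistics import mean


def simulate(n_students=20000, seed=1):
    """Return (funded, gpa, graduated) for each simulated student."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n_students):
        ability = rng.gauss(0, 1)
        funded = rng.random() < 0.5
        # Grades depend on ability and noise only, never on funding.
        # (Toy scale; values are not capped at 4.0.)
        gpa = 3.0 + 0.4 * ability + rng.gauss(0, 0.3)
        # Assumed retention rule: funded students can afford to stay even
        # when they are struggling, unfunded students leave sooner.
        stay_cutoff = -1.0 if funded else 0.0
        graduated = ability > stay_cutoff
        rows.append((funded, gpa, graduated))
    return rows


def summarize(rows):
    """Graduation rate and mean grades, by funding status."""
    summary = {}
    for flag in (True, False):
        group = [row for row in rows if row[0] == flag]
        grads = [row for row in group if row[2]]
        summary[flag] = {
            "graduation_rate": len(grads) / len(group),
            "mean_gpa_all": mean(row[1] for row in group),
            "mean_gpa_graduates": mean(row[1] for row in grads),
        }
    return summary


results = summarize(simulate())
for flag, label in ((True, "funded"), (False, "unfunded")):
    stats = results[flag]
    print(label, {key: round(value, 3) for key, value in stats.items()})
```

Under these assumptions the funded group graduates at a higher rate, and its graduates have a lower average GPA, even though funding never touches anyone's grades. An analysis restricted to students who persist would read that gap as an effect of money.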

Health Care spending

I often worry that comparisons between the United States and Canada understate the room we have to improve health care costs. But Matt Yglesias points to a health care chart showing spending per capita by the government in the United States and Canada:

This is the chart that I think ought to dominate the conversation about public sector health care spending in the United States and yet is curiously ignored. The data show government health care spending per capita in the United States and Canada. The United States spends more. And that's not more per person who gets government health insurance, it's more per resident. And yet Canada covers all its citizens and we don't. That should be considered shocking stuff, and yet I rarely hear it mentioned.

Even odder is that the most recent time I heard it mentioned was Valerie Ramey talking at the American Economics Association conference in San Diego and her conclusion was that this showed U.S. health care needs free market reforms. The more straightforward interpretation, I would think, is that the U.S. needs to make its system more like Canada's. It's important to note that the example here is Canada. Not some radically different society. Not some far-off distant land. And the gap is actually growing.

In 2010, the Canadian government was spending roughly $3,000 per capita and the US government was spending roughly $4,000 per capita. Not per beneficiary, but per capita!!

Now Canada is not some sort of dystopia without private medicine. Emergency services are fully covered but it was common when I was last there to go to a private clinic for health care like an X-ray to avoid the queue in the public options. But this is still catastrophic coverage for all citizens, which is an impressive feat.

At the end of the post, Matt talks a little about the less-innovation counter-argument. Closely related is the MD shortage argument. Now there are two responses. One is to note (as Matt does) that we want health care costs to go down in the United States and that this will have bad effects as well as good effects.

But the second is more compelling. We could just invest money in medical research and in awarding prizes for drug discovery. It would make innovation costs transparent and separate it out from rent-seeking and administrative costs (which would be lower in the single-payer insurance or out of pocket expenses world).

As for educating MDs, we can steal a leaf from Canada and subsidize education a wee bit more, and even with lower wages we will still have no trouble attracting people to a high status and still relatively high pay profession. No solution is perfect, but why is this one not being debated?

Tuesday, January 22, 2013

More reflections on a study (EDITED)

Okay, before I get to the meat of this post, this quote by Andrew Gelman is dynamite:

. . . I’m generally suspicious of arguments in which the rebound is bigger than the main effect.

How many "counter-intuitive" studies would survive this kind of skepticism? Not that a rebound effect can't be larger, but like many unlikely things it requires a higher level of proof.

The context is an education study which suggests that the more parents pay for education the lower the grades of the student will be. The authors apparently tried to control for a lot of possible confounders (like SAT scores) but the whole process ends up looking like "what not to do in regression analysis".

There is an intermediate variable problem, a restriction of range problem (extrapolating parental support out to values that exceed annual income), and an issue with differential drop-out that does not seem to be addressed. All of these points are present in Andrew's nice write up.

What I want to focus on is the sharp counter-factual. I am not always a fan of counter-factual reasoning, but I think that it would provide a ton of clarity in this case. The real claim is that if you decreased exposure X (parental support) then you would increase outcome Y (GPA). The direct causal model would suggest that the fastest way to improve student grades would be to make your contributions zero. But, the last time I looked, Pell grants require a non-zero parental contribution in most cases (it is a little hard to tell precisely what the thresholds are but they definitely are not zero for most students). So clearly this is a floor on parental contributions (and if it was the only source of contributions the effect would become the richer the parents the worse the grades of the student).

So maybe, to have a realistic counter-factual, the exposure should be dollars of support above the minimum expected contribution?

So, really what we have to be talking about is the effect of a marginal dollar separate from the (non-linear) scale of what the parents are required to pay. But, even there, the direction is unclear. Imagine that your not especially inspired child gets admission to Stanford but they are struggling with the material. Do you insist, on principle, that they get a job or do you pay more so that they have a better chance to be a "C average" Stanford graduate (which is much better than a Stanford drop-out)? So the causal direction is actually unclear.

But if the idea is that giving more resources to students decreases performance then there are a lot of experiments we could try. For example, we could decrease wages (for everyone including upper management) and see if performance goes up. Or we could randomize students to improved levels of support. Better yet, we could look at experiments that have already been done:

We examine the impacts of a private need-based college financial aid program distributing grants at random among first-year Pell Grant recipients at thirteen public Wisconsin universities. The Wisconsin Scholars Grant of $3,500 per year required full-time attendance. Estimates based on four cohorts of students suggest that offering the grant increased completion of a full-time credit load and rates of re-enrollment for a second year of college. An increase of $1,000 in total financial aid received during a student’s first year of college was associated with a 2.8 to 4.1 percentage point increase in rates of enrollment for the second year.

So not only is the main effect in the opposite direction (at least in terms of retention) but it has precisely the impact on a GPA analysis that Andrew expects: students are more likely to leave with lower levels of support. Do we think that leaving school is completely independent of performance (that there is no GPA difference between the drop-outs and those who persist)? Or is parental support different, in some magic way, than government grant support? Are people more careful stewards of government money than they are of money from their close community (and think about what this would mean for charity versus government welfare programs, if true)?

I agree that the current form of this study is impossible to interpret.

[EDIT: Talking with Mark, it is clear that I was unclear on one point above. The experiments show that the effect of money from a specific source (i.e. government funding) goes in a specific direction but don't at all address whether money from parents has a similar causal effect (Mark is promising to talk about this in a post himself). The issues of selection, intermediate variables, and experimental evidence from other sources are all important, but without re-analyzing the data it is impossible to prove the directionality of the bias. As an epidemiologist I am trained to speculate on bias direction/strength but I recognize that is all I am doing. ]

One more cognitive dissonance post

Andrew Gelman has a good, skeptical response to some questionable claims from Herbalife, but there's one point where we're in disagreement, not so much as to the conclusion as to the reasoning behind it.

Gelman says:

Amusingly, one of Herbalife’s points is “Fact: Majority of Former Distributors Would Recommend Herbalife to Friends and Family.” But that’s exactly what you’d expect of a still-active pyramid scheme, no? Existing members want new people below them on the pyramid. I’m not saying this means it is a pyramid scheme, but it doesn’t seem like evidence against the hypothesis!

Perhaps I'm misreading this but I'd assume that former distributors no longer have a direct interest in the company. If this is true, does this mean we can take former distributors as impartial judges? Not by a long shot.

This is where we segue back to a recent thread, cognitive dissonance and the psychology of marketing. Companies like Amway and Herbalife are textbook examples of marketing psych (literally for Amway, in two different chapters, no less). And it's important to remember that this relationship goes both ways. While the psychologists writing these texts study these companies, many of the executives in these companies have read these books and taken these classes and they've thought seriously about how best to apply these principles.

[My background in this field is spotty so I would highly recommend that you pick up a copy of Cialdini's Influence (either regular or textbook version) or some other good text on the subject and make sure I'm getting this right... ]

There have been a number of studies that show that when you convince people to believe something based on one reason, they have a tendency to come up with additional reasons to support that belief and that these reasons do not go away just because the original reason is removed. This effect is even stronger when the belief is stated publicly, particularly to friends and family or in writing (Amway training makes a big deal about getting things in writing).

The former Herbalife distributors are another one of those cases where what happened was exactly what the textbooks said would happen: people who had sold a line of products to friends and family in the past now tend to hold the reassuring belief that those products were good. This doesn't prove that they weren't good and it certainly doesn't say anything one way or the other about Herbalife being a pyramid; it simply serves as another reminder that things often happen the way your professor said they would.

Sunday, January 20, 2013

Alexandria Word Searches -- a new kind of puzzle for the weekend

I've got a post on a different kind of word search puzzle at You Do the Math. Mainly pitched toward teachers, but hopefully still fun for the general audience. Here's a Shakespeare-themed example with, at last count, eighteen Bard-friendly answers.

There are more puzzles (in larger formats) at the original post.

Saturday, January 19, 2013

"Edward Tufte Wants You to See Better"

I was looking for an interview for a post on the way we cover health issues, when I came across this interview with Edward Tufte. I haven't had a chance to check it out yet so this isn't really a recommendation but, given the depth of my to do list, I decided it would be better to pass it along now.