West Coast Stat Views (on Observational Epidemiology and more)

Thursday, January 27, 2011

One more exchange on Mankiw's assumptions

More from the ongoing debate.

Here's David (pulled from a longer comment):

I think where we disagree (assuming that we do disagree) is on where the burden of proof should lie. As an economist, and based on my reading of the theoretical and empirical literatures, the burden is on the individual who claims there are important plateaus and such. This requires showing empirically that they exist, and not in a general sense, but on the relevant margin of choice for those individuals. My general sense is that most economists would agree with this placing of the burden of proof, and your suggestion of the consensus of various economists is consistent with my impression as well. In other words, to assume that there are important plateaus on the margin requires empirical justification, and substantial justification because it's very difficult to understand labor markets if we deviate generally even moderately from this productivity/wages relationship. So while you agree that “…if pundits' arguments are sufficiently robust or their assumptions are obviously true, they can do what Mankiw does.” I’d say that the consensus to me amongst economists supports the arguments and broader type of assumptions that I discussed previously. I suppose that’s an empirical question, for which I have not yet looked for data.

David,

It's easy to get lost in the weeds here, so I'll try to get a few specific points out of the way, then address the bigger issue of the way we treat assumptions in the economic debate.

First, when it comes to robustness, it is sufficient to show that deviating from an assumption would cause the model to fail. There is no need to show that a particular deviation (such as the possible plateaus I suggested) occurs, only that if it occurs problems will follow. The world is full of perfectly good models that are not robust. As long as the real world lines up closely enough with the model's assumptions, the lack of robustness is not an issue.

Robustness is, however, an issue when we go out of the range of data, and, given these unique times, every policy proposal goes outside our range of data. At this point the burden falls on the proposer to be explicit with assumptions and make some kind of case that they are being met.

We also need to be careful to distinguish between individual and aggregate relationships. We know that raises and promotions occur at discrete points and bonuses are frequently capped. That means, for many workers, the relationship between wages and productivity can't be linear. It is, however, possible that when aggregated that relationship is linear (or at least close enough for our purposes). The problem here is that proposals that assume individual-level linearity can sound a lot like proposals that assume aggregate linearity. Once again, we need more caution and clarity than we've been seeing.
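To make the individual/aggregate distinction concrete, here is a toy sketch in Python. Everything in it is an assumption made for illustration: pay moves in whole steps, every worker shares the same productivity, and the points at which raises kick in are spread evenly across workers. The function names are hypothetical too. Under those assumptions no individual's pay is linear in productivity, yet the average is very nearly so.

```python
import math

def individual_wage(productivity, threshold_offset):
    """One worker's pay moves only in whole steps (a raise or promotion).

    threshold_offset shifts where this worker's steps occur (an assumption).
    """
    return math.floor(productivity + threshold_offset)

def average_wage(productivity, n_workers=1000):
    """Mean pay across workers whose step points are spread evenly (an assumption)."""
    offsets = (i / n_workers for i in range(n_workers))
    return sum(individual_wage(productivity, o) for o in offsets) / n_workers

for p in (2.0, 2.3, 2.6, 2.9):
    # A single worker stays on the same plateau while the average tracks p.
    print(p, individual_wage(p, 0.0), round(average_wage(p), 3))
```

None of this shows that real wage thresholds are spread this conveniently. It only shows that the two kinds of linearity are different claims and need to be argued for separately.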

All of which leads to the main point: much of the economic debate (particularly Greg Mankiw's corner of it) has been based on arguments that aren't all that robust and assumptions that aren't immediately self-evident. Many of these arguments reach conclusions that are difficult to reconcile with the historical record (such as Mankiw's prediction that a return to Clinton era taxes would have dire effects on the nation). Under these circumstances, assumptions should not be left implicit and they certainly should not be depicted as broad and obvious when they are highly specialized and non-intuitive (Freakonomics being the best known example with Levitt's go-to "people respond to incentives" formulation).

In other words, in this situation, I'd probably argue that the burden of proof is on Mankiw; I'd certainly insist the burden of clarity is.

Wednesday, January 26, 2011

Evaluating the evaluations

Busy morning so I don't have time to do more than provide some links and the abstract for this paper on the effectiveness of college teachers.

In primary and secondary education, measures of teacher quality are often based on contemporaneous student performance on standardized achievement tests. In the postsecondary environment, scores on student evaluations of professors are typically used to measure teaching quality. We possess unique data that allow us to measure relative student performance in mandatory follow-on classes. We compare metrics that capture these three different notions of instructional quality and present evidence that professors who excel at promoting contemporaneous student achievement teach in ways that improve their student evaluations but harm the follow-on achievement of their students in more advanced classes.

Here's the ungated version via Tyler Cowen. May not be quite the same as the published one.

Here's Andrew Gelman's reaction.

Repost -- Fitness Landscapes, Ozark Style

[I'm working on a long post that uses fitness landscapes, so I thought I'd rerun some previous posts to get the conversation going.]

I grew up with a mountain in my backyard... literally. It wasn't that big (here in California we'd call it a hill) but back in the Ozarks it was a legitimate mountain and we owned about ten acres of it. Not the most usable of land but a lovely sight.

That Ozark terrain is also a great example of a fitness landscape because, depending on which side you look at, it illustrates the two serious challenges for optimization algorithms. Think about a mountainous area at least partially carved out by streams and rivers. Now remove all of the rocks, water and vegetation and drop a blindfolded man somewhere in the middle, lost but equipped with a walking stick and a cell phone that can get a signal if he can get to a point with a clear line of sight to a cell tower.

With the use of his walking stick, the man has a reach of about six feet, so he feels around in a circle, finds the highest point, takes two paces in that direction, then repeats the process (in other words, performs a gradient search). He quickly reaches a high point. That's the good news; the bad news is that he hasn't reached one of the five or six peaks that rise above the terrain. Instead, he has found the top of one of the countless hills and small mountains in the area.

Realizing the futility of repeating this process, the man remembers that an engineer friend (who was more accustomed to thinking in terms of landscape minima) suggested that if they became separated he should go to the lowest point in the area so the friend would know where to look for him. The man follows his friend's advice only to run into the opposite problem. This time his process is likely to lead to his desired destination (if he crosses the bed of a stream or a creek he's pretty much set) but it's going to be a long trip (waterways have a tendency to meander).

And there you have the two great curses of the gradient searcher, numerous small local optima and long, circuitous paths. This particular combination -- multiple maxima and a single minimum associated with indirect search paths -- is typical of fluvial geomorphology and isn't something you'd generally expect to see in other areas, but the general problems of local optima and slow convergence show up all the time.
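For anyone who would rather see the blindfolded man in code, here is a minimal one-dimensional sketch in Python. The terrain function is made up for illustration (one broad peak near zero with small ripples layered on top), and the names height and climb are hypothetical. The climber feels one reach to either side and steps uphill until nothing within reach is higher.

```python
import math

def height(x):
    """Hypothetical terrain: one broad peak near x = 0 plus small ripples."""
    return 10.0 * math.exp(-(x / 4.0) ** 2) + 0.15 * math.sin(3.0 * x)

def climb(x, reach=0.1, max_steps=10_000):
    """Step to the highest point within reach until no neighbor is higher."""
    for step in range(max_steps):
        best = max((x - reach, x, x + reach), key=height)
        if best == x:
            return x, step
        x = best
    return x, max_steps

start = 8.5
top, steps = climb(start)

# Brute-force scan of the whole range, which the blindfolded man cannot do.
best_height = max(height(i / 100) for i in range(-1000, 1001))

print(f"stopped at x = {top:.1f} after {steps} steps, height {height(top):.2f}")
print(f"highest point on the map: {best_height:.2f}")
```

Starting out on the flank, the climber stops on a nearby ripple that is a small fraction of the height of the main peak, which is the local optimum problem in miniature. The slow-convergence problem needs at least two dimensions to show up, so this sketch doesn't attempt it.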

There are, fortunately, a few things we can do that might make the situation better (not what you'd call realistic things but we aren't exactly going for verisimilitude here). We could tilt the landscape a little or slightly bend or stretch or twist it, maybe add some ridges to some patches to give it that stylish corduroy look. (In other words, we could perturb the landscape.)

Hopefully, these changes shouldn't have much effect on the size and position of the major optima,* but they could have a big effect on the search behavior, changing the likelihood of ending up on a particular optimum and the average time to optimize. That's the reason we perturb landscapes; we're hoping for something that will give us a better optimum in a reasonable time. Of course, we have no way of knowing if our bending and twisting will make things better (it could just as easily make them worse), but if we do get good results from our search of the new landscape, we should get similar results from the corresponding point on the old landscape.
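Here is a rough Python sketch of the tilting idea, again on a made-up one-dimensional terrain. The terrain, the size of the tilt and the function names are all assumptions chosen by hand so that the tilt happens to help; a different tilt could just as easily hurt. The climber searches the tilted landscape, and we then read off the height of the corresponding point on the original.

```python
import math

def height(x):
    """Hypothetical original terrain: one broad peak near x = 0 plus small ripples."""
    return 10.0 * math.exp(-(x / 4.0) ** 2) + 0.15 * math.sin(3.0 * x)

def tilted(x, tilt=0.5):
    """Perturbed terrain: the same landscape tipped so ground to the left sits higher."""
    return height(x) - tilt * x

def climb(f, x, reach=0.1, max_steps=10_000):
    """Step to the highest point of f within reach until no neighbor is higher."""
    for _ in range(max_steps):
        best = max((x - reach, x, x + reach), key=f)
        if best == x:
            break
        x = best
    return x

start = 8.5
plain_top = climb(height, start)    # search the original landscape
tilted_top = climb(tilted, start)   # search the perturbed landscape

# Score both stopping points on the ORIGINAL landscape.
print(f"original search: x = {plain_top:.1f}, height {height(plain_top):.2f}")
print(f"tilted search:   x = {tilted_top:.1f}, height {height(tilted_top):.2f}")
```

In this hand-picked case the tilt is just steep enough to carry the climber over the ripples without moving the main peak very far, so the point found on the tilted landscape is also a good point on the original one.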

In the next post in the series, I'll try to make the jump from mountain climbing to planning randomized trials.

* I showed this post to an engineer who strongly suggested I add two caveats here. First, we are working under the assumption that the major optima are large relative to the changes produced by the perturbation. Second, our interest in each optimum is based on its size, not whether it is global. Going back to our original example, let's say that the largest peak on our original landscape was 1,005 feet tall and the second largest was 1,000 feet even but after perturbation their heights were reversed. If we were interested in finding the global max, this would be a big deal, but to us the difference between the two landscapes is trivial.

These assumptions will be easier to justify when we start applying these concepts in the next post in the series. For now, though, just be warned that these are big assumptions that can't be made that often.

Tuesday, January 25, 2011

They may be anecdotal...

...but as recent events drive the mental health debate, cases like this take on an added significance.

"Why 3D doesn't work and never will. Case closed."

According to Roger Ebert (who should know), Walter Murch is "the most respected film editor and sound designer in the modern cinema" and, according to Murch, 3-D movies are still a bad technology.

Here's his main objection (from an open letter to Ebert):

The biggest problem with 3D, though, is the "convergence/focus" issue. A couple of the other issues -- darkness and "smallness" -- are at least theoretically solvable. But the deeper problem is that the audience must focus their eyes at the plane of the screen -- say it is 80 feet away. This is constant no matter what.

But their eyes must converge at perhaps 10 feet away, then 60 feet, then 120 feet, and so on, depending on what the illusion is. So 3D films require us to focus at one distance and converge at another. And 600 million years of evolution* has never presented this problem before. All living things with eyes have always focused and converged at the same point.

If we look at the salt shaker on the table, close to us, we focus at six feet and our eyeballs converge (tilt in) at six feet. Imagine the base of a triangle between your eyes and the apex of the triangle resting on the thing you are looking at. But then look out the window and you focus at sixty feet and converge also at sixty feet. That imaginary triangle has now "opened up" so that your lines of sight are almost -- almost -- parallel to each other.

We can do this. 3D films would not work if we couldn't. But it is like tapping your head and rubbing your stomach at the same time, difficult. So the "CPU" of our perceptual brain has to work extra hard, which is why after 20 minutes or so many people get headaches. They are doing something that 600 million years of evolution never prepared them for. This is a deep problem, which no amount of technical tweaking can fix. Nothing will fix it short of producing true "holographic" images.
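Murch's imaginary triangle is easy to put numbers on. The Python sketch below computes the angle at the apex of that triangle for the distances he mentions. The eye separation of about 2.5 inches is my assumption (it is not in the letter), and the function name is hypothetical.

```python
import math

EYE_SEPARATION_FT = 0.21  # assumed: roughly 2.5 inches between pupils

def convergence_angle_deg(distance_ft, eye_separation_ft=EYE_SEPARATION_FT):
    """Apex angle of the triangle whose base joins the eyes and whose tip is the target."""
    return math.degrees(2.0 * math.atan((eye_separation_ft / 2.0) / distance_ft))

for d in (6, 10, 60, 80, 120):
    print(f"{d:>4} ft: {convergence_angle_deg(d):.3f} degrees")
```

The angle shrinks quickly with distance, which fits the description of lines of sight that are almost parallel when you look out the window.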

Murch also makes important points about the editing and aesthetics of 3-D cinema, none of which are likely to make you rush out and invest your money in the technology, but that's just what the film industry has been doing.

As far as I can tell, it's been over seventy years since a customer-facing innovation (Technicolor) has revolutionized the cinema industry (distinguished here from home entertainment where the story has been entirely different). There have been customer-facing innovations but they've failed to catch on (Cinerama, color-based 3-D, Sensurround -- Imax has managed to stick around, but with fewer than 500 theaters after about four decades, it hasn't really been a game changer).

The innovations that actually had a major impact on the industry have been primarily focused on making films quicker to make and easier to market (multiplexes, 'opening big,' digital production, post-production and projection, even CGI).

And yet studio executives continue to dream of the next Vitaphone.

*I'm not sure about the 600 million years -- how far back does stereoscopic vision go?

Monday, January 24, 2011

What's French for spam?

A few days ago, Seyward Darby posted a good piece of analysis on publishing teacher scores. Having been critical of Darby's previous writing on the subject, I entitled my response "Credit where credit is due." This morning I found the following comment complete with hyperlink:

Creadit - Vous cherchez crédit, credit d'impot, banque et bancaire [redacted] est une entreprise de courtage offrant les meilleures solutions pour pret relais et Credit en ligne au meilleur prix.

Or (according to Google Translate):

Cread - Looking for credit, tax credit, bank and banking [redacted] is a brokerage firm offering the best solutions for bridging loan and Credit online at the best price.

This should have been an easy catch for the spam filter. Was this just a random slip-up or does language make a difference?

I could not agree more

From Matt Yglesias:

Opioid addiction is bad, and it’s perfectly reasonable for policymakers to try to minimize its incidence. But short of dying, experiencing chronic pain is one of the worst things that can happen to someone. The correct ordering of priorities is to try to make sure that nobody suffering from treatable chronic pain goes untreated, and then try to minimize addiction risks within that framework. The view that people suffering pain should get relief subject to the binding constraint that we need to fight addiction has a nice Puritan logic to it, but it doesn’t make any real sense.

I have been enjoying Matt's comments on pain medication. The ability to relieve pain is one of the miracles of modern medicine and one that should not be squandered. There are always going to be limitations to any system but it is odd that we don't focus on those in need of pain relief first and abuse second.

After all, we don't ban driving just because some drivers are reckless or irresponsible.

Sunday, January 23, 2011

Health care and economies of scale

As mentioned before, I always like to be cautious when drawing conclusions from different countries, cultures and hemispheres. With that caveat out of the way, this Marketplace story about an Indian health insurance program is definitely interesting and possibly important as well.

Shetty and his team of 40 cardiac surgeons at Narayana Hrudayalaya Hospital are used to conversations like this one. They perform many more operations each year than comparable U.S. hospitals.
Shetty: This is a thousand-bed heart hospital. We do about 33 to 35 heart surgeries a day.

About a third of all of the patients at Shetty's hospital are farmers from rural villages. They're here because they have something called Yeshaswini insurance. It doesn't cover routine doctor's visits for, say, a cough or a cold, but the insurance does cover all surgical procedures. The farmer pays approximately three cents a month; the government puts in one and a half cents and farmers' cooperatives operate the program. Shetty believes there's strength in numbers.

For another story of medical developments coming from unlikely places, check out this story on battlefield medicine (this time from NPR).

Brewsterian astrophysics and Frazzian mathematics

Hey, it's a Sunday.

Saturday, January 22, 2011

Damned, pinko entrepreneurs

Whether in business, education, or any other field, what works there might not work here. With that caveat in mind, take a look at "In Norway, Start-ups Say Ja to Socialism" by Max Chafkin in Inc. magazine (or, if you're in a hurry, you can go to Felix Salmon's pithy analysis of the article and get much of the meat without the local color).

Here's the keystone of the piece:

Norway is also full of entrepreneurs like Wiggo Dalmo. Rates of start-up creation here are among the highest in the developed world, and Norway has more entrepreneurs per capita than the United States, according to the latest report by the Global Entrepreneurship Monitor, a Boston-based research consortium. A 2010 study released by the U.S. Small Business Administration reported a similar result: Although America remains near the top of the world in terms of entrepreneurial aspirations -- that is, the percentage of people who want to start new things -- in terms of actual start-up activity, our country has fallen behind not just Norway but also Canada, Denmark, and Switzerland.

I tend to be distrustful of international comparisons, but, as I've mentioned before, if you're going to do it, Canada is probably where to start. "Demographically, economically, culturally and historically, Canada would seem to be the obvious country to look to when trying to determine the effectiveness of potential U.S. policies..." Having our northern neighbor on the list makes me more inclined to give its obvious implications some weight and those implications conflict with a lot of what we've been told about business-friendly policies.

According to much of the conventional economic wisdom, we should be leaving all of these countries behind in terms of entrepreneurs and new businesses. Their progressive taxes and extensive social safety nets should leave people with little incentive to work hard while their restrictive regulations (in Norway "firing an employee for cause typically takes months, and employers generally end up paying at least three months’ severance.") should make running a nimble and efficient business virtually impossible, but what should happen clearly isn't.

It would be interesting to see Greg Mankiw's explanation for this.

Kind and literate OE readers

Can someone out there help me find the name of a story? It's from the late Nineteenth or early Twentieth Century. In it a banker tells of a colleague who was ruined because he became obsessed with chess. I won't spoil the payoff but it's very funny and incredibly timely.

Thanks in advance,
Mark

Note to Gelman -- first fill its mouth with salt, then light candles, then decapitate

Andrew Gelman is once again going after the voting-is-irrational zombie (disinterred this time by the Freakonomics team). Gelman shows, using estimates that if anything err on the conservative side, that the possibility of influencing an election, though small, can still easily be associated with a reasonable expected value.
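The arithmetic behind that kind of claim is just a probability times a payoff. Here is a bare-bones Python sketch. The inputs are placeholders I made up to show the shape of the calculation. They are not Gelman's estimates, and the function name is hypothetical.

```python
def expected_value_of_vote(p_decisive, payoff_if_decisive):
    """Expected value = chance the vote decides the outcome x value of that outcome."""
    return p_decisive * payoff_if_decisive

# Placeholder inputs, NOT taken from Gelman's post:
p_decisive = 1e-7            # assumed one-in-ten-million chance of being decisive
payoff_if_decisive = 1e10    # assumed total value placed on the preferred outcome

print(expected_value_of_vote(p_decisive, payoff_if_decisive))
```

The point of the sketch is only that a tiny probability does not by itself make the product tiny. Whether the product is "reasonable" depends entirely on the estimates you plug in.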

This particular zombie has been shambling through the dark corridors of pop econ books and columns for years now (Gelman himself has been monster hunting since at least 2005), but every time the creature seems truly dead and buried, along comes someone like Landsburg or Levitt, someone who's smart enough and mathematically literate enough to know better, but who just can't resist digging up the grave.

Friday, January 21, 2011

Weekend Gaming -- Agon

I don't know exactly what happened to Agon, but I'm pretty sure those damned orthogonalists had something to do with it. The game was all the rage in Victorian England where it was appreciated for its simple rules but surprisingly complex strategies. (The Victorians were also big fans of Lewis Carroll's Doublets, thus showing remarkably good taste in diversions.) The game was and is remarkably challenging and enjoyable but early in the Twentieth Century it faded away, perhaps due to the hegemony of orthogonal game boards.

You can find a complete set of rules on my game site. You can also buy boards there but obviously waiting for delivery would undercut the whole 'here's a game for this weekend' concept so I've included two JPEGs that you can print off if you can't find a suitable substitute (lots of games use a 6x6x6 hexboard so locating one shouldn't be that difficult).

The game is extraordinarily easy to learn. Each player starts out with six pawns and a queen spread out around the edge of the board.

A piece can either move around a concentric hexagon or go toward the center. The object is to get your pieces arranged like this (black wins):

'Capturing' is done in the style of many older games by placing two of your pieces on either side of the opponent's piece. I put the word in quotes because a captured piece is not removed from the board. Instead, it is moved back to the outer ring. Agon is therefore entirely a game of position. Novice chess players have a tendency to play for points and measure how well they're doing by how many of their opponent's pieces are lined up by the side of the board. Learning Agon can help break them of some bad habits.

I first came across Agon in David Parlett's Oxford History of Board Games -- an excellent resource if you're thinking about teaching a math class and not a bad read if you just enjoy games. Parlett is also a game designer of some note so he brings a lot of insight to the discussion.

Credit where credit is due

Seyward Darby has a thoughtful and balanced piece on education in today's New Republic. Here's the conclusion:

But, even when better evaluation methods exist, is it really a good idea to publish, en masse, the ratings of every public school teacher? I’m not convinced. Yes, the information should be available to those in the public who want it -- namely parents. But schools or school districts, not newspapers, should share it with parents in a constructive manner, so that they are able to ask questions and understand fully what the information means. Teachers’ unions and districts should also use it to remove underperforming instructors from their jobs, and to ensure that no school has a high concentration of ineffective teachers, such that its students are getting the short end of the stick. And teachers should use it either to ask for additional training resources -- or to gain recognition of the good, hard work that they’ve done.

I’m all for transparency. But a wide-open view of incomplete information isn’t what we need to improve education. What’s more, broadly publicizing even the most thorough of information isn’t always productive; complexities and nuances are often best conveyed in smaller settings, with the stakeholders who matter most.

The media shouldn’t focus on shaming individual teachers, because there are bigger fish to fry. Indeed, across the country, they should focus on shaming the entrenched bodies, structures, and policies that allow poor teaching to continue unchecked, fail to reward good teaching, and don’t provide enough support for teachers who want to improve their skills.

I don't agree with everything Darby says here and there are omissions that bother me (relying on Kyle Spencer's reporting probably didn't help). On the whole, though, this is an entirely reasonable piece of analysis. I complained earlier that there wasn't really an education debate in this country. Perhaps we're about to start one.

Alignment of Incentives

Professor in Training has a post on selfishness in academia:

I’m at the point where I’m jumping up and down in frustration because I can’t get what I want when I want it (which is always yesterday or last week or last year). My resources are limited and dwindling at an alarming rate. My students are swamped with classes and assistantship responsibilities. And yet I’m expected to push out papers. I’m expecting them to push out the papers. And data. Let’s not forget the new data.

I think that the incentives are more aligned here than it appears at first glance. Students will benefit from having a successful professor as a mentor. They benefit from being involved in a lot of successful research projects.

I suspect the core problem is that academia is set up to make a lot of things urgent (the class you need to teach today, the faculty meeting, the memo you need to write) that are not especially important. In the long run, success in a biomedical tenure track role requires research output. Other things matter, but this one is key. Students are just as susceptible as professors to getting caught up in things that have the potential to consume all of one's time. Participating in student governance, for example, is rarely worth the time that it takes from the student. As a result, everybody feels overwhelmed and tempers fray.

I think the secret here is to realize which things will matter in the long run and which need to be done adequately but are not worth a lot of time.