Sunday, March 21, 2010

Interesting variable taxation idea from Thoma

From Economist's View:
Political battles make it very difficult to use discretionary fiscal policy to fight a recession, so more automatic stabilizers are needed. Along those lines, if something like this were to be implemented to stabilize the economy over the business cycle, I'd prefer to do this more generally, i.e. allow income taxes, payroll taxes, etc. to vary procyclically. That is, these taxes would be lower in bad times and higher when things improve, and implemented through an automatic moving average type of rule that produces the same revenue as some target constant tax rate (e.g. existing rates).

Saturday, March 20, 2010

New Proposed National Math Standards

These actually look pretty good.

Friday, March 19, 2010

Too late for an actual post, but...

There are another couple of entries in the TNR education debate. If you're an early riser you can read them before I do.

Thursday, March 18, 2010

Some more thoughts on p-value

One of the advantages of being a corporate statistician was that generally you not only ran the test; you also explained the statistics. I could tell the department head or VP that a p-value of 0.08 wasn't bad for a preliminary study with a small sample, or that a p-value of 0.04 wasn't that impressive with a controlled study of a thousand customers. I could factor in things like implementation costs and potential returns when looking at type-I and type-II errors. For low implementation/high returns, I might set significance at 0.1. If the situation were reversed, I might set it at 0.01.
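To make that trade-off concrete, here is a minimal sketch of picking a significance level by expected cost. Everything in it is an assumption for illustration rather than anything I actually ran at work: a one-sided z-test, a standardized effect size `effect`, a 50/50 prior on the null, and the hypothetical helpers `expected_loss` and `pick_alpha`.

```python
from statistics import NormalDist


def expected_loss(alpha, effect, n, cost_fp, cost_fn, prior_null=0.5):
    """Expected cost of a one-sided z-test run at significance level alpha.

    effect     : assumed standardized effect size if the alternative is true
    n          : sample size
    cost_fp    : cost of acting on a false positive (type-I error)
    cost_fn    : cost of missing a real effect (type-II error)
    prior_null : assumed prior probability that the null is true
    """
    z = NormalDist()
    critical = z.inv_cdf(1 - alpha)
    # Power of the one-sided z-test under the assumed effect size.
    power = 1 - z.cdf(critical - effect * n ** 0.5)
    type_1 = prior_null * alpha * cost_fp
    type_2 = (1 - prior_null) * (1 - power) * cost_fn
    return type_1 + type_2


def pick_alpha(effect, n, cost_fp, cost_fn, candidates=(0.10, 0.05, 0.01)):
    """Return the candidate level with the lowest expected loss."""
    return min(candidates,
               key=lambda a: expected_loss(a, effect, n, cost_fp, cost_fn))


# Cheap to implement, high potential return: a miss costs 10x a false alarm.
print(pick_alpha(effect=0.2, n=100, cost_fp=1, cost_fn=10))
# Situation reversed: a false alarm costs 10x a miss.
print(pick_alpha(effect=0.2, n=100, cost_fp=10, cost_fn=1))
```

Under these made-up numbers the first case picks 0.1 and the second picks 0.01, which matches the intuition above. Change the costs, the prior, or the sample size and the answer moves, which is the point.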

Obviously, we can't let everyone set their own rules, but (to coin a phrase) I wonder if in an effort to make things as simple as possible, we haven't actually made them simpler. Statistical significance is an arbitrary, context-sensitive cut-off that we assign before a test based on the relative costs of a false positive and a false negative. It is not a God-given value of 5%.
Letting everyone pick their own definition of significance is a bad idea, but so is completely ignoring context. Does it make any sense to demand the same level of p-value from a study of a rare, slow-growing cancer (where five years is quick and a sample size of 20 is an achievement) and a study of a drug to reduce BP in the moderately obese (where a course of treatment lasts two weeks and the streets are filled with potential test subjects)? Should we ignore a promising preliminary study because it comes in at 0.06?

For a real-life example, consider the public reaction to the recent statement that we didn't have statistically significant data that the earth had warmed over the past 15 years. This was a small sample and I'm under the impression that the results would have been significant at the 0.1 level, but these points were lost (or discarded) in most of the coverage.

We need to do a better job dealing with these grays. We might try replacing the phrase "statistically significant" with "statistically significant at 10/5/1/0.1%." Or we might look at some sort of a two-tiered system, raising significance to 0.01 for most studies while making room for "provisionally significant" papers where research is badly needed, adequate samples are not available, or the costs of a type-II error are deemed unusually high.
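The first suggestion is easy to sketch. The function below is a hypothetical helper (the name `significance_label` and the strict `<` comparison are my assumptions) that reports the strictest of the 10/5/1/0.1% levels a p-value clears instead of a bare yes or no.

```python
def significance_label(p_value, levels=(0.001, 0.01, 0.05, 0.10)):
    """Describe a p-value by the strictest conventional level it clears."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p_value must be between 0 and 1")
    # Check the strictest (smallest) level first.
    for level in sorted(levels):
        if p_value < level:
            return f"statistically significant at {level * 100:g}%"
    return f"not statistically significant at {max(levels) * 100:g}%"


print(significance_label(0.06))   # the promising preliminary study
print(significance_label(0.04))
```

With this wording the 0.06 study reads as "statistically significant at 10%" rather than simply "not significant", so the reader at least sees where on the scale it fell.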

I'm not sure how practical or effective these steps might be but I am sure we can do better. Statisticians know how to deal with gray areas; now we need to work on how we explain them.

For more on the subject, check out Joseph's posts here and here.

The winner's curse

I have heard about the article that Mark references in a previous post; it's hard to be in the epidemiology field and not have heard about it. But, for this post, I want to focus on a single aspect of the problem.

Let's say that you have a rare side effect that requires a large database to find and, even then, the power is limited. Let's say, for an illustration, that the true effect of a drug on an outcome is an Odds Ratio (or Relative Risk, it's a rare disease) of 1.50. If, by chance alone, the estimate in database A is 1.45 (95% Confidence interval: 0.99 to 1.98) and the estimate in database B is 1.55 (95% CI: 1.03 to 2.08), then what would be the result of two studies on this side effect?

Well, if database A is done first then maybe nobody ever looks at database B (these databases are often expensive to use and time consuming to analyze). If database B is used first, the second estimate will be from database A (and thus lower). In fact, there is some chance that the researchers from database A will never publish (as it has been historically the case that null results are hard to publish).

The result? Estimates of association between the drug and the outcome will tend to be biased upwards -- because the initial finding (due to the nature of null results being hard to publish) will tend to be an over-estimate of the true causal effect.
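A quick simulation shows the size of this effect. This is only a sketch under assumptions I am making up for illustration: estimates are normally distributed on the log odds ratio scale around the true value of 1.50, the standard error on that scale (`se_log`) is 0.18, which is roughly what the confidence intervals above imply, and a study is "published" only if its 95% CI excludes an odds ratio of 1.

```python
import math
import random
import statistics


def simulate_published_estimates(true_or=1.5, se_log=0.18,
                                 n_studies=20000, seed=0):
    """Compare the average of all estimates with the average of the
    estimates that would count as a significant (publishable) finding."""
    rng = random.Random(seed)
    true_log = math.log(true_or)
    all_logs = []
    published_logs = []
    for _ in range(n_studies):
        est_log = rng.gauss(true_log, se_log)
        all_logs.append(est_log)
        lower_bound = est_log - 1.96 * se_log
        if lower_bound > 0:  # 95% CI excludes an odds ratio of 1
            published_logs.append(est_log)
    # Average on the log scale, then convert back to odds ratios.
    mean_all = math.exp(statistics.fmean(all_logs))
    mean_published = math.exp(statistics.fmean(published_logs))
    share_published = len(published_logs) / n_studies
    return mean_all, mean_published, share_published


mean_all, mean_published, share = simulate_published_estimates()
print(f"all studies:       OR = {mean_all:.2f}")
print(f"published studies: OR = {mean_published:.2f}")
print(f"share published:   {share:.0%}")
```

The average over every simulated study lands close to the true 1.50, while the average over the "published" subset comes out clearly higher. Nothing is wrong with any individual study; the filter alone produces the upward bias.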

These factors make it hard to determine if a meta-analysis of observational evidence would give an asymptotically unbiased estimate of the "truth" (likely it would be biased upwards).

In that sense, on average, published results are biased to some extent.

A lot to discuss

When you get past the inflammatory opening, this article in Science News is something you should take a look at (via Felix Salmon).
“There is increasing concern,” declared epidemiologist John Ioannidis in a highly cited 2005 paper in PLoS Medicine, “that in modern research, false findings may be the majority or even the vast majority of published research claims.”

Ioannidis claimed to prove that more than half of published findings are false, but his analysis came under fire for statistical shortcomings of its own. “It may be true, but he didn’t prove it,” says biostatistician Steven Goodman of the Johns Hopkins University School of Public Health. On the other hand, says Goodman, the basic message stands. “There are more false claims made in the medical literature than anybody appreciates,” he says. “There’s no question about that.”

Nobody contends that all of science is wrong, or that it hasn’t compiled an impressive array of truths about the natural world. Still, any single scientific study alone is quite likely to be incorrect, thanks largely to the fact that the standard statistical system for drawing conclusions is, in essence, illogical. “A lot of scientists don’t understand statistics,” says Goodman. “And they don’t understand statistics because the statistics don’t make sense.”

Wednesday, March 17, 2010

Evidence

I was reading Andrew Gelman (always a source of interesting statistical thoughts) and I started thinking about p-values in epidemiology.

Is there a measure in all of medical research more controversial than the p-value? Sometimes I really don't think so. In a lot of ways, it seems to dominate research just because it has become an informal standard. But it felt odd, the one time I did it, to say in a paper that there was no association (p=.0508) when adding a few more cases might have flipped the answer.

I don't think confidence intervals, used in the sense of "does this interval include the null", really advance the issue either. But it's true that we do want a simple way to decide if we should be concerned about a possible adverse association, and the medical literature is not well constructed for a complex back-and-forth discussion about statistical models.

I'm also not convinced that any "standard of evidence" would not be similarly misapplied. Any approach that is primarily used by trained statisticians (sensitive to its limitations) will look good compared with a broad standard that is also applied by non-specialists.

So I guess I don't see an easy way to replace our reliance on p-values in the medical literature, but it is worth some thought.

"We could call them 'universities'"

This bit from Kevin Carey's entry in the New Republic debate caught my eye:

In the end, [Diane Ravitch's] Death and Life is painfully short on non-curricular ideas that might actually improve education for those who need it most. The last few pages contain nothing but generalities: ... "Teachers must be well educated and know their subjects." That's all on page 238. The complete lack of engagement with how to do these things is striking.

If only there were a system of institutions where teachers could go for instruction in their fields. If there were such a system then Dr. Ravitch could say "Teachers must be well educated and know their subjects" and all reasonable people would assume that she meant we should require teachers to take more advanced courses and provide additional compensation for those who exceeded those requirements.

Tuesday, March 16, 2010

Some context on schools and the magic of the markets

One reason emotions run so hot in the current debate is that the always heated controversies of education have somehow become intertwined with sensitive points of economic philosophy. The discussion over child welfare and opportunity has been rewritten as an epic struggle between big government and unions on one hand and markets and entrepreneurs on the other. (insert Lord of the Rings reference here)

When Ben Wildavsky said "Perhaps most striking to me as I read Death and Life was Ravitch’s odd aversion to, even contempt for, market economics and business as they relate to education" he wasn't wasting his time on a minor aspect of the book; he was focusing on the fundamental principle of the debate.

The success or even the applicability of business metrics and mission statements in education is a topic for another post, but the subject does remind me of a presentation the head of the education department gave when I was getting my certification in the late Eighties. He showed us a video of Tom Peters discussing In Search of Excellence, then spent about an hour extolling Peters' ideas.

(on a related note, I don't recall any of my education classes mentioning George Polya)

I can't say exactly when but by 1987 business-based approaches were the big thing in education and had been for quite a while, a movement that led to the introduction of charter schools at the end of the decade. And the movement has continued to this day.

In other words, American schools have been trying a free market/business school approach for between twenty-five and thirty years.

I'm not going to say anything here about the success or failure of those efforts, but it is worth putting in context.

Monday, March 15, 2010

And for today, at least, you are not the world's biggest math nerd

From Greg Mankiw:
Fun fact of the day: MIT releases its undergraduate admission decisions at 1:59 pm today. (That is, at 3.14159).

Who is this Thomas Jefferson you keep talking about?

I've got some posts coming up on the role curriculum plays in educational reform. In the meantime, check out what's happening in Texas* with the state board of education. Since the Lone Star state is such a big market they have a history of setting textbook content for the nation.

Here's the change that really caught my eye:
Thomas Jefferson no longer included among writers influencing the nation’s intellectual origins. Jefferson, a deist who helped pioneer the legal theory of the separation of church and state, is not a model founder in the board’s judgment. Among the intellectual forerunners to be highlighted in Jefferson’s place: medieval Catholic philosopher St. Thomas Aquinas, Puritan theologian John Calvin and conservative British law scholar William Blackstone. Heavy emphasis is also to be placed on the founding fathers having been guided by strict Christian beliefs.
* I'm a Texan by birth. I'm allowed to mock.

Observational Research

An interesting critique of observational data by John Cook. I think that the author raises an interesting point but that it is more true of cross-sectional studies than longitudinal ones. If you have a baseline modifiable factor and look at the predictors of change then you have a pretty useful measure of consequence. It might be confounded or it might have issues with indication bias, but it's still a pretty interesting prediction.

With cross sectional studies, on the other hand, reverse causality is always a concern.

Of course, the other trick is that the risk factor really has to be modifiable. Drugs (my own favorite example) often are. But even diet and exercise get tricky to modify when you look at them closely (as they are linked to other characteristics of the individual and are a very drastic change in lifestyle patterns).

It's a hard area and this is why we use experiments as our gold standard!

"The Obesity-Hunger Paradox"

Interesting article from the New York Times:

WHEN most people think of hunger in America, the images that leap to mind are of ragged toddlers in Appalachia or rail-thin children in dingy apartments reaching for empty bottles of milk.

Once, maybe.

But a recent survey found that the most severe hunger-related problems in the nation are in the South Bronx, long one of the country’s capitals of obesity. Experts say these are not parallel problems persisting in side-by-side neighborhoods, but plagues often seen in the same households, even the same person: the hungriest people in America today, statistically speaking, may well be not sickly skinny, but excessively fat.

Call it the Bronx Paradox.

“Hunger and obesity are often flip sides to the same malnutrition coin,” said Joel Berg, executive director of the New York City Coalition Against Hunger. “Hunger is certainly almost an exclusive symptom of poverty. And extra obesity is one of the symptoms of poverty.”

The Bronx has the city’s highest rate of obesity, with residents facing an estimated 85 percent higher risk of being obese than people in Manhattan, according to Andrew G. Rundle, an epidemiologist at the Mailman School of Public Health at Columbia University.

But the Bronx also faces stubborn hunger problems. According to a survey released in January by the Food Research and Action Center, an antihunger group, nearly 37 percent of residents in the 16th Congressional District, which encompasses the South Bronx, said they lacked money to buy food at some point in the past 12 months. That is more than any other Congressional district in the country and twice the national average, 18.5 percent, in the fourth quarter of 2009.

Such studies present a different way to look at hunger: not starving, but “food insecure,” as the researchers call it (the Department of Agriculture in 2006 stopped using the word “hunger” in its reports). This might mean simply being unable to afford the basics, unable to get to the grocery or unable to find fresh produce among the pizza shops, doughnut stores and fried-everything restaurants of East Fordham Road.

"The economics profession is in crisis"

This may sound strange but all this soul searching by economists like Mark Thoma makes me think that the field might be on the verge of extensive reassessment and major advances.

From the Economist's View:
The fact that the evidence always seems to confirm ideological biases doesn't give much confidence. Even among the economists that I trust to be as fair as they can be -- who simply want the truth whatever it might be (which is most of them) -- there doesn't seem to be anything resembling convergence on this issue. In my most pessimistic moments, I wonder if we will ever make progress, particularly since there seems to be a tendency for the explanation given by those who are most powerful in the profession to stick just because they said it. So long as there is some supporting evidence for their positions, evidence pointing in other directions doesn't seem to matter.

The economics profession is in crisis, more so than the leaders in the profession seem to understand (since change might upset their powerful positions, positions that allow them to control the academic discourse by, say, promoting one area of research or class of models over another, they have little incentive to see this). If, as a profession, we can't come to an evidence based consensus on what caused the single most important economic event in recent memory, then what do we have to offer beyond useless "on the one, on the many other hands" explanations that allow people to pick and choose according to their ideological leanings? We need to do better.

(forgot to block-quote this. sorry about the error)

TNR on the education debate

The New Republic is starting a series on education reform. Given the extraordinary quality of commentary we've been seeing from TNR, this is definitely a good development.

Here are the first three entries:

By Diane Ravitch: The country's love affair with standardized testing and charter schools is ruining American education.

By Ben Wildavsky: Why Diane Ravitch's populist rage against business-minded school reform doesn't make sense.

By Richard Rothstein: Ravitch’s recent ‘conversion’ is actually a return to her core values.