Comments, observations and thoughts from two bloggers on applied statistics, higher education and epidemiology. Joseph is an associate professor. Mark is a professional statistician and former math teacher.
Thursday, March 18, 2010
The winner's curse
Let's say that you have a rare side effect that requires a large database to find and, even then, the power is limited. Let's say, for an illustration, that the true effect of a drug on an outcome is an Odds Ratio (or Relative Risk, it's a rare disease) of 1.50. If, by chance alone, the estimate in database A is 1.45 (95% Confidence interval: 0.99 to 1.98) and the estimate in database B is 1.55 (95% CI: 1.03 to 2.08) the what would be the result of two studies on this side effect?
Well, if database A is done first then maybe nobody ever looks at database B (these databases are often expensive to use and time consuming to analyze). If database B is used first, the second estimate will be from database A (and thus lower). In fact, there is some chance that the researchers from database A will never publish (as it has been historically the case that null results are hard to publish).
The result? Estimates of association between the drug and the outcome will tend to be biased upwards -- because the initial finding (due to the nature of null results being hard to publish) will tend to be an over-estimate of the true causal effect.
These factors make it hard to determine if a meta-analysis of observational evidence would give an asymptotically unbiased estimate of the "truth" (likely it would be biased upwards).
In that sense, on average, published results are biased to some extent.
A lot to discuss
“There is increasing concern,” declared epidemiologist John Ioannidis in a highly cited 2005 paper in PLoS Medicine, “that in modern research, false findings may be the majority or even the vast majority of published research claims.”Ioannidis claimed to prove that more than half of published findings are false, but his analysis came under fire for statistical shortcomings of its own. “It may be true, but he didn’t prove it,” says biostatistician Steven Goodman of the Johns Hopkins University School of Public Health. On the other hand, says Goodman, the basic message stands. “There are more false claims made in the medical literature than anybody appreciates,” he says. “There’s no question about that.”
Nobody contends that all of science is wrong, or that it hasn’t compiled an impressive array of truths about the natural world. Still, any single scientific study alone is quite likely to be incorrect, thanks largely to the fact that the standard statistical system for drawing conclusions is, in essence, illogical. “A lot of scientists don’t understand statistics,” says Goodman. “And they don’t understand statistics because the statistics don’t make sense.”
Wednesday, March 17, 2010
Evidence
Is there a measure in all of medical research more controversial than the p-value? Sometimes I really don't think so. In a lot of ways, it seems to dominate research just because it has become an informal standard. But it felt odd, the one time I did it, to say in a paper that there was no association (p=.0508) when adding a few more cases might have flipped the answer.
I don't think confidence intervals, used in the sense of "does this interval include the null", really advance the issue either. But it's true that we do want a simple way to decide if we should be concerned about a possible adverse association and the medical literature is not well constructed for a complex back and through discussion about statistical models.
I'm also not convinced that any "standard of evidence" would not be similarly misapplied. Any approach that is primarily used by trained statisticians (sensitive to it's limitations) will look good compared with a broad standard that is also applied by non-specialists.
So I guess I don't see an easy way to replace our reliance on p-values in the medical literature, but it is worth some thought.
"We could call them 'universities'"
In the end, [Diane Ravitch's] Death and Life is painfully short on non-curricular ideas that might actually improve education for those who need it most. The last few pages contain nothing but generalities: ... "Teachers must be well educated and know their subjects." That's all on page 238. The complete lack of engagement with how to do these things is striking.
If only there were a system of institutions where teachers could go for instruction in their fields. If there were such a system then Dr. Ravitch could say "Teachers must be well educated and know their subjects" and all reasonable people would assume that she meant we should require teachers to take more advanced courses and provide additional compensation for those who exceeded those requirements.
Tuesday, March 16, 2010
Some context on schools and the magic of the markets
When Ben Wildavsky said "Perhaps most striking to me as I read Death and Life was Ravitch’s odd aversion to, even contempt for, market economics and business as they relate to education" he wasn't wasting his time on a minor aspect of the book; he was focusing on the fundamental principle of the debate.
The success or even the applicability of business metrics and mission statements in education is a topic for another post, but the subject does remind me of a presentation the head of the education department gave when I was getting my certification in the late Eighties. He showed us a video of Tom Peter's discussing In Search of Excellence then spent about an hour extolling Peters ideas.
(on a related note, I don't recall any of my education classes mentioning George Polya)
I can't say exactly when but by 1987 business-based approaches were the big thing in education and had been for quite a while, a movement that led to the introduction of charter schools at the end of the decade. And the movement has continued to this day.
In other words, American schools have been trying a free market/business school approach for between twenty-five and thirty years.
I'm not going to say anything here about the success or failure of those efforts, but it is worth putting in context.
Monday, March 15, 2010
And for today, at least, you are not the world's biggest math nerd
Fun fact of the day: MIT releases its undergraduate admission decisions at 1:59 pm today. (That is, at 3.14159).
Who is this Thomas Jefferson you keep talking about?
Here's the change that really caught my eye:
Thomas Jefferson no longer included among writers influencing the nation’s intellectual origins. Jefferson, a deist who helped pioneer the legal theory of the separation of church and state, is not a model founder in the board’s judgment. Among the intellectual forerunners to be highlighted in Jefferson’s place: medieval Catholic philosopher St. Thomas Aquinas, Puritan theologian John Calvin and conservative British law scholar William Blackstone. Heavy emphasis is also to be placed on the founding fathers having been guided by strict Christian beliefs.* I'm a Texan by birth. I'm allowed to mock.
Observational Research
With cross sectional studies, on the other hand, reverse causality is always a concern.
Of course, the other trick is that the risk factor really has to be modifiable. Drugs (my own favorite example) often are. But even diet and exercise get tricky to modify when you look at them closely (as they are linked to other characteristics of the individual and are a very drastic change in lifestyle patterns).
It's a hard area and this is is why we use experiments as our gold standard!
"The Obesity-Hunger Paradox"
WHEN most people think of hunger in America, the images that leap to mind are of ragged toddlers in Appalachia or rail-thin children in dingy apartments reaching for empty bottles of milk.
Once, maybe.
But a recent survey found that the most severe hunger-related problems in the nation are in the South Bronx, long one of the country’s capitals of obesity. Experts say these are not parallel problems persisting in side-by-side neighborhoods, but plagues often seen in the same households, even the same person: the hungriest people in America today, statistically speaking, may well be not sickly skinny, but excessively fat.
Call it the Bronx Paradox.
“Hunger and obesity are often flip sides to the same malnutrition coin,” said Joel Berg, executive director of the New York City Coalition Against Hunger. “Hunger is certainly almost an exclusive symptom of poverty. And extra obesity is one of the symptoms of poverty.”
The Bronx has the city’s highest rate of obesity, with residents facing an estimated 85 percent higher risk of being obese than people in Manhattan, according to Andrew G. Rundle, an epidemiologist at the Mailman School of Public Health at Columbia University.
But the Bronx also faces stubborn hunger problems. According to a survey released in January by the Food Research and Action Center, an antihunger group, nearly 37 percent of residents in the 16th Congressional District, which encompasses the South Bronx, said they lacked money to buy food at some point in the past 12 months. That is more than any other Congressional district in the country and twice the national average, 18.5 percent, in the fourth quarter of 2009.
Such studies present a different way to look at hunger: not starving, but “food insecure,” as the researchers call it (the Department of Agriculture in 2006 stopped using the word “hunger” in its reports). This might mean simply being unable to afford the basics, unable to get to the grocery or unable to find fresh produce among the pizza shops, doughnut stores and fried-everything restaurants of East Fordham Road.
"The economics profession is in crisis"
From the Economist's View:
The fact that the evidence always seems to confirm ideological biases doesn't give much confidence. Even among the economists that I trust to be as fair as they can be -- who simply want the truth whatever it might be (which is most of them) -- there doesn't seem to be anything resembling convergence on this issue. In my most pessimistic moments, I wonder if we will ever make progress, particularly since there seems to be a tendency for the explanation given by those who are most powerful in the profession to stick just because they said it. So long as there is some supporting evidence for their positions, evidence pointing in other directions doesn't seem to matter.The economics profession is in crisis, more so than the leaders in the profession seem to understand (since change might upset their powerful positions, positions that allow them to control the academic discourse by, say, promoting one area of research or class of models over another, they have little incentive to see this). If, as a profession, we can't come to an evidence based consensus on what caused the single most important economic event in recent memory, then what do we have to offer beyond useless "on the one, on the many other hands" explanations that allow people to pick and choose according to their ideological leanings? We need to do better.
(forgot to block-quote this. sorry about the error)
TNR on the education debate
Here are the first three entries:
By Diane Ravitch: The country's love affair with standardized testing and charter schools is ruining American education.
By Ben Wildavsky: Why Diane Ravitch's populist rage against business-minded school reform doesn't make sense.
By Richard Rothstein: Ravitch’s recent ‘conversion’ is actually a return to her core values.
Sunday, March 14, 2010
Harlem Children's Zero Sum Game
On a completely unrelated note, here's an interesting article from the New York Times:
Pressed by Charters, Public Schools Try Marketing
By JENNIFER MEDINA
Rafaela Espinal held her first poolside chat last summer, offering cheese, crackers and apple cider to draw people to hear her pitch.
She keeps a handful of brochures in her purse, and also gives a few to her daughter before she leaves for school each morning. She painted signs on the windows of her Chrysler minivan, turning it into a mobile advertisement.
It is all an effort to build awareness for her product, which is not new, but is in need of an image makeover: a public school in Harlem.
As charter schools have grown around the country, both in number and in popularity, public school principals like Ms. Espinal are being forced to compete for bodies or risk having their schools closed. So among their many challenges, some of these principals, who had never given much thought to attracting students, have been spending considerable time toiling over ways to market their schools. They are revamping school logos, encouraging students and teachers to wear T-shirts emblazoned with the new designs. They emphasize their after-school programs as an alternative to the extended days at many charter schools. A few have worked with professional marketing firms to create sophisticated Web sites and blogs.
...
For most schools, the marketing amounts to less than $500, raised by parents and teachers to print up full color postcards or brochures. Typically, principals rely on staff members with a creative bent to draw up whatever they can.
Student recruitment has always been necessary for charter schools, which are privately run but receive public money based on their enrollment, supplemented by whatever private donations they can corral.
The Harlem Success Academy network, run by the former City Council member Eva Moskowitz, is widely regarded, with admiration by some and scorn by others, as having the biggest marketing effort. Their bright orange advertisements pepper the bus stops in the neighborhood, and prospective parents receive full color mailings almost monthly.
Ms. Moskowitz said the extensive outreach was necessary to make sure they were drawing from a broad spectrum of parents. Ms. Moskowitz said they spent roughly $90 per applicant for recruitment. With about 3,600 applicants last year for the four schools in the network, she said, the total amounted to $325,000.
Saturday, March 13, 2010
Social norms and happy employees
About 10 years ago I was having my annual holiday party, and my niece had come with her newly minted M.B.A. boyfriend. As he looked around the room, he noted that my employees seemed happy. I told him that I thought they were.
Then, figuring I would take his new degree for a test drive, I asked him how he thought I did that. “I’m sure you treat them well,” he replied.
“That’s half of it,” I said. “Do you know what the other half is?”
He didn’t have the answer, and neither have the many other people that I have told this story. So what is the answer? I fired the unhappy people. People usually laugh at this point. I wish I were kidding.
In my experience, it is generally unhappy employees who say things like "But what happens to our business model if home prices go down?" or "Doesn't that look kinda like an iceberg?" Putting that aside, though, this is another example of the principle discussed in the last post -- it's easy to get the norms you want if you can decide who goes in the group.
Charter schools, social norming and zero-sum games
Maybe I can provide some reasons.
First off, this is an observational study, not a randomized experiment. I think we may be reaching the limits of what analysis of observational data can do in the education debate and, given the importance and complexity of the questions, I don't understand why we aren't employing randomized trials to answer some of these questions once and for all.
More significantly I'm also troubled by the aliasing of data on the Promise Academies and by the fact that the authors draw a conclusion ("HCZ is enormously successful at boosting achievement in math and ELA in elementary school and math in middle school. The impact of being offered admission into the HCZ middle school on ELA achievement is positive, but less dramatic. High-quality schools or community investments coupled with high-quality schools drive these results, but community investments alone cannot.") that the data can't support.
In statistics, aliasing means combining treatments in such a way that you can't tell which treatment or combination of treatments caused the effect you observed. In this case the first treatment is the educational environment of the Promise Academies. The second is something called social norming.
When you isolate a group of students, they will quickly arrive at a consensus of what constitutes normal behavior. It is a complex and somewhat unpredictable process driven by personalities and random connections and any number of outside factors. You can however, exercise a great deal of control over the outcome by restricting the make-up of the group.
If we restricted students via an application process, how would we expect that group to differ from the general population and how would that affect the norms the group would settle on? For starters, all the parents would have taken a direct interest in their children's schooling.
Compared to the general population, the applicants will be much more likely to see working hard, making good grades, not getting into trouble as normal behaviors. The applicants (particularly older applicants) would be more likely to be interested in school and to see academic and professional success as a reasonable possibility because they would have made an active choice to move to a new and more demanding school. Having the older students committed to the program is particularly important because older children play a disproportionate role in the setting of social norms.
Dobbie and Fryer address the question of self-selection, "[R]esults from any lottery sample may lack external validity. The counterfactual we identify is for students who are already interested in charter schools. The effect of being offered admission to HCZ for these students may be different than for other types of students." In other words, they can't conclude from the data how well students would do at the Promise Academies if, for instance, their parents weren't engaged and supportive (a group effective eliminated by the application process).
But there's another question, one with tremendous policy implications, that the paper does not address: how well would the students who were accepted to HCZ have done if they were given the same amount of instruction * as they would have received from HCZ using public school teachers while being isolated from the general population? (There was a control group of lottery losers but there is no evidence that they were kept together as a group.)
Why is this question so important? Because we are thinking about spending an enormous amount of time, effort and money on a major overhaul of the education system when we don't have the data to tell us if what we'll spend will wasted or, worse yet, if we are to some extent playing a zero sum game.
Social norming can work both ways. If you remove all of the students whose parents are willing and able to go through the application process, the norms of acceptable behavior for those left behind will move in an ugly direction and the kids who started out with the greatest disadvantages would be left to bear the burden.
But we can answer these questions and make decisions based on solid, statistically sound data. Educational reform is not like climate change where observational data is our only reasonable option. Randomized trials are an option in most cases; they are not that difficult or expensive.
Until we get good data, how can we expect to make good decisions?
* Correction: There should have been a link here to this post by Andrew Gelman.
Friday, March 12, 2010
Instrumental variables
This is a important dilemma. Alan Brookhart, who introduced them into phamracoepidemiology in 2006, has done an amazing job of proving out one example. But you can't generalize from one example and the general idea of using physician preference as an instrument, while really cool, suffers from these assumptions.
Unlike unmeasured confounders, it's hard to know how to test this. With unmeasured confounders you can ask critics to specify what they suspect might be the key confounding factors and go forth and measure them. But instruments are used precisely when there is a lack of data.
I've done some work in the area with some amazing colleagues and I still think that the idea has some real promise. It's a novel idea that really came out of left field and has enormous potential. But I want to understand it in far more actual cases before I conclude much more . . .