Saturday, September 10, 2011

Type M Bias

From the pages of Andrew Gelman:

And classical multiple comparisons procedures—which select at an even higher threshold—make the type M problem worse still (even if these corrections solve other problems). This is one of the troubles with using multiple comparisons to attempt to adjust for spurious correlations in neuroscience. Whatever happens to exceed the threshold is almost certainly an overestimate.

I had never heard of Type M bias before I started following Andrew Gelman's blog. But now I think about it a lot when I do epidemiological studies. I have begun to think we need a two-stage model: one study to establish an association, followed by a replication study to estimate the effect size. I do know that novel associations I find often end up diluted on replication (not that I have that large an N to work with).
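The "significance filter" behind Type M bias is easy to see in a quick simulation: if an underpowered study only gets reported when its estimate crosses a significance threshold, the surviving estimates systematically overstate the true effect. The numbers below (true effect, standard error) are arbitrary assumptions for illustration, not from any real study.

```python
import random
import statistics

random.seed(42)

TRUE_EFFECT = 0.2   # assumed small true effect
SE = 0.5            # assumed standard error of each study's estimate
Z_CRIT = 1.96       # two-sided 5% significance threshold

# Simulate many underpowered studies, each producing a noisy estimate
# of the same true effect.
estimates = [random.gauss(TRUE_EFFECT, SE) for _ in range(100_000)]

# Keep only the "publishable" results: those exceeding the threshold.
significant = [e for e in estimates if abs(e) / SE > Z_CRIT]

mean_sig = statistics.mean(abs(e) for e in significant)
print(f"True effect:                          {TRUE_EFFECT}")
print(f"Mean |estimate| among significant:    {mean_sig:.2f}")
print(f"Exaggeration ratio (Type M error):    {mean_sig / TRUE_EFFECT:.1f}x")
```

With these settings the significant estimates come out several times larger in magnitude than the true effect, which is exactly why a replication study, free of the significance filter, gives a better effect-size estimate.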

The bigger question is whether the replication study should be bundled with the original effect estimate or whether it makes more sense for a different group to examine the question in a separate paper. I prefer the latter, as it crowd-sources science. But it would help if the original paper did not so often appear in a far more prestigious journal than the replication study, since the replication is the one you would prefer to be the default source for effect-size estimation (and thus should be the easier and higher-prestige one to find).


  1. There seems to be little or no incentive for true replication: the original researchers want to find a large effect, and those conducting the replication want a different finding; otherwise, journals don't seem interested.

    The situation would improve if journals published replications on the basis of how well the original procedures were replicated, regardless of the findings. These replications could be published as very short research notes in the same journal as the original study.

  2. I like the research notes idea. The result of a large and unexpected association showing up would be a strong incentive to publish the follow-up study.

  3. But importantly, these short replications would need to be evaluated by how well they replicate the procedures of the original study, not by whether they find different results.

    Because academics like to publish, journals should publish short replications, judged only on how well they replicate the procedures of the original study. This offers an incentive to do good replications without suffering from publication bias. It would also give the original researchers some incentive to get things right the first time.