Andrew Gelman comments on an article that links both high and low genetic diversity with weaker economic performance, relative to countries with middling levels of diversity. His take-away is quite good:
High-profile social science research aims for proof, not for understanding -- and that’s a problem. The incentives favor bold thinking and innovative analysis, and that part is great. But the incentives also favor silly causal claims. In many social sciences, it’s not enough to notice an interesting pattern and explore it (as we did in our Red State Blue State book). Instead, you’re supposed to make a strong causal claim even in a context where it makes little sense.
But I also think it omits one piece that is crucial for causal claims: what does a counterfactual look like? This problem comes up a lot with complex phenomena in both medicine and social science. Just look at the question of whether or not to adjust for variables like blood pressure and cholesterol when estimating the effect of obesity on mortality:
It's possible that most of the thin people who die are meth addicts or have cancer, but even a study which threw out the folks who died within three years of entry into the study found that once you accounted for physical activity*, "underweight" BMIs were correlated with excess mortality risk, while "overweight" BMIs were not. And arguing that the study fails to control for things like blood pressure, blood sugar, and cholesterol seems like fairly weak sauce; those are the very mechanisms by which obesity is supposed to kill us.
So what would it mean to make a person thinner without influencing the mediating factors through which obesity operates? I suspect the result would be a thin person with a much higher risk of mortality. It is like estimating the effect of an antihypertensive medication while conditioning on blood pressure: one would expect the causal effect of the drug on a participant to be different if it failed in its primary function.
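The point about adjusting for mediators can be sketched with a small simulation. Everything here is an assumption made for illustration: the linear model, the effect sizes (2.0 and 1.5), and the premise that the exposure acts on the outcome only through a single mediator. None of it comes from the studies discussed above.

```python
import random


def slope(x, y):
    """Least-squares slope of y on x (simple regression with an intercept)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


def residuals(x, y):
    """Residuals of y after a simple linear regression on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    b = slope(x, y)
    return [yi - mean_y - b * (xi - mean_x) for xi, yi in zip(x, y)]


def simulate(n=20000, seed=0):
    rng = random.Random(seed)
    # Hypothetical binary exposure (think "obese" vs "not obese").
    exposure = [rng.randint(0, 1) for _ in range(n)]
    # Hypothetical mediator (think "blood pressure"), raised by the exposure.
    mediator = [2.0 * x + rng.gauss(0, 1) for x in exposure]
    # Assumed outcome: depends on the exposure ONLY through the mediator.
    outcome = [1.5 * m + rng.gauss(0, 1) for m in mediator]

    # Total effect: regress the outcome on the exposure alone.
    total = slope(exposure, outcome)
    # Mediator-adjusted coefficient: partial the mediator out of both
    # variables, then regress residual on residual (Frisch-Waugh-Lovell).
    adjusted = slope(residuals(mediator, exposure),
                     residuals(mediator, outcome))
    return total, adjusted


total_effect, adjusted_effect = simulate()
print(f"unadjusted estimate:        {total_effect:.2f}")
print(f"mediator-adjusted estimate: {adjusted_effect:.2f}")
```

Under these assumptions the unadjusted estimate sits near the total effect (2.0 × 1.5), while the adjusted coefficient sits near zero. The adjusted number answers a different question: what the exposure would do if it somehow left the mediator untouched.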
In the same sense, the question of how to change genetic diversity without influencing a lot of other variables is a tricky one. What would it mean for a country whose genetic composition was unrelated to migration to change its level of diversity without changing other factors? What is the mechanism by which we think this operates? Mechanisms are not very important for randomized trials because the design eliminates confounding. But for a non-randomized study, a plausible mechanism is a very important piece.
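The contrast between randomized and non-randomized designs can also be sketched as a toy simulation. The confounder, the effect sizes, and the assignment rule below are all hypothetical choices for illustration, not a model of any real study.

```python
import random


def mean(values):
    return sum(values) / len(values)


def difference_in_means(treated, outcome):
    """Mean outcome among treated units minus mean outcome among controls."""
    treated_y = [y for d, y in zip(treated, outcome) if d == 1]
    control_y = [y for d, y in zip(treated, outcome) if d == 0]
    return mean(treated_y) - mean(control_y)


def compare_designs(n=20000, true_effect=1.0, seed=1):
    rng = random.Random(seed)
    # Hypothetical unmeasured confounder that also drives the outcome.
    confounder = [rng.gauss(0, 1) for _ in range(n)]

    # Observational design: a high confounder makes treatment more likely.
    observed = [1 if u + rng.gauss(0, 1) > 0 else 0 for u in confounder]
    # Randomized design: a coin flip, unrelated to the confounder.
    randomized = [rng.randint(0, 1) for _ in range(n)]

    def outcome(d, u):
        # Assumed outcome model: treatment effect plus confounder plus noise.
        return true_effect * d + 2.0 * u + rng.gauss(0, 1)

    y_observed = [outcome(d, u) for d, u in zip(observed, confounder)]
    y_randomized = [outcome(d, u) for d, u in zip(randomized, confounder)]
    return (difference_in_means(observed, y_observed),
            difference_in_means(randomized, y_randomized))


observational_estimate, randomized_estimate = compare_designs()
print(f"observational estimate: {observational_estimate:.2f}")
print(f"randomized estimate:    {randomized_estimate:.2f}")
```

With the assumed true effect of 1.0, the randomized comparison recovers it without any knowledge of the confounder, while the observational comparison is inflated. In the non-randomized case, getting the right answer requires knowing what else moves with the exposure, which is where mechanism comes in.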
And if we argue that this is just a proxy variable (which seems to be the route that Andrew is taking in his discussion), then the hard causal claims are unnecessary. Even worse, they may well obscure factors on which we could imagine basing a strong counterfactual. Exploring data like this is an extremely interesting exercise, but I agree with the wish that, when we see an interesting pattern, we could admit that we may not know why it exists.