Wednesday, March 13, 2013

Epidemiology and Truth

This post by Thomas Lumley of Stats Chat is well worth reading and thinking carefully about.  In particular, when talking about a study of processed meats and mortality, he opines:

So, the claims in the results section are about observed differences in a particular data set, and presumably are true. The claim in the conclusion is that this ‘supports’ ‘an association’. If you interpret the conclusion as claiming there is definitive evidence of an effect of processed meat, you’re looking at the sort of claim that is claimed to be 90% wrong. Epidemiologists don’t interpret their literature this way, and since they are the audience they write for, their interpretation of what they mean should at least be considered seriously.


I think that support of an association has to be the most misunderstood piece of Epidemiology (and we epidemiologists are not innocent of this mistake ourselves).  The real issue is that cause is a very tricky animal.  It can be the case that complex disease states have a multitude of "causes".

Consider a very simple (and utterly artificial) example.  Let's assume (no real science went into this example) that hypertension (high systolic blood pressure) occurs when multiple exposures overwhelm a person's ability to compensate for the insult.  So if you have only one exposure from the list then you are totally fine.  If you have 2 or more then you see elevated blood pressure.  Let's make the list simple: excessive salt intake, sedentary behavior, a high-stress work environment, cigarette smoking, and obesity.  Now some of these factors may be correlated, which is its own special problem.

But imagine how hard this would be to disentangle, using either epidemiological methods or personal experimentation.  Imagine two people who work in high-stress jobs, one of whom eats a lot of salt.  They both start a fitness program due to borderline hypertension.  One person sees the disease state vanish whereas the other sees little to no change.  How do you know what was the important factor?

It's easy to look at differences in the exercise program; if you torture the data enough it will confess.  At a population level, you would expect completely different results depending on how many of these factors the underlying population had.  You'd expect, in the long run, to come to some sort of conclusion but it is unlikely that you'd ever stumble across this underlying model using associational techniques. 
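To make the population-level point concrete, here is a minimal sketch of the toy model above. It assumes (purely for illustration) that the four factors other than sedentary behavior are independent and share a single prevalence; the post itself notes that real factors may be correlated. The function names are made up for this example. It computes exactly how much removing sedentary behavior changes the risk of hypertension.

```python
from itertools import product

# The five exposures from the toy example in the post.
FACTORS = ["salt", "sedentary", "stress", "smoking", "obesity"]


def hypertensive(exposures):
    """Toy rule from the post: two or more exposures means elevated blood pressure."""
    return sum(exposures) >= 2


def risk(prevalence, sedentary):
    """Exact P(hypertension) with the sedentary factor fixed.

    Assumption (not from the post): the other four factors are independent
    and each occurs with the same probability `prevalence`.
    """
    total = 0.0
    for others in product([0, 1], repeat=len(FACTORS) - 1):
        prob = 1.0
        for x in others:
            prob *= prevalence if x else 1 - prevalence
        if hypertensive(others + (int(sedentary),)):
            total += prob
    return total


def exercise_risk_difference(prevalence):
    """Risk among the sedentary minus risk among the non-sedentary."""
    return risk(prevalence, True) - risk(prevalence, False)


for p in (0.05, 0.5, 0.9):
    print(f"prevalence {p:.2f}: risk difference {exercise_risk_difference(p):.4f}")
```

Under these assumptions the apparent benefit of exercise is just the probability of carrying exactly one other factor, so it is large when the other exposures are moderately common and nearly vanishes when almost everyone has several of them. The same intervention, in the same causal model, looks very different from one population to the next.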

The argument continues:
So, how good is the evidence that 90% of epidemiology results interpreted this way are false? It depends. The argument is that most hypotheses about effects are wrong, and that the standard for associations used in epidemiology is not a terribly strong filter, so that most hypotheses that survive the filter are still wrong. That’s reasonably as far as it goes. It does depend on taking studies in isolation. In this example there are both previous epidemiological studies and biochemical evidence to suggest that fat, salt, smoke, and nitrates from meat curing might all be harmful. In other papers the background evidence can vary from strongly in favor to strongly against, and this needs to be taken into account.
 
This points out (correctly) the troubles in just determining an association between A and B.  And it leaves aside all of the terrible possibilities -- like A being a marker for something else and not the cause at all.  Even a randomized trial will only tell you that A reduces B as an average causal effect in the source population under study.  It will not tell you why A reduced B.   We can make educated guesses, but we can also be quite wrong.
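The filter argument in the quoted passage can be put into rough numbers. This is only a sketch: it treats the "standard for associations" as a significance test with a false positive rate `alpha` and a given `power`, and the prior probabilities that a hypothesis is true are invented for illustration, not taken from any study.

```python
def share_of_true_findings(prior, power=0.8, alpha=0.05):
    """Fraction of hypotheses passing the filter that are actually true.

    prior: assumed probability that a tested hypothesis is true
    power: assumed probability that a true effect passes the filter
    alpha: assumed probability that a null effect passes the filter
    """
    true_pos = prior * power
    false_pos = (1 - prior) * alpha
    return true_pos / (true_pos + false_pos)


# Illustrative priors only: weak, modest, and strong background evidence.
for prior in (0.01, 0.1, 0.5):
    ppv = share_of_true_findings(prior)
    print(f"prior {prior:.2f}: {1 - ppv:.0%} of surviving hypotheses are false")
```

With these made-up inputs the share of false findings swings from a large majority to a small minority as the prior moves, which is the sense in which the background evidence for a given paper has to be taken into account.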

Finally, there is the whole question of estimation.  If we take falsehood to mean that the estimated size of the average causal effect of intervention A on outcome B is anything short of completely unbiased, then I submit that 90% is a very conservative estimate (even if you make truth an interval around the point estimate, set to the precision of the reported estimate, given the oddly high number of decimal places people like to quote for fuzzy estimates).

But that last point kind of falls into the "true but trivial" category . . .

