Thursday, January 22, 2015

“Epidemiology and Biostatistics: competitive or complementary?”

From Andrew Gelman:
To return to epidemiology vs. biostatistics: it’s my impression that there’s a lot of forward causal inference and a lot of reverse causal inference in both fields. That is, researchers spend a lot of time trying to estimate particular causal effects (“forward causal inference”) and a lot of time trying to uncover the causes of phenomena (“reverse causal questioning”).

And, from my perspective (as elaborated in that paper with Guido), these two tasks are fundamentally different and are approached differently: forward causal inference is done via estimation within a model, whereas reverse causal questioning is an elaboration of model checking, exploring aspects of data that are not explained by existing theories.
It is an interesting question.  Perhaps unsurprisingly, I have a very different opinion: what we really have is an emphasis on two different pieces of a hard problem.  Both disciplines are looking at the difficult problem of disease (or health) in humans -- where experiments are expensive, awkward, and often infeasible.  It is a very hard problem!

Usually, the practical difference I see is where the researcher starts with the questions.  As a generality, an epidemiologist seems to start by thinking about the disease, while a biostatistician tries to think about how to make valid measurements in a complex system.  Obviously, if you are ever going to solve these problems you are going to need both approaches.  The days of the very simple epidemiological intervention (John Snow) are likely long past. 

It's also the case that simple attempts to measure interventions often fail.  Clinical trials use a placebo arm because selection into the trial makes the trial group non-comparable in ways that are exceedingly difficult to measure.  So you generally have to be good at both tasks for an observational study to be useful. 

But there is also a huge amount of epidemiology that is utterly non-causal.  Why?  Because you cannot come up with testable hypotheses without understanding how different elements relate to one another.  This is what we are doing with the classic case-control study, or at least where I see it as being the most useful.  If you have no idea what causes a rare cancer, looking at what makes the people with cancer different is where you start finding ideas. 
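The case-control comparison described above boils down to a 2x2 table and an odds ratio.  A minimal sketch (all counts are hypothetical, just to show the arithmetic):

```python
def odds_ratio(exposed_cases, unexposed_cases, exposed_controls, unexposed_controls):
    """Odds ratio from a 2x2 case-control table: (a/b) / (c/d) = (a*d) / (b*c)."""
    return (exposed_cases * unexposed_controls) / (unexposed_cases * exposed_controls)

# Hypothetical rare-cancer study: 40 of 100 cases were exposed to some factor,
# versus 20 of 100 controls. An odds ratio well above 1 flags the exposure
# as a difference worth turning into a testable hypothesis -- not as proof of cause.
or_hat = odds_ratio(40, 60, 20, 80)
print(round(or_hat, 2))  # 2.67
```

For a rare outcome the odds ratio approximates the relative risk, which is exactly why the case-control design is the workhorse for rare cancers: you cannot feasibly follow a cohort large enough to observe the outcome directly.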

However, these descriptive analyses are not going to give great insight into the cause of the outcome.  For example, mortality has a U-shaped curve with body mass index (many wonderful papers on this phenomenon).  But it is utterly unclear that interventions would help.  Maybe the mortality at low BMI is due to wasting disease, so adding weight if you are thin (and not already dying of a disease) may have no benefit.  Similarly, we don't know what changing a BMI from 45 to 25 would do for a 35-year-old.  But if we don't understand the patterns, we can't really target interventions and develop testable hypotheses that are likely to yield important answers. 
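A U-shaped association like this is often summarized with a quadratic term on the log-odds scale.  The sketch below is purely illustrative -- every coefficient is made up, with the minimum-risk point placed arbitrarily near BMI 23 -- and it shows only the descriptive curve, which is precisely what it cannot tell you: whether moving someone along the curve changes their risk.

```python
import math

def mortality_risk(bmi, intercept=-5.0, curvature=0.005, nadir=23.0):
    """Hypothetical U-shaped risk: log-odds of mortality quadratic in BMI."""
    log_odds = intercept + curvature * (bmi - nadir) ** 2
    return 1 / (1 + math.exp(-log_odds))

# Risk is lowest at the nadir and rises in both directions -- but nothing here
# says whether an intervention moving BMI from 45 toward 25 would lower risk.
for bmi in (17, 23, 30, 45):
    print(bmi, round(mortality_risk(bmi), 4))
```

The descriptive fit and the intervention effect are different estimands: the curve pools thin people who are thin because of wasting disease with those who are not, so the observed risk at low BMI need not apply to someone who deliberately loses weight.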

So I am often quite serious about looking for "associations" in a lot of contexts.  When we look at factors correlated with cardiovascular disease (inflammation or coronary plaques), we are looking for things that you could use to develop a causal hypothesis.  In other contexts, I am interested in using the association as a proxy for a causal effect (unintended drug side effects fall in this category -- when I say warfarin is associated with bleeding, I am really thinking that warfarin causes bleeding). 

And people wonder why epidemiologists are often happy to be confused with "skin doctors".  :-) 
