Friday, April 8, 2022

When the methods make you worried

This is Joseph

Thomas Lumley has a great piece on a recently published article. Go read that first, we'll still be here. In it he looks at a paper that is an ecological study of cannabis and cancer. Thomas probably covered the most interesting parts, but I want to focus on two things that stood out to me in a professional paper.

In Table 6 we are presented with three measures of relative risk (presumably of cannabis use and prostate cancer). Relative risk is not mentioned in the text of the paper and is conventionally defined as:
Relative risk is a ratio of the probability of an event occurring in the exposed group versus the probability of the event occurring in the non-exposed group. 
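In code, the definition is just a ratio of two risks. The counts below are hypothetical, mine rather than the paper's, chosen only to illustrate the calculation:

```python
# Hypothetical cohort counts, invented purely to illustrate the definition.
exposed_cases, exposed_total = 50, 1000
unexposed_cases, unexposed_total = 40, 1000

risk_exposed = exposed_cases / exposed_total        # 0.05
risk_unexposed = unexposed_cases / unexposed_total  # 0.04
relative_risk = risk_exposed / risk_unexposed

print(round(relative_risk, 2))  # 1.25
```

A ratio like 1.25 means the exposed group's risk is 25% higher than the unexposed group's, which is the kind of magnitude one normally sees in observational cancer epidemiology.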

The three relative risks proposed are: 1.25, 1.66E+52, and 5.65E+05. No, the second two are not jokes. A relative risk of 1.25 is plausible, if hard to prove with a tricky outcome like cancer. But something went quite wrong with the other two. It might have been a hint when this was in the acknowledgements:

We wish to acknowledge with grateful thanks the work of Professor Mark Stevenson in modifying and enlarging the capacity of “epiR” to handle the enormous integers encountered in this study. His prompt and timely assistance is greatly appreciated indeed.

The other issue is with E-values. An E-value is defined as:

The E-value is defined as the minimum strength of association, on the risk ratio scale, that an unmeasured confounder would need to have with both the treatment and the outcome to fully explain away a specific treatment-outcome association, conditional on the measured covariates. A large E-value implies that considerable unmeasured confounding would be needed to explain away an effect estimate. A small E-value implies little unmeasured confounding would be needed to explain away an effect estimate. 
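For risk ratios above 1, the VanderWeele–Ding E-value has a closed form, E = RR + sqrt(RR × (RR − 1)). A quick sketch (my own illustration, not taken from the paper) shows why the paper's astronomical relative risks produce equally astronomical, and equally uninformative, E-values:

```python
import math

def e_value(rr):
    """VanderWeele-Ding E-value for a risk ratio point estimate."""
    if rr < 1:
        rr = 1 / rr  # use the reciprocal for protective estimates
    return rr + math.sqrt(rr * (rr - 1))

print(round(e_value(1.25), 2))  # 1.81: modest unmeasured confounding suffices
print(e_value(1.66e52))         # roughly 3.3e52, which tells us nothing useful
```

For large RR the E-value is simply about 2 × RR, so an impossible risk ratio mechanically yields an impossible-looking "robustness to confounding."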

But the authors slide this into other forms of bias in a rather interesting way:

Furthermore this was also an ecological study. It is therefore potentially susceptible to the short-comings typical of ecological studies including the ecological fallacy and selection and information biases. Within the present paper we began to address these issues with the use of E-values in all Tables.

Now, E-values are quite useful for thinking about confounding, but I am quite interested to see how they apply to things like selection bias, except insofar as a larger estimate requires more substantive bias of any kind to be entirely artifactual. The material about biological plausibility is fine, and that is the reason to have conducted the study. But E-values say nothing about bias other than how strong a set of unmeasured confounders would need to be to induce a spurious association.

Also this is not quite true:

Causal inference was addressed in two ways. Firstly inverse probability weighting (IPW) was conducted on all mixed effects, robust and panel models which had the effect of equilibrating exposure across all observed groups. IPW were computed from the R-package “ipw”. Inverse probability weighting transforms an observational dataset into a pseudo-randomized dataset so that it becomes appropriate to draw inferences as to truly causal relationships.

This is only true under a set of assumptions:

One, there is no unmeasured confounding present. Two, that the marginal structural model is correctly specified (both the marginal structural model and the model for exposure). Three, that each participant's counterfactual outcome under their observed exposure is the same as their observed outcome (consistency). Finally, we need to assume positivity — that there are exposed and unexposed participants at all levels of the confounders. If these assumptions are not met, then the marginal structural model may give misleading estimates.

IPW can be a powerful and useful technique, but it is hardly magic and cannot replace randomized data in causal inference. The use of E-values immediately casts doubt on the first assumption being fully realistic.  
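To make the first assumption concrete, here is a toy IPW calculation on entirely made-up data (not the paper's analysis, and using simple stratum frequencies rather than the "ipw" package): each unit is weighted by the inverse probability of the exposure it actually received, given the measured confounder. That balances the measured confounder across exposure groups, but only the confounders you measured.

```python
# Toy IPW sketch on invented data: z is a measured binary confounder,
# a the exposure, y the outcome.
data = (
    [(1, 1, 1)] * 30 + [(1, 1, 0)] * 10 +  # z=1 stratum: mostly exposed
    [(1, 0, 1)] * 6  + [(1, 0, 0)] * 4 +
    [(0, 1, 1)] * 4  + [(0, 1, 0)] * 6 +   # z=0 stratum: mostly unexposed
    [(0, 0, 1)] * 18 + [(0, 0, 0)] * 72
)

def propensity(z):
    """P(A=1 | Z=z), estimated by the stratum frequency."""
    stratum = [row for row in data if row[0] == z]
    return sum(row[1] for row in stratum) / len(stratum)

def weight(z, a):
    """Inverse probability of the exposure actually received."""
    p = propensity(z)
    return 1 / p if a == 1 else 1 / (1 - p)

def ipw_mean(arm):
    """Weighted mean outcome within one exposure arm."""
    pairs = [(weight(z, a), y) for z, a, y in data if a == arm]
    return sum(w * y for w, y in pairs) / sum(w for w, _ in pairs)

# The weighted contrast adjusts for z; the crude contrast (0.44) does not.
print(round(ipw_mean(1) - ipw_mean(0), 3))  # about 0.183
```

The adjustment works here precisely because z is in the data. An unmeasured confounder never enters the weights, so the "pseudo-randomized" claim stands or falls on assumption one, which is exactly what the E-values are there to interrogate.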

Now, there may be some important information in this ecological study, but I find that these epidemiological methods issues really detract from the potentially good science elsewhere in the paper.
