Thursday, March 12, 2009

Missing Data

Is there any issue that is more persistent and more difficult to solve than missing data?

It takes a perfectly good study and layers assumptions on top of it. There is a clear divide in how to handle it. One camp asks "why would you not want to use real data?" and rejects the assumptions of imputation. Of course, this approach makes its own set of strong assumptions that are often not likely to be met (analyzing only the complete cases is, in effect, betting that the data are missing completely at random).

So you'd think that doing the right thing and modeling the missing data is the way to go? Well, it's an improvement, but it is pretty rare that the assumptions of the missing data techniques are met (missing at random rarely holds exactly in real data).

So what do you do? Most of the time I recommend modeling (inverse probability weighting or multiple imputation), but I must confess that the lack of a solution that is actually good is rather distressing!
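
To make the weighting idea concrete, here is a minimal sketch of inverse probability weighting on a toy data set. Everything in it is an assumption made for illustration: the function names, the variable `stratum` (a single, fully observed covariate), and the made-up numbers. It also assumes the best case, where missingness depends only on that covariate, which is exactly the assumption that tends to fail in practice.

```python
from collections import defaultdict


def complete_case_mean(y):
    """Mean of the observed values only (missing values are None)."""
    seen = [v for v in y if v is not None]
    return sum(seen) / len(seen)


def ipw_mean(y, stratum):
    """Inverse-probability-weighted mean of y.

    The probability of being observed is estimated within each level of
    `stratum`, a fully observed covariate (hypothetical, for illustration).
    """
    n_total = defaultdict(int)
    n_seen = defaultdict(int)
    for v, s in zip(y, stratum):
        n_total[s] += 1
        if v is not None:
            n_seen[s] += 1

    weighted_sum = 0.0
    weight_total = 0.0
    for v, s in zip(y, stratum):
        if v is None:
            continue  # missing outcomes contribute nothing directly
        p_observed = n_seen[s] / n_total[s]  # estimated P(observed | stratum)
        w = 1.0 / p_observed                 # up-weight under-represented strata
        weighted_sum += w * v
        weight_total += w
    return weighted_sum / weight_total


# Toy data: the outcome is higher in stratum 1, and stratum 1 is also
# where half of the outcomes are missing.
stratum = [0, 0, 0, 0, 1, 1, 1, 1]
y = [1, 1, 1, 1, 3, None, 3, None]

print(complete_case_mean(y))  # about 1.67, below the full-data mean of 2
print(ipw_mean(y, stratum))   # 2.0, the full-data mean is recovered
```

The weighting only rescues the estimate here because the toy data were built so that missingness depends on `stratum` alone. If missingness also depended on the unobserved outcome itself, neither number would be trustworthy, which is the distressing part.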
