I have heard about the article that Mark references in a previous post; it's hard to be in the epidemiology field and not have heard about it. But, for this post, I want to focus on a single aspect of the problem.
Let's say that you have a rare side effect that requires a large database to find and, even then, the power is limited. Let's say, for illustration, that the true effect of a drug on an outcome is an odds ratio (or relative risk; the two are nearly identical when the outcome is rare) of 1.50. If, by chance alone, the estimate in database A is 1.45 (95% confidence interval: 0.99 to 1.98) and the estimate in database B is 1.55 (95% CI: 1.03 to 2.08), then what would be the result of two studies on this side effect?
Well, if database A is analyzed first, its confidence interval includes 1, so the result looks null and maybe nobody ever looks at database B (these databases are often expensive to use and time consuming to analyze). If database B is analyzed first, its interval excludes 1, the finding gets published, and the second estimate will be from database A (and thus lower). In fact, there is some chance that the researchers using database A will never publish (as it has historically been the case that null results are hard to publish).
The result? Estimates of association between the drug and the outcome will tend to be biased upwards, because the initial published finding (given that null results are hard to publish) will tend to be an overestimate of the true causal effect.
These factors make it hard to determine if a meta-analysis of observational evidence would give an asymptotically unbiased estimate of the "truth" (likely it would be biased upwards).
In that sense, on average, published results are biased to some extent.
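To make the selection mechanism concrete, here is a minimal simulation sketch. It is an illustration under assumptions of mine, not an analysis of any real database: the standard error of 0.18 on the log odds ratio scale is chosen only to give intervals roughly as wide as the ones in the example, the "publish only if the 95% interval excludes 1" rule is a deliberately crude stand-in for null results being hard to publish, and the function name and parameters are hypothetical.

```python
import math
import random


def simulate_selective_publication(true_or=1.50, se_log_or=0.18,
                                   n_studies=100_000, seed=0):
    """Simulate many studies of the same true effect and compare the
    average estimate across all studies with the average across only
    the 'published' ones.

    Assumptions (illustrative, not from the post): estimates are normal
    on the log odds ratio scale with a common standard error, and a
    study is published only if its 95% CI lower bound is above 1.
    """
    rng = random.Random(seed)
    true_log_or = math.log(true_or)

    all_estimates = []
    published = []
    for _ in range(n_studies):
        # One study's estimate: truth plus sampling noise.
        est = rng.gauss(true_log_or, se_log_or)
        all_estimates.append(est)
        # Crude publication filter: lower 95% bound on the log scale > 0,
        # i.e. the confidence interval for the odds ratio excludes 1.
        if est - 1.96 * se_log_or > 0:
            published.append(est)

    # Average on the log scale, then convert back to an odds ratio.
    return {
        "all_or": math.exp(sum(all_estimates) / len(all_estimates)),
        "published_or": math.exp(sum(published) / len(published)),
        "share_published": len(published) / n_studies,
    }


result = simulate_selective_publication()
print(f"Average OR, all studies:       {result['all_or']:.2f}")
print(f"Average OR, published studies: {result['published_or']:.2f}")
print(f"Share of studies published:    {result['share_published']:.0%}")
```

Under these assumptions the average over all studies sits close to the true 1.50, while the average over the published subset is higher, which is the upward bias described above. Pooling only the published studies would inherit that bias no matter how many of them there are.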