Thursday, June 18, 2020

Be careful what you measure



When evaluating studies, many subtle mistakes can undercut the conclusions authors claim. Most health studies fall into one of two categories. In randomized clinical trials, people meeting some criteria are randomized into treatment and control groups; this is widely considered the gold standard of clinical science. However, it is not always practical or ethical to run a randomized trial, so a lot of science is done through observational studies instead. Because the groups being compared are not otherwise identical, measurements are taken, and scientists use models to correct for those differences. Unfortunately, what is measured can be a mere proxy for what is actually causing the effect of interest, so when we observe a difference between two groups, we need to be careful: whatever we based the group separation on may not be what is causing the differences we observed!

Put another way, correlation does not imply causation, however much we wish it did. One example many people find amusing is that ice cream sales are correlated with murder rates - both go up in the summer! So even though we can estimate murder rates from ice cream sales, reducing ice cream sales will, I believe, have no discernible impact on murder rates.
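To make this concrete, here is a toy simulation (entirely made-up numbers, not real data) where temperature is the common cause: ice cream sales and murders each depend on temperature plus independent noise, so they end up strongly correlated with each other even though neither influences the other. Once we hold temperature fixed by looking at the residuals, the association vanishes.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical common cause ("confounder"): daily temperature
temperature = rng.uniform(0, 35, size=1000)                # degrees C
ice_cream = 2.0 * temperature + rng.normal(0, 5, 1000)     # sales rise with heat
murders = 0.1 * temperature + rng.normal(0, 1, 1000)       # so do murders

# Strong correlation, even though neither causes the other
r = np.corrcoef(ice_cream, murders)[0, 1]
print(f"raw correlation: {r:.2f}")

# Holding temperature fixed (residuals from the true model)
# removes the association entirely
resid_ice = ice_cream - 2.0 * temperature
resid_mur = murders - 0.1 * temperature
r_partial = np.corrcoef(resid_ice, resid_mur)[0, 1]
print(f"partial correlation: {r_partial:.2f}")
```

In a real observational study we don't know the true coefficients, of course - we would have to estimate them with a regression, and the adjustment is only as good as the model and the list of confounders we thought to measure.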

There are numerous examples where what gets measured is merely correlated with the true cause. In early June, a study by Public Health England showed greatly elevated risks of death due to COVID for Black people and for Bangladeshi people, even after accounting for the effects of sex, age, deprivation and region. To the credit of the report writers, they are quick to point out that the study corrected for neither occupation (Black and Bangladeshi people were both more likely to work in healthcare) nor comorbidities (which tend to be more prevalent in these minority populations). The headlines are about the disparate impact based on race. While there is certainly a correlation with race, there is insufficient analysis to conclude that race, rather than occupation or elevated risk factors, is the cause of the difference in outcomes.

A subtler version of this effect can be seen in the Remdesivir study in the New England Journal of Medicine. The study drew from hospitals on three continents; patients were given Remdesivir or placebo, but otherwise received the standard of care at the hospital they were admitted to. Patients at hospitals in Asia, and patients of Asian ancestry, had the smallest estimated gains from taking Remdesivir. While these differences are not significant, it is interesting to note that what might look like a racial disparity might instead be a measure of the different standard-of-care regimens at the participating hospitals.

Do I have any suggestions on how to improve on this? Not really - it is important for scientists to be open and up-front about any shortcomings, rather than leaving the reader to play gotcha. Beyond that, these issues are HARD. I’ve made mistakes measuring something other than what I thought I was, even when I could control every aspect of my study. I think the authors of the Public Health England study demonstrated exactly what we hope for - they found relations that were interesting, used models to correct for what they could, and prominently explained the alternate interpretations for what they couldn’t.
