Wednesday, September 9, 2020

Did Sturgis really cause 19 percent of all U.S. COVID cases in August? A skeptical response

I want to make something very clear at the beginning of this post so that it is not misconstrued: I think COVID is a very serious problem that our country has failed miserably to control. I think the motorcycle rally that brought hundreds of thousands of people to Sturgis, SD was a very bad thing and it was incredibly irresponsible and dangerous to hold the rally this year. The point of this post is not to defend the Sturgis rally. I am emphatically not defending the rally. My purpose is more to look at how popular media report research findings and how people tend to believe and uncritically share stories that support views they already hold.

Over the last couple of days I've been seeing the story about the Sturgis motorcycle rally, held August 7-16, being a COVID super-spreader event all over the place. The big headline that everyone is sharing is that the rally has led to more than 250,000 COVID cases nationwide. An article from Mother Jones states, "According to a new study, which tracked anonymized cellphone data from the rally, over 250,000 coronavirus cases have now been tied to the 10-day event, one of the largest to be held since the start of the pandemic."

It is not true that over 250,000 cases have been tied to the event. Let's take a look at the origins of this claim.

The original study, The Contagion Externality of a Superspreading Event: The Sturgis Motorcycle Rally and COVID-19, is from the "Discussion Paper Series" of the IZA - Institute of Labor Economics. I note that the article states, "IZA Discussion Papers often represent preliminary work and are circulated to encourage discussion. Citation of such a paper should account for its provisional character. A revised version may be available directly from the author." As far as I can tell this study has not undergone formal peer review. Now, not going through peer review does not mean something is wrong, just as going through peer review does not mean something is right. But it is, I think, a factor in how much weight should be given to the conclusions.

The study, written by four economists, uses anonymized cell phone data to show travel of people to Sturgis from the surrounding areas and elsewhere in the country, and looks at changes in COVID case rates before and after the rally in those same areas. (This is a very oversimplified explanation. If you want to see all the details, read the paper.)

The conclusion that the rally caused a large increase in cases in Sturgis and adjacent counties, and even in the state of South Dakota as a whole, is one that I feel very comfortable accepting after reading the paper. I won't spend time here going over the evidence for those findings. And there's no doubt that attendees who traveled from elsewhere in the country brought COVID back home with them. No doubt at all. But what's the evidence for that "250,000 cases" conclusion?

In the discussion section of the paper, the authors state, "In counties with the largest relative inflow to the event, the per 1,000 case rate increased by 10.7 percent after 24 days following the onset of Sturgis Pre-Rally Events. Multiplying the percent case increases for the high, moderate-high and moderate inflow counties by each county’s respective pre-rally cumulative COVID-19 cases and aggregating, yields a total of 263,708 additional cases in these locations due to the Sturgis Motorcycle Rally."

Basically, the authors observed that in counties with more travel to Sturgis, COVID increased more after the rally, and then they did some multiplication and summation based on those percentage increases to arrive at the >250,000 cases estimate.
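To make that multiplication and summation concrete, here is a minimal sketch of the arithmetic as I read it from the quoted passage. Only the 10.7 percent figure comes from the paper; the group labels, the other percentages, and all the county case counts are placeholders I made up for illustration, and the authors' actual calculation may differ in its details.

```python
# Illustrative only: the 10.7% figure is quoted from the paper; every other
# number below is a made-up placeholder, not data from the study.
inflow_groups = [
    # (label, fractional case increase, pre-rally cumulative cases per county)
    ("high", 0.107, [1200, 800]),
    ("moderate-high", 0.05, [3000]),
    ("moderate", 0.02, [2500, 2500]),
]


def additional_cases(groups):
    """Multiply each group's percent increase by its counties' pre-rally cases, then sum."""
    total = 0.0
    for _label, pct_increase, county_cases in groups:
        total += pct_increase * sum(county_cases)
    return total


estimate = additional_cases(inflow_groups)
print(round(estimate))  # 464 with these placeholder inputs
```

The thing to notice is that the output is a single number. Nothing in this arithmetic carries along how uncertain each percentage increase was.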

Let's look at the underlying data.

This is the figure demonstrating that COVID increased more in counties with high travel to the rally:


By the way, the figures in this paper have no figure legends, which is irritating. I'm interpreting the figure based on what is written in the results section.

The vertical axes on the plots show relative changes in COVID case numbers, the horizontal axes show time in days with the red vertical line indicating the start of the rally, and the panels from (a) to (e) go from counties with the highest relative travel to Sturgis to counties with the lowest relative travel to Sturgis.

One thing to note here is that almost all the points on the graphs have large error bars, representing the degree of uncertainty in the estimates. Most of the error bars overlap the zero line. Now, the fact that the error bars overlap the zero line does not mean there's no evidence that the increases are real. But it does mean that we cannot state with an extremely high degree of certainty that the increases are real.

To get the estimate of "263,708 additional cases" the authors put in numbers that had very high degrees of uncertainty, did some math, and got out a point estimate that is presented by itself without any uncertainty. But because the underlying numbers had so much uncertainty, that estimate also has a huge amount of uncertainty! Articles that say over 250,000 cases have been "tied to" the event are ignoring this.
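Here is a rough sketch of what carrying that uncertainty through the calculation could look like. Everything in it is an assumption of mine: the effect sizes (other than 10.7 percent), the standard errors, and the baseline case counts are invented, and I treat the group estimates as independent and normally distributed, which the paper does not claim.

```python
import random

# Hypothetical inputs: (estimated fractional increase, assumed standard error,
# pre-rally cumulative cases). Only 0.107 is taken from the paper's text.
groups = [
    (0.107, 0.08, 2000),
    (0.05, 0.05, 3000),
    (0.02, 0.03, 5000),
]


def simulate_interval(groups, n_draws=10_000, seed=0):
    """Draw each group's effect from a normal distribution and return a rough 95% interval for the total."""
    rng = random.Random(seed)
    totals = []
    for _ in range(n_draws):
        total = 0.0
        for effect, std_err, base_cases in groups:
            total += rng.gauss(effect, std_err) * base_cases
        totals.append(total)
    totals.sort()
    lower = totals[int(0.025 * n_draws)]
    upper = totals[int(0.975 * n_draws) - 1]
    return lower, upper


point = sum(effect * base for effect, _se, base in groups)
lo, hi = simulate_interval(groups)
print(f"point estimate: {point:.0f}, rough 95% interval: {lo:.0f} to {hi:.0f}")
```

With standard errors this large relative to the effects, the interval comes out wider than the point estimate itself. I am not claiming the real interval looks like this, only that a lone point estimate hides whatever the real interval is.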

And there's another thing I notice about these graphs. If you look at the trends of numbers from the leftmost point on each graph through the first point to the right of the red vertical line (which would be too early for the rally to have had a measurable effect on COVID numbers), you can see that for the top three panels, counties with travel to Sturgis ranging from high to moderate, it appears the COVID curves were already bending upward. And then they continued to bend upward more sharply. For the last panel, counties with low travel to Sturgis, the COVID numbers were on a steady downward trend, and then continued on that steady downward trend.

An alternate interpretation of the data in this figure is not that Sturgis caused the nationwide increases, but rather that people who live in places that were doing a worse job keeping COVID under control in August are more likely to have traveled to Sturgis for the rally. Which, intuitively, would make sense, wouldn't it?

I'm absolutely not stating that as any sort of definitive conclusion, but I think it's something the authors as well as other people reading the paper should consider as a possibility.
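A toy simulation shows how this kind of pattern could arise with no rally effect at all. This is purely a hypothetical model of my own, not anything from the paper: each county gets a made-up "control" score, counties with worse control both grow faster and are more likely to send many people to the rally, and the rally itself causes nothing.

```python
import random


def simulate_counties(n_counties=2000, seed=1):
    """Return mean case growth for high-travel and low-travel counties in a world with zero rally effect."""
    rng = random.Random(seed)
    high_travel, low_travel = [], []
    for _ in range(n_counties):
        control = rng.random()  # hypothetical trait: 0 = poor COVID control, 1 = good
        growth = 0.10 * (1 - control)  # growth depends only on control, never on travel
        sends_many = rng.random() < (1 - control)  # worse control -> more likely high travel
        (high_travel if sends_many else low_travel).append(growth)
    return (
        sum(high_travel) / len(high_travel),
        sum(low_travel) / len(low_travel),
    )


high_mean, low_mean = simulate_counties()
print(f"high-travel growth: {high_mean:.3f}, low-travel growth: {low_mean:.3f}")
```

In this made-up world the high-travel counties still show more growth than the low-travel ones, simply because of who chose to travel. That does not show the paper is wrong; it only shows that the correlation by itself cannot tell the two stories apart.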

The conclusion about the rally causing over 250,000 cases nationwide does not appear to be the result of a rigorous analysis; it's in the discussion section, not the results section, and it's presented with no measure of uncertainty. It's not the main claim of the paper. To me it comes across as more of a hypothetical discussion point that should be taken with a grain of salt. But it's the part of the paper that is getting all the headlines and that everyone is sharing as if it's a fact and not a hypothetical.

I think that issues like this are all too common in popular media reporting on scientific research. But I also suspect that in this case the study authors may be partly to blame. I suspect they knew that the gaudy claims about 19% of all cases nationwide in August and $12 billion in health care costs would get a lot more attention than the much more strongly supported claims that the rally caused large COVID spikes in and around Sturgis and even across the state of South Dakota as a whole. And they probably wrote the discussion and promoted their findings with this in mind.

Did the Sturgis motorcycle rally cause increased COVID spread in Sturgis, surrounding counties, and the state of South Dakota? Undoubtedly, and this study makes that case quite well. Was there spread from the rally to many other parts of the country from which rally attendees traveled? This is also undoubtedly true.

Did that spread from the rally result in over 250,000 total cases, accounting for 19% of all new cases nationwide in August?

Maybe? But maybe not. Personally, I'm doubtful. And it's certainly not something that should be stated as a fact.

All people have a natural tendency to more easily accept claims that support the views of the world they already hold. As misinformation campaigns threaten to destroy our democracy, it's becoming more and more important to be able to distinguish fact from fiction, as well as to be able to distinguish claims that are very strongly supported by evidence from claims that might be true but aren't nearly as certain. For me, the widespread sharing of this story (one I initially took at face value before I looked into it) is a good reminder of that.
