On Counting and Overcounting Deaths

How many people died in the US from heart diseases in 2019? The answer is harder than it might seem to pin down. Using a broad definition, such as “major cardiovascular diseases,” and including any deaths where this was listed on the death certificate, the number for 2019 is an astonishing 1.56 million deaths, according to the CDC. That number is astonishing because there were 2.85 million deaths in total in the US, so over half of deaths involved the heart or circulatory system, at least in some way that was important enough for a doctor to list it on the death certificate.

However, if you Google “heart disease deaths US 2019,” you get only 659,041 deaths. The source? Once again, the CDC! So, what’s going on here? To get to the smaller number, the CDC narrows the definition in two ways. First, instead of all “major cardiovascular diseases,” they limit it to diseases that are specifically about the heart. For example, cerebrovascular deaths (deaths involving blood flow in the brain) are not included in the lower CDC total. This first limitation gets us down to 1.28 million.

But the bigger reduction is when they limit the count to the underlying cause of death, “the disease or injury that initiated the train of morbid events leading directly to death, or the circumstances of the accident or violence which produced the fatal injury,” as opposed to other contributing causes. That’s how we cut the total in half from 1.28 million to 659,041 deaths.
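To make the two narrowing steps concrete, here is a minimal Python sketch that uses only the figures quoted above. The 2.85 million, 1.56 million, and 1.28 million values are the rounded numbers from the text, not exact CDC WONDER output, and the helper name `share` is just an illustrative choice.

```python
# Counts quoted in the text above (US, 2019). The rounded figures are
# approximations as stated in the article, not exact CDC WONDER output.
TOTAL_DEATHS = 2_850_000  # all US deaths, rounded

steps = [
    ("major cardiovascular diseases, any mention", 1_560_000),
    ("heart diseases only, any mention", 1_280_000),
    ("heart diseases only, underlying cause", 659_041),
]


def share(count, total):
    """Return count as a percentage of total."""
    return 100.0 * count / total


previous = None
for label, count in steps:
    line = f"{label}: {count:,} ({share(count, TOTAL_DEATHS):.1f}% of all deaths)"
    if previous is not None:
        # How much of the previous, broader count survives this narrowing step.
        line += f", {share(count, previous):.1f}% of the previous step"
    print(line)
    previous = count
```

Most of the reduction comes from the last step: switching from any mention to underlying cause keeps only about half of the heart disease count.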

We could further limit this to “Atherosclerotic heart disease,” which is a subset of heart disease deaths but also the largest single cause of death in the coding system that the CDC uses. There were 163,502 deaths of this kind in 2019, if you use the underlying cause of death only. But if we expand it to any listing of this disease on the death certificate, it roughly doubles to 321,812 deaths. And in this “multiple cause of death” query, three other categories of death are now slightly larger, including a catch-all “Cardiac arrest, unspecified” category with 352,010 deaths in 2019.
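The "doubling" is easy to check. The sketch below computes the ratio of any-mention deaths to underlying-cause deaths for the two categories discussed so far, using only numbers from the text (the 1.28 million figure is rounded). The function name `mention_ratio` is a made-up label for illustration.

```python
def mention_ratio(any_mention, underlying):
    """How many times larger the any-mention count is than the underlying-cause count."""
    return any_mention / underlying


# Figures quoted in the text (US, 2019).
athero_ratio = mention_ratio(321_812, 163_502)   # atherosclerotic heart disease
heart_ratio = mention_ratio(1_280_000, 659_041)  # all heart diseases (rounded numerator)

print(f"Atherosclerotic heart disease: {athero_ratio:.2f}x")
print(f"All heart diseases: {heart_ratio:.2f}x")
```

Both ratios come out a little under 2, so for these categories the choice between "underlying cause" and "any mention" changes the count by roughly a factor of two.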

So, what’s the right number? What’s the point of all this discussion? Here’s my question to you: did you ever hear of a debate about whether we were “overcounting” heart disease deaths in 2019? I don’t think I’ve ever heard of it. Probably there were occasional debates among the experts in this area, but never among the general public.

COVID-19 is different. The allegation of “overcounting” COVID deaths began almost right away in 2020, with prominent people claiming that the reported numbers were basically useless because, for example, a death in a motorcycle crash was briefly included in COVID death totals in Florida (people are still using this example!).

A more serious critique of COVID death counting was in a recent op-ed in the Washington Post. The argument here is serious and sober, and not trying to push a particular viewpoint as far as I can tell (contrast this with people pushing the motorcycle death story). Yet still the op-ed is almost totally lacking in data, especially on COVID deaths (there is some data on COVID hospitalizations).

But most of the data the author is asking for in the op-ed is readily available. While we don’t have death totals for all individuals who tested positive for COVID-19 at some point, we do have the following data available on a weekly basis. First, we have the “surveillance data” on deaths that was released by states and aggregated by the CDC. These were “the numbers” that you probably saw constantly discussed, sometimes daily, in the media during the height of the pandemic waves. The second and third sources of COVID death data are similar to the heart disease data I discussed above, from the CDC WONDER database, separated by whether COVID was the underlying cause or whether it was listed anywhere on the death certificate (as the underlying cause or as one of several contributing causes).

Those three measures of COVID deaths are displayed in this chart:

You will notice a few things in the chart, but the most obvious is that these measures of deaths move pretty closely together. And just eyeballing it (more on this later), it looks like surveillance and underlying cause deaths lined up very closely for most of 2020 and 2021. This is true even though with the surveillance data we’re dealing with COVID death reports for all 50 states, which are sometimes just state aggregations from county reports, all without consistent reporting periods or necessarily consistent definitions.

However, the reporting inconsistencies did not wash out completely. Instead, it’s the “all COVID deaths” (blue line) that ended up being very similar to the surveillance deaths in the first two years of the pandemic. By the end of 2021, they were very close: 822,000 surveillance deaths reported, and 837,000 deaths with COVID listed anywhere on the death certificate. Those reports you heard on the nightly news? They slightly undercounted the number of COVID deaths the CDC would eventually report from death certificates, but these are pretty darn close. By contrast, there were about 759,000 deaths which listed COVID as the underlying cause around the end of 2021, or about 91-92% of these larger death counts.
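The percentages here can be reproduced with a few lines of Python. This sketch reads the 837,000 figure as the count of all death certificates listing COVID (the comparison with the 759,000 underlying-cause deaths only works that way), and all three inputs are the rounded cumulative totals from the text. The variable names are illustrative.

```python
# Rounded cumulative totals through roughly the end of 2021, as quoted in the text.
surveillance = 822_000    # deaths reported by states, aggregated by the CDC
all_covid = 837_000       # death certificates listing COVID anywhere (assumed reading)
underlying = 759_000      # death certificates with COVID as the underlying cause

pct_of_surveillance = 100.0 * underlying / surveillance
pct_of_all_covid = 100.0 * underlying / all_covid
surveillance_gap_pct = 100.0 * (all_covid - surveillance) / all_covid

print(f"Underlying as % of surveillance deaths: {pct_of_surveillance:.1f}%")
print(f"Underlying as % of all COVID deaths: {pct_of_all_covid:.1f}%")
print(f"Surveillance shortfall vs. all COVID deaths: {surveillance_gap_pct:.1f}%")
```

The two shares bracket the "about 91-92%" range, and the surveillance total falls short of the death-certificate total by less than 2%.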

But then in 2022, something changed. Notice that the green line (surveillance deaths) is now consistently above the yellow line (underlying deaths). Even the blue line, which shows all deaths listing COVID on a death certificate, is now below the surveillance deaths, whereas in 2020 and 2021 it was often higher for many weeks. Data lags could have something to do with this, but probably not for the first three quarters of 2022. Also, you will notice that underlying COVID deaths (yellow line) now appear to have a bigger gap with all COVID deaths (blue line). This fact is more clearly demonstrated in the following chart:

In this chart, we clearly see that of all COVID deaths in the US, about 90% listed COVID-19 as the underlying cause of death in 2020 and 2021. Then in early 2022, that percentage quickly declined to the 60-70% range (averaging 68% over the entire year, based on the data available so far). I could also plot underlying COVID deaths as a percent of surveillance deaths, but that ends up being a lot noisier than the first chart suggests, sometimes being well above 100%, due to small variations in weekly reporting of deaths. So, it’s not especially useful on a weekly or even monthly basis, but if we compare the two for all of 2022 it’s around 60%.
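One caveat when summarizing a weekly percentage over a whole year: the average of the weekly shares and the share computed from the yearly totals are not the same number, because high-death weeks carry more weight in the totals. The sketch below shows both calculations. The function names are illustrative, and the three-week input is made-up placeholder data chosen only to show the difference, not real CDC figures.

```python
def weekly_shares(underlying, all_mentions):
    """Underlying-cause deaths as a percent of all COVID-listed deaths, week by week."""
    if len(underlying) != len(all_mentions):
        raise ValueError("series must have the same length")
    return [100.0 * u / a for u, a in zip(underlying, all_mentions)]


def mean_of_weekly_shares(underlying, all_mentions):
    """Simple average of the weekly percentages (each week counts equally)."""
    shares = weekly_shares(underlying, all_mentions)
    return sum(shares) / len(shares)


def pooled_share(underlying, all_mentions):
    """Percentage computed from period totals (weeks weighted by deaths)."""
    return 100.0 * sum(underlying) / sum(all_mentions)


# Made-up placeholder values for three weeks, NOT real CDC data.
example_underlying = [900, 300, 120]
example_all = [1000, 400, 200]

print(weekly_shares(example_underlying, example_all))
print(mean_of_weekly_shares(example_underlying, example_all))
print(pooled_share(example_underlying, example_all))
```

With the placeholder numbers, the weekly shares fall from 90% to 60%, but the pooled figure sits above the simple average because the first week has the most deaths. The same effect could matter when a year starts with a large wave.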

What does this all tell us? Well, I think quite a bit, but perhaps the most important lesson for right now is that when you see daily or weekly COVID numbers reported in the media, you need to interpret those quite a bit differently than in the first two years of the pandemic. It’s not that deaths where COVID was only a contributing cause are unimportant. They are still important. But they are, in many ways, different.

But the lesson for the entire pandemic? During 2020 and 2021, when you heard those daily updates of COVID deaths, they were pretty accurate. Certainly the cyclical fluctuations in mortality were being accurately captured by those reports, and even the levels were about right: pretty near 100% of all COVID deaths, and about 90% of those levels if you only think underlying COVID deaths are important to count.

Finally, even with the much lower COVID underlying death numbers in 2022 (in total, and as a percent of all COVID deaths), COVID will likely still be the third leading underlying cause of death as defined by the CDC categories. Heart disease and cancer will lead the way again, but there are already 180,000 underlying COVID deaths in 2022 in the US. This will almost certainly be more than the next largest category, accidental poisonings (primarily drug ODs), which accounted for about 100,000 deaths in 2021. While OD deaths rose dramatically in 2020 and 2021, early data indicates that they leveled off (though unfortunately didn’t decrease) in 2022.

2 thoughts on “On Counting and Overcounting Deaths”

  1. StickerShockTrooper January 18, 2023 / 1:43 pm

    So what caused that dramatic drop that happened to start right at the last week of 2021? I’m suspicious of things that coincide with arbitrary date transitions. My first reaction was a change in reporting protocol starting in 2022, that got smeared by multi-week averaging…


