This is a reprint of a 2020 post titled “The Worst Statistical Analysis I’ve Seen This Year”. The motivation is Mr. Bruce Hall urging us to consider a Judith Curry link.
(and I’ve seen a lot of bad analysis) [Update 8/14/2020: the author has taken down the post, but here is an archived 8/13/2020 version of the webpage]
Now, Ms. Colleen Huber, NMD***, has concluded:
As of this writing, 32 weeks have passed in 2020. However, for each previous year, 52 weeks have passed. So how do we compare the number of all-cause deaths in 2020 to previous years?
I divided the total number of deaths per year by the number of weeks. That’s 52 weeks for all years except 2020, where 32 weeks have passed as of Saturday, August 8, 2020, the most recently updated week in the data cited by the CDC. This gives us the average number of deaths per week for each year and allows for meaningful comparisons between 2020 and previous years.
Then she generated the following table:
Her conclusion is:
There doesn’t appear to be a pandemic of COVID-19 or any other disease in 2020, at least not in the United States.
Ms. Huber told us that there are 52 weeks in a year, which is great. She divided the 2020 data by the 32 weeks the CDC has recorded (although reporting in recent weeks has been wildly incomplete).
This would be a sensible method for calculating weekly mortality if there were no seasonality in the data. However, deaths in the United States are seasonal, as can easily be seen in the CDC data she is analyzing.
Figure 1: CDC data, accessed on August 7, 2020.
As we move into the second half of the year, deaths typically go up (flu, etc.). Therefore, comparing a weekly average over 32 weeks in 2020 with a weekly average over all 52 weeks in previous years yields a meaningless comparison. (A standard method used in many economic releases is a year-to-date count. That is, Ms. Huber could compare the number of deaths in the first 32 weeks of each year over the past 20 years to the number of deaths in the first 32 weeks of this year.)
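To make the seasonality point concrete, here is a minimal Python sketch. The weekly series is entirely synthetic (a cosine with a winter peak and a summer trough), not CDC data, and the 10% excess applied to the partial year is an assumption chosen only for illustration. The function names are hypothetical. It contrasts the partial-year-average approach with a year-to-date comparison over the same weeks.

```python
import math

WEEKS_PER_YEAR = 52
WEEKS_OBSERVED = 32  # weeks elapsed in the partial year, as in the post


def synthetic_prior_year(base=55_000.0, amplitude=5_000.0):
    """Hypothetical weekly deaths: winter peak, summer trough (NOT CDC data)."""
    return [
        base + amplitude * math.cos(2 * math.pi * week / WEEKS_PER_YEAR)
        for week in range(WEEKS_PER_YEAR)
    ]


def mean(values):
    return sum(values) / len(values)


def naive_ratio(current_partial, prior_full):
    """Partial-year weekly average divided by the prior full-year weekly average."""
    return mean(current_partial) / mean(prior_full)


def year_to_date_ratio(current_partial, prior_full):
    """Compare the same weeks in both years (like for like)."""
    n = len(current_partial)
    return sum(current_partial) / sum(prior_full[:n])


prior = synthetic_prior_year()
ASSUMED_EXCESS = 1.10  # hypothetical: every week runs 10% above the prior year
current = [ASSUMED_EXCESS * x for x in prior[:WEEKS_OBSERVED]]

print(f"naive ratio:        {naive_ratio(current, prior):.3f}")
print(f"year-to-date ratio: {year_to_date_ratio(current, prior):.3f}")
```

Under these assumptions the year-to-date ratio recovers the assumed 10% excess, while the naive ratio understates it, because the first 32 weeks of the synthetic year leave out the late-year rise that the 52-week average includes. The size of the gap depends on the made-up seasonal amplitude and says nothing about the actual 2020 figures.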
Once again, this is the most embarrassingly stupid data analysis I’ve seen this year (maybe this decade, although the competition is tough).
My investigation used CDC estimates of expected deaths, here.
*** “NMD” means “Naturopathic Doctor”
Addendum 9/26: I am very grateful to Mr. Hall for providing this extremely poor example of reasoning. I quote it every time I teach PA819 (Ungated Websites in 2020). I’m also grateful to Craig Eyermann (aka the Iron Man of Political Calculation), Steven Kopitz, and especially CoRev for many other examples.