Earlier this week I had a discussion with a friend about the latest Covid-19 trends. As for everybody else, this is the only discussion we seem to have right now. The curve, cases, deaths, today’s numbers, does it flatten, doesn’t it. It is a tragic topic, and let’s hope that we can soon start talking about normal stuff again, like the weather, football games, the latest and best burger place, and whether to love or hate e-scooters. But for now, we try to make sense of The Curve and, critically, try to find out where on that curve we currently are, and whether we dare to hope that we are on the right side of the peak.
My friend and I tried to figure things out, as we have tried to do every day since March. “I don’t understand why they [the experts of choice] say that we have decreasing numbers. Today we had 130 cases and yesterday we had 80!”
If the so-called current situation has taught us anything, it is that epidemiology is not an easy science and that models of disease spread are incredibly complex. As everybody is looking at the curve these days, it seems timely to look closer at one indicator that I find helpful for understanding the overall picture: the moving average.
Let’s look at the current situation in Sweden. The European Centre for Disease Prevention and Control (ECDC) publishes daily Covid-19 case and death data. If we look at the Covid-19 death numbers in Sweden over time, from January until today, it looks like this:
We see that about once a week, after five weekdays, there is a sharp decrease in numbers, as reporting on weekends lags behind. Then there is a catch-up effect when the weekend deaths are reported in the following days, which shows up as steep peaks. There might be a trend towards fewer deaths overall, but it is hard to tell.
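One way to check the weekend effect is to average the reported deaths per weekday. The following is a minimal sketch, not part of the original analysis: the function name `mean_by_weekday` and the toy numbers are made up for illustration, and the column names `Date` and `deaths` are assumed to match the data frame built in the code at the end of this post.

```r
# Average reported deaths per ISO weekday (1 = Monday, ..., 7 = Sunday).
# Assumes a data frame with a Date column `Date` and a numeric column `deaths`.
mean_by_weekday <- function(df) {
  iso_day <- as.integer(format(df$Date, "%u"))
  tapply(df$deaths, iso_day, mean, na.rm = TRUE)
}

# Hypothetical toy data: two weeks starting on a Monday, with low weekend
# counts. These numbers are invented and are NOT the ECDC figures.
toy <- data.frame(
  Date   = seq(as.Date("2020-05-04"), by = "day", length.out = 14),
  deaths = c(10, 12, 11, 13, 12, 4, 3,
             12, 10, 13, 11, 12, 2, 5)
)

weekday_means <- mean_by_weekday(toy)
print(weekday_means)
```

If weekend reporting really lags, the averages for days 6 and 7 should sit clearly below those for the weekdays when the function is run on the real Swedish data.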
To see a trend we would need to smoothen out the peaks and the valleys somehow and compensate for the underreporting on weekends and the overreporting on weekdays when the weekend numbers come in.
The Moving Average (MA, also called Simple Moving Average, SMA) does just that. In short, if we pick a 7-day moving average, it calculates, say, the average of Monday to Sunday. If the numbers for Monday to Sunday were 72 + 94 + 121 + 31 + 67 + 113 + 98 = 596, the moving average would be 596 / 7 days = 85.1. We would then go on and calculate the next moving average for Tuesday to Monday, and so on. We end up with a number for each day, but it is the average of the 7 days surrounding it, so to speak: the day itself, the three days before it and the three days after it.
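The arithmetic above can be reproduced in a few lines of base R. This is a minimal sketch that uses `stats::filter` instead of the `ma()` function from the forecast package used in the code at the end of this post; for an odd window such as 7 the two should give the same centered average. The names `week` and `centered_ma` are made up for this illustration.

```r
# The seven example values for Monday to Sunday from the text
week <- c(72, 94, 121, 31, 67, 113, 98)

# The plain average of the week: 596 / 7
week_average <- mean(week)

# Centered moving average: every value in the window gets the weight 1/k,
# and sides = 2 places the result on the middle day of the window.
# Days without a full window around them get NA.
centered_ma <- function(x, k = 7) {
  as.numeric(stats::filter(x, rep(1 / k, k), sides = 2))
}

ma7 <- centered_ma(week)
print(round(week_average, 1))
print(round(ma7, 1))
```

With only seven days of data, the only day that has a full window around it is the fourth one (Thursday), so `ma7` holds the weekly average in that position and `NA` everywhere else. On a longer series, every day except the first three and the last three gets a value.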
Here we see the moving average curve for the same data as above:
To see how the moving average smoothens out the day-to-day numbers, we can put the curves on top of each other:
The light blue line represents the day-to-day numbers, the dark gray line is the moving average. We now see clearly that we are on a lower part of the curve, that we reached a peak in the second half of April, and that we have about the same numbers now as in the second week of April.
Thus, the moving average gives us a better view of the overall picture, and is probably one of the metrics that the experts refer to when talking about decreasing numbers, even if the number of the day is higher than yesterday’s.
I hope this was helpful. For those of you who are into R and want to explore the data on your own, please find the code for these graphs below.
    # Get the data from ECDC
    library(utils)
    data <- read.csv("https://opendata.ecdc.europa.eu/covid19/casedistribution/csv",
                     na.strings = "", fileEncoding = "UTF-8-BOM")
    data$dateRep <- as.Date(data$dateRep, "%d/%m/%Y")
    data$Date <- data$dateRep

    # Subset the Swedish data
    data.swe <- subset(data, countriesAndTerritories == "Sweden")

    # Plot the day-to-day deaths
    library('ggplot2')
    ggplot(data = data.swe, aes(x=Date, y=deaths)) +
      geom_line() +
      theme_minimal()

    # Plot the moving average
    library('forecast')
    library('tseries')
    # ma() returns a time series object; as.numeric() stores it as a plain numeric column
    data.swe$deaths.ma7 <- as.numeric(ma(data.swe$deaths, order=7))
    ggplot(data = data.swe, aes(x=Date, y=deaths.ma7)) +
      geom_line() +
      theme_minimal()

    # Plot the moving average together with the day-to-day numbers
    ggplot() +
      geom_line(data = data.swe, aes(x=Date, y=deaths),
                color='steelblue', size=1, alpha=0.5) +
      geom_line(data = data.swe, aes(x=Date, y=deaths.ma7),
                color='tomato', size=1.5, alpha=0.8) +
      theme_minimal()