Trends in Covid-19 Spread

Summary

The Johns Hopkins University (JHU) collects data on how the COVID-19 virus spreads across the world and makes it available on GitHub. In this analysis, we take the data and extract the numbers that describe how the growth of casualties develops. Looking at the trends in these numbers, one may deduce how fast or how slowly the virus is spreading and also compare how the situation evolves in different countries.

Feel free to use the Python implementation that produces the presented plots; it can be modified to show the results for any country in the JHU database.

This post is based on joint efforts and discussions of the CSC group at the MPI Magdeburg.

The numbers in the presented plots are updated automatically [1].

Why This Analysis

If one looks at the raw numbers generated by exponential growth, the overall growth is so strong that one barely sees changes in the dynamics. In log scale, the data becomes easier to read: exponential growth looks like a straight line, and the slope of the line indicates how fast the growth is.

To see whether the exponential growth changes its dynamics, one may check whether the line in the logarithmic plot changes its slope (hopefully it gets less steep). That’s why we plot the slope of the logarithmic line that connects the numbers of two consecutive days. Since the data is wiggly, we also plot averages of the slopes over 2 or 5 days to better spot the trends.

This post was inspired by an illustration that was discussed on Twitter [2] in recent days:

Lockdowns and the numbers of deaths in Italy and Hubei

It shows how, seemingly, the exponential growth in casualties decreased some days after lockdowns had been implemented. In this analysis, we try to automatically detect such changes in the growth for a number of countries and present them in an accessible form that allows easy comparisons.

The Analysis

We examine the (logarithmic) slope of the curve of accumulated casualties per day. That is, for day d and the preceding day d-1, we plot log2(x[d]) - log2(x[d-1]), where x holds the accumulated number of deaths for every day and log2 computes the logarithm to base 2 [3].
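The computation described above can be sketched in a few lines of plain Python. This is a minimal illustration, not the implementation behind the plots: the function names, the made-up sample counts, and the choice of a trailing window for the 2-day and 5-day averages are assumptions, since the post does not say how the averaging window is placed.

```python
import math

def log2_slopes(cumulative):
    """Return log2(x[d]) - log2(x[d-1]) for each pair of consecutive days.

    Pairs that contain a zero count yield None, because log2(0) is undefined.
    """
    slopes = []
    for prev, cur in zip(cumulative, cumulative[1:]):
        if prev > 0 and cur > 0:
            slopes.append(math.log2(cur) - math.log2(prev))
        else:
            slopes.append(None)
    return slopes

def trailing_average(values, window):
    """Average each value with the window-1 values before it (assumed placement).

    Positions without a full window of valid values yield None.
    """
    averaged = []
    for i in range(len(values)):
        chunk = values[max(0, i - window + 1): i + 1]
        if len(chunk) < window or any(v is None for v in chunk):
            averaged.append(None)
        else:
            averaged.append(sum(chunk) / window)
    return averaged

# Made-up cumulative death counts, for illustration only.
deaths = [10, 20, 40, 60, 90, 120, 150]
slopes = log2_slopes(deaths)
avg2 = trailing_average(slopes, 2)
avg5 = trailing_average(slopes, 5)
```

In this made-up series the count doubles on the first two days, so the first two slopes are 1, and the later slopes fall as the growth slows down.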

Illustrative Examples

To make sense of the numbers, we start with some example scenarios. In the plot below, we have plotted the values for some fictitious growth scenarios.

  • An exponential growth: on every day d, the number of additional casualties is 1.1**d (read: 1.1 to the power of d; think of a daily increase of 10%). This scenario leads to a constant value in the plots of around 0.14.

  • A constant growth: every day another 10 casualties are added. This is like the number of daily deaths due to traffic accidents in Germany [4] in 2010.

  • A growth that decreases exponentially: every day the number of additional casualties gets smaller exponentially.

The slopes of a logarithmic plot of the accumulated casualties for the example scenarios
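The three scenarios can be reproduced with a short sketch. The helper name, the 60-day horizon, and the decay factor 0.9 with a starting value of 100 are assumptions chosen for illustration; only the 1.1**d and the 10 per day come from the descriptions above.

```python
import math

def slopes_from_daily(daily):
    """Accumulate daily additions and return the log2 slopes of the running total."""
    total = 0.0
    cumulative = []
    for added in daily:
        total += added
        cumulative.append(total)
    return [math.log2(b / a) for a, b in zip(cumulative, cumulative[1:])]

days = range(1, 61)  # assumed horizon of 60 days

# Exponential growth: 1.1**d additional casualties on day d.
exponential = slopes_from_daily([1.1 ** d for d in days])

# Constant growth: 10 additional casualties every day.
constant = slopes_from_daily([10 for _ in days])

# Exponentially decreasing growth: the factor 0.9 and the start value 100 are assumed.
decaying = slopes_from_daily([100 * 0.9 ** d for d in days])
```

The exponential scenario settles near log2(1.1), which is about 0.14, while the slopes of the other two scenarios keep falling towards zero.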

General Explanations of the Numbers

  • A constant value means that the number of deaths grows exponentially.
    • If this constant is 1, the number of casualties doubles every day.
    • A value of about 0.5 means a daily increase of about 40%.
  • A decreasing curve indicates that the exponential growth of seriously infected people is stalled or reversed.
  • If the value approaches 0, this indicates that the virus is contained.
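To translate a plotted value back into more familiar quantities, one can use that a slope s in base 2 corresponds to a daily growth factor of 2**s and, as long as s stays constant, to a doubling time of 1/s days. The following sketch shows the conversion; the function names are made up for this example.

```python
import math

def daily_increase_percent(slope):
    """Daily increase in percent that corresponds to a log2 slope."""
    return (2 ** slope - 1) * 100

def doubling_time_days(slope):
    """Days needed to double at a constant log2 slope (infinite if there is no growth)."""
    if slope <= 0:
        return math.inf
    return 1 / slope

for s in (1.0, 0.5, 0.14):
    print(f"slope {s}: +{daily_increase_percent(s):.1f}% per day, "
          f"doubling every {doubling_time_days(s):.1f} days")
```

A slope of 1 gives an increase of 100% per day, and a slope of 0.5 gives roughly 41%, which matches the "about 40%" stated in the list.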

What It Would Look Like and What It Should Look Like

To have a comparison of what an uncontrolled spread and what a well-controlled spread would look like, we ran two simulations in a COVID simulator [5].

  1. A scenario for Germany in which no interventions are taken.
  2. A scenario for which we defined interventions such that the number of hospitalized patients who require intensive care (ICU) was always below 45000.

The slopes of a logarithmic plot of the accumulated casualties and the numbers of people in need of intensive care due to an infection by COVID-19 for the simulated scenarios. The kink in the red curve is due to the stop of some interventions after 205 days.

In the uncontrolled case, exponential growth is detected with a slope of about 0.18. After some time, when most people have been infected, the growth decreases down to zero.

In the controlled scenario, the rate is brought down in the initial phase. Then, exponential growth happens at a lower rate (in the plot the value is about 0.07) though for a longer time before it fades out.

To relate the slopes to the actual cases, in a second plot, we display the corresponding numbers of people that will need intensive care in a hospital. In our model, we assumed that 1.15% of the infected people will have to be treated in an intensive care unit.

The Actual Numbers

As of the date indicated in the plots, the JHU data yields the following numbers. We plot the logarithmic slopes for several countries for the last 50 days. See the PDF file for more countries and a better resolution of the plots.
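As a rough sketch of how a per-country series for the last 50 days could be extracted, the snippet below parses a small CSV string. It assumes the wide layout of the JHU time series files (one row per province or country, four leading columns for province, country, latitude and longitude, then one column per date). The function name and the sample values are made up for illustration and are not real data.

```python
import csv
import io

def country_totals(csv_text, country):
    """Sum the cumulative counts of all rows (provinces) that belong to one country.

    Assumes four leading metadata columns followed by one column per date.
    """
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    dates = header[4:]
    totals = [0] * len(dates)
    for row in reader:
        if row[1] == country:
            for i, cell in enumerate(row[4:]):
                totals[i] += int(float(cell))
    return dates, totals

# Made-up sample in the assumed layout, for illustration only.
sample = """Province/State,Country/Region,Lat,Long,3/1/20,3/2/20,3/3/20
,Italy,43.0,12.0,10,20,40
Hubei,China,30.9,112.2,100,110,120
Henan,China,33.8,113.6,5,5,6
"""

dates, china = country_totals(sample, "China")
last_50 = china[-50:]  # at most the last 50 days, as in the plots
```

Summing over provinces matters for countries that are reported in several rows; for a country with a single row, the function simply returns that row.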

The slopes of a logarithmic plot of the accumulated casualties for a number of countries. The colored background marks the days where the number of deaths was below 100, 300, 900, 2700, 8100, 24300, …, respectively. The upper 6 countries are always shown. The lower 2 countries change occasionally. See the PDF for all countries.
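The background thresholds form a geometric sequence: they start at 100 and triple at every step. A minimal sketch of how a day could be assigned to its band is given below; the function name is hypothetical, and the actual plotting code may do this differently.

```python
def band_index(deaths):
    """Return 0 for counts below 100, 1 for counts below 300, 2 for counts below 900, and so on."""
    index = 0
    threshold = 100
    while deaths >= threshold:
        threshold *= 3
        index += 1
    return index

# The first six thresholds, as listed in the caption.
thresholds = [100 * 3 ** k for k in range(6)]
```

Because the thresholds grow by a constant factor, each band has the same height on a logarithmic axis, which fits the log-scale view used throughout the analysis.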

Explanations of the Data Representation

Because of natural fluctuations and, maybe, because of varying delays in the data transmission to JHU, the data does not describe a nice, smooth curve. Accordingly, the computed slopes can vary a lot, which leads to point clouds like the one in the plot for France 🇫🇷. The 2-day and 5-day averages make the corresponding data points look smoother. There are a few points that lie outside the range of the plots. However, we do not adjust the plots to fit all outliers, since the interesting parts of the curve would then be less well resolved. On the other hand, the plot range of China 🇨🇳 is adjusted to better resolve the small numbers.

Some Interpretation of the JHU Data

As of today, one may say that:

  • In China 🇨🇳 the spread seems to be under control.
  • Italy 🇮🇹 seems to have weakened the exponential growth.
  • For Spain 🇪🇸 the numbers report a visible decrease in the growth rate. The rate is still high, though. However, if this decrease goes on, Spain might not overtake Italy.
  • A comparison of France 🇫🇷 and Spain shows the effects of low growth rates. In recent days, the rates were similar, and although the virus seems to have hit France before it hit Spain, the total numbers in France are smaller by a factor of 3.
  • In Germany 🇩🇪 the rates seem to be decreasing but are still high. However, since the decrease started earlier than for Spain, France, or Italy, there is reason to assume that Germany will suffer less from the virus (provided that the trend is stable).

We will update the plots and the interpretations on a daily basis.

Other Things that can be Seen

  • The curves seem to have similar phases: a phase of exponential growth followed by a decrease.
  • The curves differ in the length of the phases (France seems to have a long phase of exponential growth compared to Spain).
  • The starting points are different, which hints at when the outbreak became visible in the different countries.

Notes and Acknowledgements

On the data

Certainly, the casualties lag [6] behind the actual spreading of the virus by a number of days. However, one may think that the number of casualties is a more reliable data point than the number of infected.

Disclaimer

This analysis is purely based on empirical trends. No statistical data analysis tools, such as denoising, have been applied to (pre-)process the data (except that some outliers might not be shown because of the plot margins).

Acknowledgements

The initial work, namely making the data easily available in Python as well as the code that produces the title picture, was done by Petar Mlinarić.


  1. Updates once a day in the morning.

  2. The plot is by the Italian physicist Prof. Ricci-Tersenghi.

  3. The base is not too important here. If one takes 2, then a value of 1 means a doubling. A different base would only scale the plots but not change the qualitative outcomes.

  4. https://de.wikipedia.org/wiki/Verkehrstod#Deutschland

  5. covidsim.eu, with parameters as recorded in screenshots 1 and 2, based on the assumptions of the RKI publication mentioned in the footnote below.

  6. In a recent study (the DOI doesn’t work yet(?)), the Robert Koch Institute assumes an average of 20 days between infection and death.