By Ayesha Rajan, Research Analyst for Altheia Predictive Health
Linear and exponential regression are similar, but they fit different equations: exponential regression uses y = ab^x, while linear regression uses y = a + bx. Based on our findings in the last article, we will start our linear regression at or near the date on which the total number of cases falls away from the exponential curve.
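To make the relationship between the two equations concrete, here is a minimal sketch in Python. It assumes ordinary least squares for the linear fit and a log transform for the exponential fit (fitting a line to ln(y)); this is one common approach and not necessarily the method used to produce the numbers in these articles. The function names and the sample data are made up for illustration.

```python
import math

def fit_linear(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b

def fit_exponential(xs, ys):
    """Fit y = a * b**x by fitting a line to ln(y); requires every y > 0."""
    ln_a, ln_b = fit_linear(xs, [math.log(y) for y in ys])
    return math.exp(ln_a), math.exp(ln_b)

# Made-up illustrative data, not taken from the article
xs = [0, 1, 2, 3, 4]
linear_ys = [3 + 2 * x for x in xs]   # grows by a constant amount per step
exp_ys = [3 * 2 ** x for x in xs]     # grows by a constant factor per step

lin_a, lin_b = fit_linear(xs, linear_ys)
exp_a, exp_b = fit_exponential(xs, exp_ys)
print(f"linear:      y = {lin_a:.2f} + {lin_b:.2f}x")
print(f"exponential: y = {exp_a:.2f} * {exp_b:.2f}^x")
```

In the linear model b is the number of cases added per day, while in the exponential model b is the factor by which cases multiply each day.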
Data, Analysis and Discussion
Below is a graph using exponential regression from our last article:
Total Covid-19 Cases in the United States March 4th- 26th (Exponential Regression)
From this, we can see that the number of cases begins to fall away from the exponential curve around day 25, which was March 26th, 2020. That is the day we will label as Day 1 in this article. If we kept the same starting point as this graph (March 2nd, 2020), the initial exponential growth would skew the linear fit, so here is the new table of data we are working with:
Total Number of Covid-19 Cases in the United States
|Day||Date||Total Number of Cases (y)|
|1||March 25th, 2020||68,211|
|2||March 26th, 2020||85,435|
|3||March 27th, 2020||104,126|
|4||March 28th, 2020||123,578|
|5||March 29th, 2020||143,491|
|6||March 30th, 2020||163,788|
|7||March 31st, 2020||188,530|
|8||April 1st, 2020||215,003|
|9||April 2nd, 2020||244,877|
|10||April 3rd, 2020||277,161|
|11||April 4th, 2020||311,357|
|12||April 5th, 2020||336,673|
|13||April 6th, 2020||367,004|
|14||April 7th, 2020||400,335|
|15||April 8th, 2020||434,927|
Here is the resulting graph:
From this data, the regression gives a = 19,255.7429 and b = 26,463.83214, that is, y = 19,255.7429 + 26,463.83214x. Though the points jump over and under the line a bit, we can see a much tighter fit, with a strong correlation coefficient of r = 0.994506795. From this, we can model out the next week as follows:
|Day||Date||Predicted Total Cases (y)|
|17||April 10th, 2020||410,249|
|18||April 11th, 2020||436,096|
|19||April 12th, 2020||461,943|
|20||April 13th, 2020||487,790|
|21||April 14th, 2020||513,637|
|22||April 15th, 2020||539,483|
|23||April 16th, 2020||565,330|
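The fit itself can be checked with a short Python sketch. It assumes ordinary least squares on the Day 1 to Day 15 totals from the data table; the function and variable names are ours, chosen for illustration. Projections computed this way from a and b need not coincide with the published prediction table, which may have been produced from a different fit or a different cut of the data.

```python
import math

# Total U.S. cases for Days 1-15, copied from the data table in this article
cases = [68211, 85435, 104126, 123578, 143491, 163788, 188530, 215003,
         244877, 277161, 311357, 336673, 367004, 400335, 434927]
days = list(range(1, len(cases) + 1))

def linear_regression(xs, ys):
    """Least-squares fit of y = a + b*x; returns (a, b, r)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx                      # slope: cases added per day
    a = mean_y - b * mean_x            # intercept
    r = sxy / math.sqrt(sxx * syy)     # Pearson correlation coefficient
    return a, b, r

a, b, r = linear_regression(days, cases)
print(f"a = {a:.4f}, b = {b:.5f}, r = {r:.9f}")

def predict(day):
    """Predicted total cases on a given day number under the fitted line."""
    return a + b * day

# Project the fitted line over Days 17-23
for day in range(17, 24):
    print(day, round(predict(day)))
```

Because the model is a straight line, every additional day adds the same b cases to the projection.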
Compared against last week’s prediction and the actual case numbers, these predictions seem much more realistic. At the state level, some states are still on the initial exponential curve, while others have also become more linear.
In our last article we looked at New York, Alabama, Colorado, Missouri, Louisiana and Oregon. Of these states, New York, Louisiana, Colorado and Oregon had a close fit to an exponential curve, though New York seemed to move away from the curve at or around March 24th. The data from Alabama and Missouri did not fit the exponential curve as well as the data from the other four states.
Total Covid-19 Cases in New York March 28th– April 7th
For New York’s linear regression equation, we have y=11,980.97143+8,308.503571x.
Total Covid-19 Cases in Louisiana March 28th– April 7th
For Louisiana’s linear regression equation, we have y=1,239.06667+1,507.987879x.
Total Covid-19 Cases in Alabama March 29th– April 7th
Total Covid-19 Cases in Missouri March 29th– April 7th
Total Covid-19 Cases in Colorado March 29th- April 7th
Recall from the last article the Imperial College Study that we used to predict demand for hospital beds. In the study, there were three possible scenarios based on levels of precautions taken: Optimistic Case, Most Likely Case and Worst Case. From the linear regression calculations done here, we can try to see which scenario we are leaning towards given the current situation and data. We will only predict up to June 1st as, according to the Imperial College Model, that is the latest peak we would see amongst all scenarios.
Predicted Total Covid-19 Cases in Six Example States
|State||May (1st)||Mid-May (15th)||June (1st)|
|New York||336,012||452,331||593,576|
|Colorado||13,630||18,548||24,518|
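As a worked example, the New York row can be reproduced from New York’s regression equation with a few lines of Python. This is a sketch under one important assumption: that Day 1 for New York is March 24th, 2020. The article does not state this explicitly; we infer it because it reproduces the three New York values in the table when results are truncated to whole cases. The names below are hypothetical.

```python
from datetime import date

# Coefficients of New York's linear regression equation as given in the article
NY_A = 11_980.97143    # intercept
NY_B = 8_308.503571    # slope: predicted new cases per day

# ASSUMPTION: Day 1 is March 24th, 2020. This is inferred, not stated in the
# article; it is the start date that reproduces the table's New York row.
NY_DAY_ONE = date(2020, 3, 24)

def day_number(target, day_one=NY_DAY_ONE):
    """Day index x used in y = a + b*x, with day_one counted as x = 1."""
    return (target - day_one).days + 1

def predicted_cases(target):
    """Predicted total cases on a date, truncated to a whole number."""
    return int(NY_A + NY_B * day_number(target))

for target in (date(2020, 5, 1), date(2020, 5, 15), date(2020, 6, 1)):
    print(target.isoformat(), day_number(target), predicted_cases(target))
```

The same approach would apply to the other states once their equations and start dates are known.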
According to STAT News, 5% of total cases will need to be hospitalized. To estimate the increase in demand for hospital beds, we take the difference in predicted total cases between consecutive dates, which gives the number of new cases in each period, and then take 5% of that number to get the following table:
Predicted Increase in Demand for Hospital Beds Based on Linear Regression in Six Example States
|April 15th – May 1st||May 1st – 15th||May 15th – June 1st|
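The arithmetic behind this table can be sketched as follows, using New York’s predicted totals from the predicted-cases table and the 5% hospitalization rate cited from STAT News. The function name and data layout are our own, for illustration only, and the sketch covers just the two periods whose endpoints appear in that table.

```python
HOSPITALIZATION_RATE = 0.05  # share of cases needing a hospital bed (STAT News figure)

# Predicted cumulative cases for New York, copied from the predicted-cases table
ny_total_cases = {"May 1st": 336_012, "May 15th": 452_331, "June 1st": 593_576}

def bed_demand_increase(totals, rate=HOSPITALIZATION_RATE):
    """Estimated extra beds needed in each period between consecutive dates."""
    labels = list(totals)
    increases = {}
    for start, end in zip(labels, labels[1:]):
        new_cases = totals[end] - totals[start]   # cases added during the period
        increases[f"{start} to {end}"] = rate * new_cases
    return increases

ny_beds = bed_demand_increase(ny_total_cases)
for period, beds in ny_beds.items():
    print(f"{period}: {beds:,.2f} additional beds")
```

Note that the rate is applied to new cases in each period rather than to the cumulative total, so each figure is an increase in demand, not total demand.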
The change in demand for hospital beds in New York from May to June is 15,838 in Imperial College’s best case scenario and 32,667 in the worst case scenario. According to our own calculations, it is 19,523. This could change as more data comes in, but as of right now it seems that we are falling somewhere between Imperial College’s best and most likely scenarios. This holds true for the rest of the states we have looked at as well, with some even falling below the best case scenario.
From our own calculations, it seems clear that social distancing and closure precautions are working: the fact that the number of cases is falling away from the exponential curve is evidence of that. However, although the outlook for case numbers and hospital bed demand is optimistic, that does not ensure that individual hospitals will not be overwhelmed. Even in the best case scenario, many hospitals in smaller counties will still fall short of demand for hospital beds.
Begley, Sharon. “Coronavirus Model Shows Individual Hospitals What to Expect.” STAT, 16 Mar. 2020, www.statnews.com/2020/03/16/coronavirus-model-shows-hospitals-what-to-expect/.