In the following linear regression models, I tried to determine whether there was any correlation between COVID-19 deaths and an unhealthy lifestyle. I gathered data on COVID-19 deaths up until April 7th, as well as 2018 lifestyle data from the CDC. The results were inconclusive. I do not think this rules out a positive or negative correlation between any of these factors, because the data we currently have about COVID-19 is so limited. For instance, there has been no antibody testing, so we have no idea how many people in the US have already been infected. In my opinion, these correlations would be more insightful if we had that serology data.
Given the limited data that we currently have, perhaps we could find a stronger correlation if we looked at other factors, such as population density and use of public transportation.
Source Data:
https://github.com/nytimes/covid-19-data/blob/master/us-states.csv
https://chronicdata.cdc.gov/Nutrition-Physical-Activity-and-Obesity/Nutrition-Physical-Activity-and-Obesity-Behavioral/hn4x-zwk7
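For readers who want to see the shape of the analysis, here is a minimal sketch of a single-variable linear regression of state-level deaths on one lifestyle measure. It is not necessarily the exact method used in the models here. The function names `join_by_state` and `fit_line`, the dictionary inputs, and the numbers in the demo are all assumptions made for illustration; the demo values are placeholders, not figures from either dataset.

```python
from statistics import mean


def join_by_state(deaths, lifestyle):
    """Pair up values for the states that appear in both mappings.

    Both arguments are hypothetical dicts keyed by state name:
    deaths[state] is a death count, lifestyle[state] is a lifestyle
    measure (for example, a percentage from the CDC survey data).
    """
    states = sorted(set(deaths) & set(lifestyle))
    xs = [lifestyle[s] for s in states]
    ys = [deaths[s] for s in states]
    return states, xs, ys


def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept.

    Returns (slope, intercept, r_squared).
    """
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need at least two paired observations")
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    if sxx == 0:
        raise ValueError("x values are all identical; slope is undefined")
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    # With one predictor, r^2 is the squared Pearson correlation.
    r_squared = 0.0 if syy == 0 else (sxy * sxy) / (sxx * syy)
    return slope, intercept, r_squared


if __name__ == "__main__":
    # Placeholder values only, NOT taken from the NYT or CDC data.
    deaths = {"State A": 10, "State B": 40, "State C": 25}
    lifestyle = {"State A": 22.0, "State B": 31.0, "State C": 27.5}
    states, xs, ys = join_by_state(deaths, lifestyle)
    slope, intercept, r_squared = fit_line(xs, ys)
    print(states, slope, intercept, r_squared)
```

An r² close to 0 would match the inconclusive result described above. Raw death counts also scale with state population, so converting them to deaths per capita before fitting may be worth trying.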