Uses MIT/MEDSL, New York Times, and US Census data sources to analyze per-county COVID-19 deaths.

Last update: Dec 22, 2021

Covid County

Executive summary

Setup

Install Miniconda, then run the following on the command line:

conda create -n covid-county
conda activate covid-county
conda install pandas ipython matplotlib tabulate

(Let me know if you want pure-Python no-Conda instructions via venv.)

2020 US presidential election

I've already downloaded countypres_2000-2020.csv from https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/VOQCHQ but you can download it again to ensure I haven't committed bad data.

The 2020 data appears to be missing vote counts for the District of Columbia (FIPS 11001), so its party split is taken from the 2016 election.
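The fallback described above can be sketched as follows. This is a minimal illustration, not the code in main.py: the `party_share` helper, the `(year, fips)` dictionary layout, and the vote counts are all hypothetical placeholders.

```python
def party_share(votes_by_year, fips, year=2020, fallback_year=2016):
    """Return (dem_share, rep_share) of the two-party vote for a county.

    Tries `year` first and falls back to `fallback_year` when the
    county has no usable counts. Returns None if neither year works.
    """
    for y in (year, fallback_year):
        counts = votes_by_year.get((y, fips), {})
        dem = counts.get("DEMOCRAT", 0)
        rep = counts.get("REPUBLICAN", 0)
        if dem + rep > 0:
            return dem / (dem + rep), rep / (dem + rep)
    return None


# Placeholder counts (NOT real election results): only 2016 is present
# for FIPS 11001, mimicking the gap in the 2020 data.
votes = {(2016, "11001"): {"DEMOCRAT": 900, "REPUBLICAN": 100}}
dc_share = party_share(votes, "11001")
print(dc_share)
```

Keying on `(year, fips)` keeps the fallback a plain dictionary lookup; a pandas version would filter on the same two fields.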

Census

From https://www.census.gov/programs-surveys/popest/technical-documentation/research/evaluation-estimates/2020-evaluation-estimates/2010s-counties-total.html I downloaded co-est2020.csv from the "Annual Resident Population Estimates for States and Counties: April 1, 2010 to July 1, 2019; April 1, 2020; and July 1, 2020 (CO-EST2020)" link. It's committed in this repo but you can download it yourself too.
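To join the Census estimates with the election and Covid data, each county needs a five-digit FIPS code. The sketch below assumes the file has `STATE`, `COUNTY`, and `POPESTIMATE2020` columns and that rows with county code 0 are state-level summaries; check these against the actual header of co-est2020.csv. The sample rows use placeholder populations, and `county_populations` is a hypothetical helper.

```python
import csv
import io

# Tiny stand-in for co-est2020.csv; populations are placeholders.
SAMPLE = """SUMLEV,STATE,COUNTY,STNAME,CTYNAME,POPESTIMATE2020
40,1,0,Alabama,Alabama,1000
50,1,1,Alabama,Autauga County,100
50,11,1,District of Columbia,District of Columbia,200
"""


def county_populations(text):
    """Map five-digit county FIPS codes to 2020 population estimates."""
    pops = {}
    for row in csv.DictReader(io.StringIO(text)):
        if int(row["COUNTY"]) == 0:
            continue  # skip state-level summary rows
        # Two-digit state code followed by three-digit county code.
        fips = f"{int(row['STATE']):02d}{int(row['COUNTY']):03d}"
        pops[fips] = int(row["POPESTIMATE2020"])
    return pops


populations = county_populations(SAMPLE)
print(populations)
```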

Covid

Install Git and run this in this directory: git clone --depth 1 https://github.com/nytimes/covid-19-data.git (it might take a while)
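Once the repository is cloned, the per-county death totals can be read from its county-level CSV. The sketch below assumes columns named `date`, `county`, `state`, `fips`, `cases`, and `deaths`, and that the counts are cumulative, so the most recent row per county holds the running total; verify both against the cloned data. The sample rows are placeholders and `latest_deaths` is a hypothetical helper.

```python
import csv
import io

# Placeholder rows in the assumed layout of the county-level CSV.
SAMPLE = """date,county,state,fips,cases,deaths
2021-12-20,Autauga,Alabama,01001,10,1
2021-12-21,Autauga,Alabama,01001,12,2
2021-12-21,New York City,New York,,50,5
"""


def latest_deaths(text):
    """Return the most recent cumulative death count per county."""
    latest = {}
    for row in csv.DictReader(io.StringIO(text)):
        # Fall back to the county name when the FIPS field is empty.
        key = row["fips"] or row["county"]
        # ISO dates (YYYY-MM-DD) sort correctly as plain strings.
        if key not in latest or row["date"] > latest[key][0]:
            latest[key] = (row["date"], int(row["deaths"] or 0))
    return {key: deaths for key, (_, deaths) in latest.items()}


deaths = latest_deaths(SAMPLE)
print(deaths)
```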

Note that the Covid data combines the five boroughs of NYC into a single "county". To account for this, the 2020 presidential votes from all five boroughs are merged into a single county; since we can't split the Covid deaths into individual boroughs, this is the best we can do. The fix follows the recommendation in upstream issue 105.
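The borough merge described above might look like the sketch below. The five FIPS codes are those of Bronx, Kings, New York, Queens, and Richmond counties; the `merge_nyc` helper, the `"New York City"` key, and the vote counts are hypothetical and not taken from main.py.

```python
# County FIPS codes for the five boroughs:
# Bronx, Kings (Brooklyn), New York (Manhattan), Queens, Richmond (Staten Island).
NYC_BOROUGH_FIPS = {"36005", "36047", "36061", "36081", "36085"}
NYC_KEY = "New York City"  # assumed label for the merged pseudo-county


def merge_nyc(votes):
    """Sum per-party votes of the five boroughs into one entry."""
    merged = {}
    for fips, counts in votes.items():
        key = NYC_KEY if fips in NYC_BOROUGH_FIPS else fips
        bucket = merged.setdefault(key, {})
        for party, n in counts.items():
            bucket[party] = bucket.get(party, 0) + n
    return merged


# Placeholder vote counts, not real results.
votes = {
    "36005": {"DEMOCRAT": 10, "REPUBLICAN": 2},
    "36047": {"DEMOCRAT": 20, "REPUBLICAN": 5},
    "01001": {"DEMOCRAT": 3, "REPUBLICAN": 7},
}
merged_votes = merge_nyc(votes)
print(merged_votes)
```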

Run

python main.py

(Takes ~45 seconds on my 2015-vintage laptop.)

More results

party bin	total Covid-19 deaths
Rep 80+%	38284
Rep 60–79%	211416
Rep 50–59%	123587
Dem 50–59%	196084
Dem 60–79%	210070
Dem 80+%	18331
unknown	5243
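One way to assign a county to the party bins in the table above is sketched below. The README does not state how the bins are computed, so this assumes the percentage is the winner's share of the two-party vote, that ties go to "Dem", and that counties without votes are "unknown"; `party_bin` is a hypothetical helper.

```python
def party_bin(dem_votes, rep_votes):
    """Label a county by the winning party's share of the two-party vote."""
    total = dem_votes + rep_votes
    if total == 0:
        return "unknown"
    if rep_votes > dem_votes:
        party, winner = "Rep", rep_votes
    else:
        party, winner = "Dem", dem_votes  # ties fall here by assumption
    pct = 100 * winner / total
    if pct >= 80:
        return f"{party} 80+%"
    if pct >= 60:
        return f"{party} 60–79%"
    return f"{party} 50–59%"


print(party_bin(10, 90))
print(party_bin(55, 45))
```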

Simply by party:

Dem: 424485
Rep: 373287
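As a quick consistency check, the per-bin totals in the table add up to the by-party totals above. The snippet only re-adds the published numbers; the variable names are illustrative.

```python
# Totals copied from the "party bin" table.
deaths_by_bin = {
    "Rep 80+%": 38284,
    "Rep 60–79%": 211416,
    "Rep 50–59%": 123587,
    "Dem 50–59%": 196084,
    "Dem 60–79%": 210070,
    "Dem 80+%": 18331,
    "unknown": 5243,
}

rep_total = sum(n for label, n in deaths_by_bin.items() if label.startswith("Rep"))
dem_total = sum(n for label, n in deaths_by_bin.items() if label.startswith("Dem"))
grand_total = sum(deaths_by_bin.values())

print(f"Dem: {dem_total}")      # Dem: 424485
print(f"Rep: {rep_total}")      # Rep: 373287
print(f"All: {grand_total}")    # All: 803015
```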

Owner

Ahmed Fasih