COVID-19 US map visualization by county — Python tutorial with Kepler.gl

Katz Nguyen
Oct 30, 2020

Kepler.gl is a powerful, open-source geospatial analysis tool released by Uber in 2018. It is a great tool for data scientists to explore, visualize, and analyze geospatial data, especially large-scale datasets.

Kepler.gl is FREE, lightweight, and runs in Jupyter Notebook.

Step 1: Install Kepler.gl

Open Terminal or Command line, then use pip to install kepler.gl.

$ pip install keplergl

Check your Jupyter-notebook version by using:

$ jupyter --version

If you are on a Mac and running Jupyter Notebook version 5.3 or above, you do not need to run the following commands:

$ jupyter nbextension install --py --sys-prefix keplergl
$ jupyter nbextension enable --py --sys-prefix keplergl
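If you prefer to check versions from inside Python, the sketch below uses the standard library's importlib.metadata (Python 3.8+) to look up what is installed. The helper names installed_version and is_at_least are my own, and the simple parsing assumes plain numeric versions such as 5.3 or 6.4.12.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of a package, or None if it is missing."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def is_at_least(version_string, minimum):
    """Compare the leading numeric parts of a version string against a tuple."""
    parts = []
    for piece in version_string.split("."):
        if not piece.isdigit():
            break  # stop at suffixes such as "0rc1"
        parts.append(int(piece))
    return tuple(parts) >= tuple(minimum)


# Report what is installed in the current environment
for package in ("keplergl", "notebook"):
    print(package, installed_version(package) or "not installed")

notebook_version = installed_version("notebook")
if notebook_version is not None and is_at_least(notebook_version, (5, 3)):
    print("Notebook is 5.3 or above")
```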

If you are using Jupyter Lab, you will also need to install the JupyterLab extension. This requires node > 10.15.0
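To see whether the node on your PATH meets that requirement, something like the sketch below could be used. It runs node --version and compares the result with 10.15.0; the helper names are hypothetical, and the comparison is strict because the requirement above is written as "greater than".

```python
import shutil
import subprocess


def parse_node_version(text):
    """Turn output such as 'v10.16.3' into a tuple like (10, 16, 3)."""
    core = text.strip().lstrip("v").split("-")[0]  # drop any pre-release suffix
    return tuple(int(part) for part in core.split("."))


def node_is_new_enough(text, minimum=(10, 15, 0)):
    """Return True if the reported version is strictly greater than the minimum."""
    return parse_node_version(text) > minimum


node_path = shutil.which("node")
if node_path is None:
    print("node was not found on PATH")
else:
    reported = subprocess.run(
        [node_path, "--version"], capture_output=True, text=True, check=True
    ).stdout
    print(reported.strip(), "OK" if node_is_new_enough(reported) else "too old")
```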

If you use Homebrew on Mac:

$ brew install node@10

Then install the JupyterLab extension:

$ jupyter labextension install @jupyter-widgets/jupyterlab-manager keplergl-jupyter

Step 2: Create a base map (no data needed)

import pandas as pd
import sys
import json
import csv
from keplergl import KeplerGl
# Create a basemap
map1 = KeplerGl(height=600, width=800)

Step 3: Add data into your map

Kepler.gl expects your data to be CSV, JSON, GeoJSON, a DataFrame, or a GeoDataFrame. You can call add_data multiple times to add multiple datasets to the same map.
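As a small illustration of two of these formats, the sketch below turns the same rows into a CSV string and a GeoJSON dictionary using only the standard library. The sample points and the helper names are made up for this example; they are not part of the county data used later.

```python
import csv
import io


def rows_to_csv_text(rows):
    """Serialize a list of dicts into one CSV string with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()), lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def rows_to_geojson(rows):
    """Wrap each row as a GeoJSON Point feature (coordinates are [longitude, latitude])."""
    features = [
        {
            "type": "Feature",
            "geometry": {"type": "Point", "coordinates": [row["longitude"], row["latitude"]]},
            "properties": {"name": row["name"]},
        }
        for row in rows
    ]
    return {"type": "FeatureCollection", "features": features}


# Made-up sample points, only for illustration
sample_rows = [
    {"name": "point_a", "latitude": 37.77, "longitude": -122.42},
    {"name": "point_b", "latitude": 40.71, "longitude": -74.01},
]
csv_text = rows_to_csv_text(sample_rows)
points_geojson = rows_to_geojson(sample_rows)
print(csv_text)
```

In the notebook, each result could then be added as its own dataset, for example map1.add_data(data=csv_text, name='points_csv') followed by map1.add_data(data=points_geojson, name='points_geojson').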

Step 3a: Add county border data

  • This step gathers the boundary of each US county as longitude and latitude coordinates.
  • Download the lightest shapefile of the US county map, cb_2018_us_county_20m.zip (<1.0MB), from the official US Census website.
  • Convert the shapefile into a GeoJSON file using the free tool below. Upload every file from the archive except the XML files, then export the result as a single GeoJSON file.

Tool: https://mapshaper.org/

  • Save the file as counties.geojson
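Before moving on, it can help to confirm that the exported file still carries the STATEFP and COUNTYFP properties that the merge in Step 3b relies on. The check below is a minimal sketch run on a tiny stand-in for counties.geojson; missing_property_count is a hypothetical helper name.

```python
import json


def missing_property_count(geojson, required=("STATEFP", "COUNTYFP")):
    """Count features that lack any of the required property names."""
    missing = 0
    for feature in geojson.get("features", []):
        properties = feature.get("properties") or {}
        if any(name not in properties for name in required):
            missing += 1
    return missing


# Tiny stand-in for counties.geojson (geometry left empty on purpose)
sample = json.loads("""
{"type": "FeatureCollection", "features": [
  {"type": "Feature", "geometry": null,
   "properties": {"STATEFP": "06", "COUNTYFP": "037", "NAME": "Los Angeles"}},
  {"type": "Feature", "geometry": null,
   "properties": {"NAME": "no codes here"}}
]}
""")
print(len(sample["features"]), "features,", missing_property_count(sample), "missing FIPS fields")
```

To run it on the real file, load counties.geojson with json.load and pass the result to missing_property_count.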

Step 3b: Add COVID-19 data to each county

  • You can use the New York Times comprehensive report of Coronavirus cases in the United States at the county level. This dataset has all the coronavirus cases and deaths for over 3200 counties starting from Jan 21st, 2020.

Link to dataset: https://raw.githubusercontent.com/nytimes/covid-19-data/master/us-counties.csv

  • Download and save it in the same folder as your Jupyter Notebook or Python files, then use the code below to read the file.

# Read the COVID-19 file
data = {}
geojson = {}
with open('us-counties.csv', 'r') as f:
    reader = csv.DictReader(f)
    for row in reader:
        # Skip rows that have no FIPS code
        if row['fips'] != '':
            date = str(row['date'])
            statefip = int(row['fips'][0:2])   # first two digits: state
            countyfip = int(row['fips'][2:5])  # last three digits: county
            cases = int(row['cases'])
            if statefip not in data:
                data[statefip] = {}
            if countyfip not in data[statefip]:
                data[statefip][countyfip] = []
            newdict = {
                'date': date,
                'cases': cases,
            }
            data[statefip][countyfip].append(newdict)
# Read the border data file and combine it with the COVID-19 file above
with open('counties.geojson', 'r') as f:
    geojson = json.load(f)

for feature in geojson['features']:
    featureProperties = feature['properties']
    statefp = int(featureProperties['STATEFP'])
    countyfp = int(featureProperties['COUNTYFP'])

    if data.get(statefp) is not None:
        # Later rows overwrite earlier ones, so each county keeps
        # the values from its last row in the CSV file
        for featureData in data.get(statefp).get(countyfp, []):
            for key in featureData.keys():
                feature['properties'][key] = featureData[key]

# Add data to the map
map1.add_data(data=geojson, name='counties2')
# Display the map
map1
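One detail of the merge above is easy to miss: every row for a county writes to the same date and cases properties, so each county ends up with the values from its last row in the file. The sketch below makes that explicit on a few made-up rows, keyed on the full five-digit FIPS code. latest_by_fips is a hypothetical helper, and the sample numbers are invented.

```python
import csv
import io

# Made-up rows in the same column layout as us-counties.csv
sample_csv = """date,county,state,fips,cases,deaths
2020-03-01,Los Angeles,California,06037,1,0
2020-03-02,Los Angeles,California,06037,7,0
2020-03-02,Unknown,Rhode Island,,3,0
"""


def latest_by_fips(csv_file):
    """Keep only the last row seen for each FIPS code, skipping rows without one."""
    latest = {}
    for row in csv.DictReader(csv_file):
        if row["fips"] == "":
            continue
        latest[row["fips"]] = {"date": row["date"], "cases": int(row["cases"])}
    return latest


latest = latest_by_fips(io.StringIO(sample_csv))

# The same slicing as in the tutorial code: state digits, then county digits
state_code, county_code = int("06037"[0:2]), int("06037"[2:5])
print(latest, state_code, county_code)
```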

Finally, style the map!
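The styling itself happens in the Kepler.gl side panel. If you want to keep the result, the widget exposes its current settings as a dictionary through map1.config, and that dictionary can be passed back in when a map is created. The sketch below only covers saving and reloading such a dictionary as JSON; save_config, load_config, the file name, and the example dictionary are placeholders of mine.

```python
import json
import os
import tempfile


def save_config(config, path):
    """Write a configuration dictionary to a JSON file."""
    with open(path, "w") as f:
        json.dump(config, f, indent=2)


def load_config(path):
    """Read a configuration dictionary back from a JSON file."""
    with open(path, "r") as f:
        return json.load(f)


# Placeholder standing in for the dictionary returned by map1.config
example_config = {"version": "v1", "config": {"mapStyle": {"styleType": "dark"}}}

config_path = os.path.join(tempfile.gettempdir(), "kepler_config.json")
save_config(example_config, config_path)
restored = load_config(config_path)
print(restored == example_config)
```

In the notebook this would look like save_config(map1.config, 'kepler_config.json') after styling, and later KeplerGl(height=600, data={'counties2': geojson}, config=load_config('kepler_config.json')).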

Conclusions

Even with large datasets (>100MB), Kepler.gl stays fast when loading data, interacting with the map, and filtering by date.

Next, we will discover how to visualize COVID-19 data over time in all of the US counties. Let us see the beautiful dynamics of data visualization.


We are a team of a data scientist and a designer. We love helping you explore data through beautiful visualizations using Python and other tools!