library(tigris) ## to get shape files
library(leaflet) ## making maps
library(sf) ## spatial wrangling
library(tidyverse) ## general utilities
How to get Philadelphia zcta boundaries with TIGRIS
Issue
do you have acces to the shape files for the philly neighborhoods? and would you be able to share?
Tigris package
tigris is an R package that allows users to directly download and use TIGER/Line shapefiles (https://www.census.gov/geographies/mapping-files/time-series/geo/tiger-line-file.html) from the US Census Bureau. Below are some useful links:
Package repository: https://github.com/walkerke/tigris
Book chapter from
Analyzing US Data
book: https://walker-data.com/census-r/census-geographic-data-and-applications-in-r.htmlPython version of tigris: https://walker-data.com/pygris/
Below we will go through how to use the tigris package to get ZIP Code Tabulation Areas (ZCTA) boundaries for Philadelphia County (42101).
Workflow
We will be using three packages
Lets first get two boundaries: 1) Philadelphia county and 2) All zcta in Philadelphia county.
= counties("Pennsylvania",
philadelphia_county_sf cb = TRUE,
class = "sf") %>%
filter(NAME == "Philadelphia") %>%
select(GEOID, NAME, geometry)
= zctas(filter_by = philadelphia_county_sf ) %>%
stg_philadelphia_zcta_sf select(ZCTA5CE20, geometry)
Lets map our results to make sure we have the correct things.
leaflet() %>%
addTiles() %>%
addPolygons(data = stg_philadelphia_zcta_sf,
fillOpacity = 0,
weight = 1,
color = "blue",
opacity = 1) %>%
addPolygons(data = philadelphia_county_sf,
fillOpacity = 0,
weight = 3,
color = 'black',
opacity = 1)
The map above shows Philadelphia County boundaries as black bold line. The CTA boundaries are the blue lines. We are almost there, it seems TIGRIS returned any zcta that is even touching Philadelphia county boundaries including zctas that are just on the edge. Lets get rid of those with some basic spatial wrangling below.
## Calculate overlap between each zcta and philadelphia county
= st_intersection(stg_philadelphia_zcta_sf,
xwalk_overlap %>%
philadelphia_county_sf) mutate(area_overlap = st_area(geometry)) %>%
as_tibble() %>%
select(ZCTA5CE20,area_overlap )
## calculate the proportion of each zcta that is in philadelphia
= stg_philadelphia_zcta_sf %>%
int_philadelphia_zcta_sf left_join(xwalk_overlap, by = "ZCTA5CE20") %>%
mutate(zcta_area = st_area(geometry),
overlap = as.numeric(area_overlap/zcta_area))
## We only want to remove those that have very little overlap. Lets keep anything with
## greater than 25% overlap
= int_philadelphia_zcta_sf %>%
philadelphia_zcta_sf filter(overlap > 0.25)
Let check our spatial wrangling worked.
leaflet() %>%
addTiles() %>%
addPolygons(data = philadelphia_county_sf,
fillOpacity = 0,
weight = 5,
color = 'black',
opacity = 0.4) %>%
addPolygons(data = philadelphia_zcta_sf,
fillOpacity = 0,
weight = 1,
color = "blue",
opacity = 1)
Looks great! Now we have boundaries for both the Philadelphia Country as well as all zcta within Philadelphia.