Part 1-Basic
2023-11-13
Introduction
Geo-spatial analysis is the process of analyzing information by studying features such as location, attributes, and the relationships between them, which helps us uncover patterns at a geographical level. For example, a company might want to look at how its retail stores are spread across a state and whether there is a white space opportunity available. Below are some other use cases for geo-spatial analysis.
- Map metrics such as footprint and store productivity for a given location to compare stores
- Identify white space opportunities
- Understand store coverage using catchment analysis (using constant time or constant area as the catchment metric)
- Competitor intelligence: compare your coverage with a competitor's by answering key questions such as where they are located, what their coverage is, whether there has been a shift in location preferences (CBD vs. heartland), etc.
In this blog, we will look at how to perform geo-spatial analysis through a step-by-step approach. We will discuss several use cases and how they answer key business questions.
Step 1: Libraries required in R
We need libraries such as sf, leaflet, and osrm to work with geo-spatial objects. Several of these libraries rely on OpenStreetMap, a free, open geographic database updated and maintained by a community of volunteers via open collaboration. Contributors collect data from surveys, trace from aerial imagery, and import from other freely licensed geodata sources.
Let's load the libraries.
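The chunk that loads the packages is not shown here, so the following is only a minimal sketch of one way to do it. The package list is taken from the console output; the helper `missing_pkgs()` is a hypothetical name introduced for illustration, and installation is only attempted in an interactive session.

```r
# Packages named in the console output of this step
pkgs <- c("tidyverse", "sf", "osrm", "leaflet", "RColorBrewer")

# Hypothetical helper: return the packages from `pkgs` that are not installed
missing_pkgs <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

# Install anything that is missing, but only when run interactively
to_install <- missing_pkgs(pkgs)
if (length(to_install) > 0 && interactive()) {
  install.packages(to_install)
}

# Attach each package; require() returns FALSE (with a warning) instead of
# stopping if a package is still unavailable
loaded <- vapply(pkgs, require, logical(1), character.only = TRUE)
```

Inspecting `loaded` afterwards shows which packages were attached successfully.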
## Loading required package: tidyverse
## ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.2 ──
## ✔ ggplot2 3.4.1 ✔ purrr 1.0.1
## ✔ tibble 3.1.8 ✔ dplyr 1.1.0
## ✔ tidyr 1.3.0 ✔ stringr 1.5.0
## ✔ readr 2.1.4 ✔ forcats 1.0.0
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
## Loading required package: sf
## Warning: package 'sf' was built under R version 4.2.3
## Linking to GEOS 3.9.3, GDAL 3.5.2, PROJ 8.2.1; sf_use_s2() is TRUE
## Loading required package: osrm
## Warning: package 'osrm' was built under R version 4.2.3
## Data: (c) OpenStreetMap contributors, ODbL 1.0 - http://www.openstreetmap.org/copyright
## Routing: OSRM - http://project-osrm.org/
## Loading required package: leaflet
## Warning: package 'leaflet' was built under R version 4.2.3
## Loading required package: RColorBrewer
Step 2: Defining the business problem
Before doing any analysis, it is very important to identify and define the business problem. This ensures that the analysis is precise and targeted.
Let's say that McDonald's wants to identify locations in Indonesia where it can open outlets. To approach this, we will use population data and narrow down to places that have high population density. This will be our starting point. At the end of the analysis, we should be able to name a few cities that can be considered for such expansion.
Indonesia has a total of 38 provinces, and each province has several cities and towns. For our analysis, we will look at the following:
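As a rough sketch of the idea of narrowing down by population density, the snippet below ranks areas by people per square kilometre. The data frame, its column names, and all figures are made up purely for illustration; they are not real census values.

```r
# Hypothetical figures for illustration only; not real population data
provinces <- data.frame(
  province   = c("Province A", "Province B", "Province C"),
  population = c(4300000, 10600000, 1200000),
  area_km2   = c(5780, 664, 153000),
  stringsAsFactors = FALSE
)

# Population density = people per square kilometre
provinces$density <- provinces$population / provinces$area_km2

# Sort from most to least dense and keep the top candidates
ranked <- provinces[order(-provinces$density), ]
top_candidates <- head(ranked$province, 2)
```

The same ranking can be repeated at city level once a province has been shortlisted.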
- Province level data
- Zooming in on city level info for each province
For instance, Bali is a province in Indonesia and it has a total of 8 cities as shown below:
Step 3: Get the shape file for Indonesia Province Level Map
Many BI experts who work in Tableau know that to plot a geo-heat map, we need a shape file. The shape file, which has all the geographical details needed to plot the map, is read into Tableau. However, in my experience, it is very difficult to find a shape file online when working in R. To get around this, we use geo-json or json files. These files are easily available in GitHub repos and can be read into R and converted into shape files.
Let's say we have to plot the population of various provinces in Indonesia. For this, we need province level geo-json or json objects that can be read into R.
If you search for this on Google, you might get the following options:
Open this repo
Access the geo-json file
Download the file
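Once a geo-json file is on disk, reading it into R can be sketched as below using `sf::st_read()`. To keep the example self-contained, a tiny geo-json with two made-up square polygons is written to a temporary file and stands in for the downloaded file; the names `provinces_sf` and `shp_path` and the property `name` are assumptions for illustration, and the real file will have its own properties.

```r
library(sf)

# Toy geo-json with two made-up squares; a stand-in for the downloaded file
geojson_txt <- '{
  "type": "FeatureCollection",
  "features": [
    {"type": "Feature",
     "properties": {"name": "Province A"},
     "geometry": {"type": "Polygon",
       "coordinates": [[[100,0],[101,0],[101,1],[100,1],[100,0]]]}},
    {"type": "Feature",
     "properties": {"name": "Province B"},
     "geometry": {"type": "Polygon",
       "coordinates": [[[101,0],[102,0],[102,1],[101,1],[101,0]]]}}
  ]
}'
path <- tempfile(fileext = ".geojson")
writeLines(geojson_txt, path)

# Read the geo-json into an sf object (one row per feature)
provinces_sf <- st_read(path, quiet = TRUE)

# Optionally write it back out as a shape file
shp_path <- file.path(tempdir(), "provinces.shp")
st_write(provinces_sf, shp_path, delete_dsn = TRUE, quiet = TRUE)
```

With the real download, `path` would simply point to the saved file, and `plot(st_geometry(provinces_sf))` gives a quick check that the boundaries look right.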