Maps in R: Plotting data points on a map

In the introductory post of this series I showed how to plot empty maps in R.
Today I'll begin to show how to add data to R maps. The topic of this post is the visualization of data points on a map.

We will use a couple of datasets from the OpenFlight website for our examples.
After loading the airports.dat file let's visualize the first few lines.

?View Code RSPLUS
> airports <- read.csv("http://openflights.svn.sourceforge.net/viewvc/openflights/openflights/data/airports.dat", header = FALSE)
> colnames(airports) <- c("ID", "name", "city", "country", "IATA_FAA", "ICAO", "lat", "lon", "altitude", "timezone", "DST")
> head(airports)
  ID                       name         city          country IATA_FAA ICAO       lat      lon altitude timezone DST
1  1                     Goroka       Goroka Papua New Guinea      GKA AYGA -6.081689 145.3919     5282       10   U
2  2                     Madang       Madang Papua New Guinea      MAG AYMD -5.207083 145.7887       20       10   U
3  3                Mount Hagen  Mount Hagen Papua New Guinea      HGU AYMH -5.826789 144.2959     5388       10   U
4  4                     Nadzab       Nadzab Papua New Guinea      LAE AYNZ -6.569828 146.7262      239       10   U
5  5 Port Moresby Jacksons Intl Port Moresby Papua New Guinea      POM AYPY -9.443383 147.2200      146       10   U
6  6                 Wewak Intl        Wewak Papua New Guinea      WWK AYWK -3.583828 143.6692       19       10   U

Latitude and longitude are reported for every airport in the dataset.
Let's draw the map of Europe with the help of rworldmap package, as was shown in the previous post on maps:

?View Code RSPLUS
> library(rworldmap)
> newmap <- getMap(resolution = "low")
> plot(newmap, xlim = c(-20, 59), ylim = c(35, 71), asp = 1)

Then we can easily lay the airports over the map:

?View Code RSPLUS
> points(airports$lon, airports$lat, col = "red", cex = .6)

Map of European Airports

Adding dimensions

In the introductory post I mentioned that ggmap actually builds on the ggplot graphics engine, thus all the strengths of ggplot are available when mapping data with ggmap.
Here I will show a couple of examples on how to take advantage of this.

Let's load another dataset from OpenFlights in R.

?View Code RSPLUS
> routes <- read.csv("http://openflights.svn.sourceforge.net/viewvc/openflights/openflights/data/routes.dat", header=F)
> colnames(routes) <- c("airline", "airlineID", "sourceAirport", "sourceAirportID", "destinationAirport", "destinationAirportID", "codeshare", "stops", "equipment")
> head(routes)
  airline airlineID sourceAirport sourceAirportID destinationAirport destinationAirportID codeshare stops equipment
1      2B       410           AER            2965                DME                 4029               0       CR2
2      2B       410           ASF            2966                LED                 2948               0       CR2
3      2B       410           CEK            2968                DME                 4029               0       CR2
4      2B       410           CEK            2968                KZN                 2990               0       CR2
5      2B       410           CEK            2968                OVB                 4078               0       CR2
6      2B       410           DME            4029                AER                 2965               0       CR2

Starting from the routes dataset, let's count the both number of routes departing from and arriving to a particular airport. I'm using another very useful package by Hadley Wickham for this task.

?View Code RSPLUS
> library(plyr)
> departures <- ddply(routes, .(sourceAirportID), "nrow")
> names(departures)[2] <- "flights"
> arrivals <- ddply(routes, .(destinationAirportID), "nrow")
> names(arrivals)[2] <- "flights"

Then, let's add the info on departing and arriving flights to the airports dataset (which contains the coordinates data.)

?View Code RSPLUS
> airportD <- merge(airports, departures, by.x = "ID", by.y = "sourceAirportID")
> airportA <- merge(airports, arrivals, by.x = "ID", by.y = "destinationAirportID")

The goal is now to plot the airports on the map of Europe as circles whose area is proportional to the number of departing flights.

The first step is to get the map from GoogleMaps (or one of the other available services), like was shown last time.

?View Code RSPLUS
> library(ggmap)
> map <- get_map(location = 'Europe', zoom = 4)

The following lines already get us quite close to producing the desired chart.

?View Code RSPLUS
> mapPoints <- ggmap(map) +
+   geom_point(aes(x = lon, y = lat, size = sqrt(flights)), data = airportD, alpha = .5)

The ggmap command prepares the drawing of the map. The geom_point function adds the layer of data points, as would be normally done in a ggplot. A thorough explanation of ggplot is well beyond the scope of this post, but here are quick details on what is passed to geom_point:
- aes indicates how aesthetics (points in this case) are to be generated; the lon variable is associated to the x axis, lat to y, and the size of the points is proportional to the value of the variable flights (actually to its square root;)
- data indicates the dataset where the variable passed to aes are to be found;
- the alpha parameter controls the transparency of the plotted points (some degree of transparency will make the overlapping circles distinguishable.)

And here's what appears on the R plotting window when one types mapPoints in the console.

European Airports and departing routes

A few tweaks to the legend (so that it does report the actual number of departures rather than the square root,) and the chart is ready for publication.

?View Code RSPLUS
> mapPointsLegend <- mapPoints +
+   scale_area(breaks = sqrt(c(1, 5, 10, 50, 100, 500)), labels = c(1, 5, 10, 50, 100, 500), name = "departing routes")
> mapPointsLegend

European Airports by departing routes

Once more, the map is a ggplot (type class(mapPoints) in your console to check) thus a nearly unlimited set of operations can be performed to improve it. For example, the number of departing flights could be portrayed by the color of the circles rather than their dimension.

As a final example for this post, I'll show the code to perform faceting. In other words we will have a couple of panels, one reporting the departing flights, the other the incoming ones.

?View Code RSPLUS
# create the data set containing both departures and arrivals
> airportD$type <- "departures"
> airportA$type <- "arrivals"
> airportDA <- rbind(airportD, airportA)
 
# map the data
# map + data points
> mapPointsDA <- ggmap(map) +
+   geom_point(aes(x = lon, y = lat, size = sqrt(flights)), data = airportDA, alpha = .5)
# adjust the legend
> mapPointsLegendDA <- mapPointsDA +
+   scale_area(breaks = sqrt(c(1, 5, 10, 50, 100, 500)), labels = c(1, 5, 10, 50, 100, 500), name = "routes")
# panels according to type (departure/arrival)
> mapPointsFacetsDA <- mapPointsLegendDA +
+   facet_grid(. ~ type)
# plot the map
> mapPointsFacetsDA


European Airports by departing and incoming routes

What's next

Next time we will deal with geographically aggregated data and how to display them in choropleth maps.

View (and download) the full code:

?Download maps2.R
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
airports <- read.csv("http://openflights.svn.sourceforge.net/viewvc/openflights/openflights/data/airports.dat", header = FALSE)
colnames(airports) <- c("ID", "name", "city", "country", "IATA_FAA", "ICAO", "lat", "lon", "altitude", "timezone", "DST")
head(airports)
 
library(rworldmap)
newmap <- getMap(resolution = "low")
plot(newmap, xlim = c(-20, 59), ylim = c(35, 71), asp = 1)
 
points(airports$lon, airports$lat, col = "red", cex = .6)
 
routes <- read.csv("http://openflights.svn.sourceforge.net/viewvc/openflights/openflights/data/routes.dat", header=F)
colnames(routes) <- c("airline", "airlineID", "sourceAirport", "sourceAirportID", "destinationAirport", "destinationAirportID", "codeshare", "stops", "equipment")
head(routes)
 
library(plyr)
departures <- ddply(routes, .(sourceAirportID), "nrow")
names(departures)[2] <- "flights"
arrivals <- ddply(routes, .(destinationAirportID), "nrow")
names(arrivals)[2] <- "flights"
 
airportD <- merge(airports, departures, by.x = "ID", by.y = "sourceAirportID")
airportA <- merge(airports, arrivals, by.x = "ID", by.y = "destinationAirportID")
 
library(ggmap)
map <- get_map(location = 'Europe', zoom = 4)
 
mapPoints <- ggmap(map) +
  geom_point(aes(x = lon, y = lat, size = sqrt(flights)), data = airportD, alpha = .5)
 
mapPointsLegend <- mapPoints +
  scale_area(breaks = sqrt(c(1, 5, 10, 50, 100, 500)), labels = c(1, 5, 10, 50, 100, 500), name = "departing routes")
mapPointsLegend
 
 
# create the data set containing both departures and arrivals
airportD$type <- "departures"
airportA$type <- "arrivals"
airportDA <- rbind(airportD, airportA)
 
# map the data
# map + data points
mapPointsDA <- ggmap(map) +
  geom_point(aes(x = lon, y = lat, size = sqrt(flights)), data = airportDA, alpha = .5)
# adjust the legend
mapPointsLegendDA <- mapPointsDA +
  scale_area(breaks = sqrt(c(1, 5, 10, 50, 100, 500)), labels = c(1, 5, 10, 50, 100, 500), name = "routes")
# panels according to type (departure/arrival)
mapPointsFacetsDA <- mapPointsLegendDA +
  facet_grid(. ~ type)
# plot the map
mapPointsFacetsDA

About Max Marchi

Statistician on infectious disease data for the Regione Emilia-Romagna by day. Sport data analyst for Baseball Prospectus by night.
This entry was posted in R and tagged , , , , . Bookmark the permalink.

6 Responses to Maps in R: Plotting data points on a map

  1. Robert says:

    I frequently use the argument "scale = free_y" when faceting regular plots. Does such an option exist when plotting maps to allow different panels to sort of zoom in on particular areas?

  2. andy south says:

    Hi Max,
    Nice post and great data source, thanks !

    Here's something else you could do with rworldmap using mapBubbles(). I just did this very quickly there are other potential refinements I may look at.

    #getting started with default options
    mapBubbles( airportA, nameX='lon', nameY='lat', nameZSize='flights', nameZColour='flights', catMethod='quantiles' )

    #refining
    mapBubbles( airportA, nameX='lon', nameY='lat', nameZSize='flights', nameZColour='flights', colourPalette='topo', catMethod='pretty', numCats=6, mapRegion='Europe', colourLegendPos='bottomright', addLegend=FALSE )

    #shameless plug !
    mtext("map made using rworldmap by Andy South", line=-1, side=3, adj=1, cex=0.6)

    If users want other hints on rworldmap they can look at my paper : http://journal.r-project.org/archive/2011-1/RJournal_2011-1_South.pdf.

    New developments coming soon.
    All the best,
    Andy

  3. kayhan says:

    Hi,

    Thanks you for your post. I think to learn how to draw a heat map on the map, I should wait for your next post :(

    Thanks again,

  4. Max Marchi says:

    Robert, for what I know, you can use all your regular ggplot stuff on the maps.

    Andy, thank you for your code.

    Kayhan, next time I'll show how to place colored polygons on maps. I'll explicitly deal with choropleth maps, but the process should be useful for overlaying heatmaps as well.
    Meanwhile you can look at the following links if you find something useful for you:
    http://www.r-bloggers.com/hurricane-sandy-land-wind-speed-and-kriging/
    http://spatialanalysis.co.uk/r/

    • kayhan says:

      Thanks Max,

      Let me tell you what I want to do:

      I have crime count (of 17 types) of Cincinnati over regular grid of size 0.02 x 0.02 (lat x lon). I would like to get the map (eg from Google) and overlay crime count over it while I can still see the underlying map. The crime count in each cell is an integer number, and the result should look like a heat map over the map of Cincinnati.

      Thanks for the link, I will look into them.

      • Max Marchi says:

        You can use ggmap to get the map (as illustrated in this post) and add layers that draw a level plot on top of it. Then play with the alpha parameter to find the desired transparency level.

        If the coordinates system and projection in your data is the same of GoogleMaps, that's it.

        This link (http://www.r-bloggers.com/displaying-data-using-level-plots/) has some code for producing level plots with ggplot2.

        Time permitting, I'll try to have a specific post on this, but I can't make promises on when it's happening.

        I'm taking suggestions for open data to use for the example (given it's MilanoR's blog, preferably Italian or European data.)

Leave a Reply