Trajectory Analysis in R

This is a re-post of my 2013 Google Summer of Code project with the 52° North community..

“This project aimed to develop an R package named trajectories that is specifically catered for trajectory analysis based on existing R     packages available such as sp, spacetime. Since the beginning of the project, progress has been made in three main aspects. First… read more
Advertisements

Map Zoom and Pan in Processing

Zooming and panning are very common functions in interactive maps and other similar image-viewing applications. In this blog post, I included a snippet of Processing code that allows zooming and panning on a map or other images using intuitive mouse gestures, which mimic a Google-Map-like experience.

Last semester, I took a programming course in Cartography Design, and one of the assignments was to develop an interactive map for the SDSU campus in Processing. We were given the satellite image of the campus to use as the base map. I thought implementing the zooming and panning functions would be something easy given the core functions in Processing, but it turned out I was wrong (just like me on a lot of other things) , or at least, it is not as easy as I thought to implement a decent version of them.

Here are my expectations for the zooming and panning functions:
– The classic mouse click-and-drag will pan the map;
– Scrolling the mouse wheel upward or downward will zoom in/out;
– The focus point of zooming in/out should be where the cursor is pointing at;
(Thanks to online mapping services like Google Map, the points above can be summarized as Google-Map-like zoom and pan)
– The map should always fully occupies the display window. In other words, the background behind the map should never be visible.

There are a few code examples for map/image zoom and pan in Processing (see herehere, and here). But they are either too old (i.e., Processing 1.x), or not meeting all my four expectations (maybe I was being too greedy..?).. So, I’m sharing my own take here, and hope you don’t have to deal with it yourself — hence, save more time and energy to develop some more amazing interactive map features.

A gif demo of the map zoom and pan in Processing (Click if not playing).

A gif demo of the map zoom and pan in Processing (Click if not playing).

/*Map Zoom and Pan in Processing
Author: Jinlong Yang
Email: jinlong.yang@mail.sdsu.edu
*/

/*When use the code, you may change
the zoomFactor to have finer/coarser
zooming, or change the maxScale to
limit the most detailed level of zooming
*/

//Define the map vars
PImage map;
int imgW;
int imgH;

int centerX;
int centerY;

//Define the zoom vars
int scale = 1;
int maxScale = 10;
float zoomFactor = 0.4;

//Define the pan vars
int panFromX;
int panFromY;

int panToX;
int panToY;

int xShift = 0;
int yShift = 0;

void setup() {
  map = loadImage("image.jpg");

  imgW = map.width;
  imgH = map.height;

  centerX = imgW / 2;
  centerY = imgH / 2;

  size(800, 600);
}

void draw(){
  background(0);
  imageMode(CENTER);
  image(map, centerX, centerY, imgW, imgH);
}

//Pan function
void mousePressed(){
  if(mouseButton == LEFT){
    panFromX = mouseX;
    panFromY = mouseY;
  }
}

//Pan function continued..
void mouseDragged(){
  if(mouseButton == LEFT){
    panToX = mouseX;
    panToY = mouseY;

    xShift = panToX - panFromX;
    yShift = panToY - panFromY;

    //Only pan with the image occupies the whole display
    if(centerX - imgW / 2 <= 0
    && centerX + imgW / 2 >= width
    && centerY - imgH / 2 <= 0
    && centerY + imgH / 2 >= height){
      centerX = centerX + xShift;
      centerY = centerY + yShift;
    }

    //Set the constraints for pan
    if(centerX - imgW / 2 > 0){
      centerX = imgW / 2;
    }

    if(centerX + imgW / 2 < width){
      centerX = width - imgW / 2;
    }

    if(centerY - imgH / 2 > 0){
      centerY = imgH / 2;
    }

    if(centerY + imgH / 2 < height){
      centerY = height - imgH / 2;
    }

    panFromX = panToX;
    panFromY = panToY;
  }
}

//Zoom function
void mouseWheel(MouseEvent event) {
  float e = event.getAmount();

  //Zoom in
  if(e == -1){
    if(scale < maxScale){
      scale++;
      imgW = int(imgW * (1+zoomFactor));
      imgH = int(imgH * (1+zoomFactor));

      int oldCenterX = centerX;
      int oldCenterY = centerY;

      centerX = centerX - int(zoomFactor * (mouseX - centerX));
      centerY = centerY - int(zoomFactor * (mouseY - centerY));
    }
  }

  //Zoom out
  if(e == 1){
    if(scale < 1){
      scale = 1;
      imgW = map.width;
      imgH = map.height;
    }

    if(scale > 1){
      scale--;
      imgH = int(imgH/(1+zoomFactor));
      imgW = int(imgW/(1+zoomFactor));

      int oldCenterX = centerX;
      int oldCenterY = centerY;

      centerX = centerX + int((mouseX - centerX)
      * (zoomFactor/(zoomFactor + 1)));
      centerY = centerY + int((mouseY - centerY)
      * (zoomFactor/(zoomFactor + 1)));

      if(centerX - imgW / 2 > 0){
        centerX = imgW / 2;
      }

      if(centerX + imgW / 2 < width){
        centerX = width - imgW / 2;
      }

      if(centerY - imgH / 2 > 0){
        centerY = imgH / 2;
      }

      if(centerY + imgH / 2 < height){
        centerY = height - imgH / 2;
      }
    }
  }
}

The code snippet above is also available at my GitHub. As a closing note, this code snippet is not working if you want to deploy a web version using the Processing.js. I will update this post if I figure out how to do it in JavaScript.

Visualizing Yelp User Check-ins with R

I came across a dataset released by Yelp for the Yelp Dataset Challenge, and I feel it’s fun to create an animated map to visualize the spatio-temporal patterns of user check-in records. Specifically, I want to make an animated map that visualizes the hourly check-in pattern within a day, which is kind of like the pulse of city reflected by Yelp users.

The dataset is a subset of Yelp’s data from the greater Phoenix, AZ metropolitan area, including business info, custom reviews, user check-in records, and user info. The dataset comes in four json files and can be download from here.

I wrote a script in R that parses the following information for each check-in record from the business.json and check-in.json:
– long;
– lat;
– hour;
– day of the week;
– number of check-ins within that hour;

And that gives me ~200,000 rows with ~800,000 check-in records in a csv file. The csv file (~10M) can be downloaded from this link.

To get started, we need to load the two R packages – ggmap (to retrieve the basemap as a static image) and animation (to save the animated map as a gif file)

#Install&load packages
install.packages("animation")
library("animation")

install.packages("ggmap")
library("ggmap")

Next, read in the csv file that parsed from the original json files.

#Would be much faster if you download and read it from local)
yelp <- read.csv("http://dl.dropboxusercontent.com/u/15662467/blogposts/yelp_checkins.csv", header = T)

#data preview
head(yelp)

Once have the data loaded, we need to the geographic extend of the data as a vector.

#Set the map extent to contain all check-in locations
extent <- c(min(yelp$long), min(yelp$lat), max(yelp$long), max(yelp$lat))

Now use the get_map() function from ggmap to query a static basemap that has the same geographic extend as the data, and then plot it as the basemap using ggmap() function.

Here, I chose the “watercolor” map style (type) from @Stamen as this map style doesn’t include any labels for regions or street names, which will help your readers focus on the trend of check-in patterns by removing all the text on basemap. And of course, I always enjoy the eye-candy design schemes from @Stamen

#Query the basemap from stamen and plot it
basemap <- get_map(location = extent, maptype = "watercolor", source = "stamen")
basemap <- ggmap(basemap, extent = "device")

#Draw the basemap
basemap

Okay, you got the basemap. What is left here is to plot the check-in points onto the basemap on an hourly basis. To plot the hourly check-in patterns programmatically (rather than manually plot the patterns 24 times for 24 hours) , we need some magic from the animation package. First, set up the output path for the gif file. I set it the same as my work dirctiory (getwd()).

ani.options(outdir = getwd())

Before the animation magic started, it’s a good idea to create a set of time labels to tell us which hour we are looking at before we get lost in the eternal animation loops.

#Construct the hourly labels
am <- paste(1:11, "AM", sep = "")
am <- append("12AM", am)
pm <- paste(1:11, "PM", sep = "")
pm <- append("12PM", pm)
hour_label <- append(am, pm)

Now, it’s the time for the animation package to do some magics. As a side note, the way that the animation package works is essentially the same as cel animation – you draw a bunch of images and play them sequentially at a fast pace. In R, you can use for/while loops to draw a set of graphics (frames), and the animation package will just convert them into a gif file in the order that they were being drawn.

#Create the GIF
saveGIF({
  #Set the first frame to the 0 (12am)
  hour<-0
  #Draw the check-in patterns for each hour using a while loop
  while(hour < 24) {
    #Subset the data for this hour
    this_hour <- yelp[yelp$time == hour, ]

    #Compose the map for this hour (basemap + points + time label)
    this_hour_map <- basemap +
    #Set long/lat and adjust the alpha value for each points to mitigate the overplotting problem
    geom_point(aes(x = long, y = lat, alpha = checkins/50),
    #Set the data to be plotted
    data = this_hour, col = "purple", size = 0.8) +
    #Plot the black shadow for time label
    geom_text(data = NULL, x = -111.45 + 0.003, y = 33.9 - 0.003,
              label = hour_label[hour + 1], color = "black",
              face = "bold", cex = 10) +
    #Plot the time label
    geom_text(data = NULL,x = -111.45, y = 33.9,
              label = hour_label[hour + 1], color = "#9A413B",
              face = "bold", cex = 10) +
    #Remove axis, ticks, etc.
    theme(axis.line=element_blank(), axis.text.x=element_blank(),
    axis.text.y=element_blank(), axis.ticks=element_blank(),
    axis.title.x=element_blank(), axis.title.y=element_blank(),
    panel.border = element_blank(),plot.background=element_blank(),
    legend.position="none")

    #Plot the map composed above
    plot(this_hour_map, xaxs = "i", yaxs = "i")

    #Increment to next hour
    hour <- hour + 1

    #Keep track of the current progress
    print(hour)
  }
}, movie.name = "yelp_hourly.gif", interval = 0.6, ani.width = 600, ani.height = 600)

Now, go to the path you set earlier and you will find the gif (shown below)

An animated map that shows the hourly check-in patterns of Yelp users in Phoenix, AZ (click for a better view).

An animated map that shows the hourly check-in patterns of Yelp users in Phoenix, AZ (Click for a better view).

Some reflection:
– The major problem with this map is overplotting because we are plotting ~800,000 check-in records within a fairly small area. I tried to mitigate this problem by assigning an alpha value to each point such that the alpha value for a single check-in would be 0.02. In other words, an opaque point represents a business with more than 50 check-ins. But there are definitely more ways to improve.
– You might have already noticed that the legend is missing for this map. Yes, I intentionally removed the default legend by setting “legend.position” to be “none”. You may switch the legend back, but the problem is that the legend is placed at the right side of the map by default, and it’s only possible to move it to top/bottom/left side, which will create white margins. So I decided to rather remove the legend as the main info I wanted to communicate here is the spatio-temporal patterns of the check-ins, and absolute numbers of check-ins is not essential here.
– To add some analytic aspect to this map, it would be nice to superimpose an animated histogram at the corner of the map, which is synchronized to highlight the number of check-ins for current hour.