Dijkstra Takes the Bus

My first post is a show-and-tell. This is a very small project I did a few years ago -- an illustration of Dijkstra's algorithm using transit data from the CTA in Chicago.

Dijkstra's algorithm is very straightforward to code, and I thought it would be neat to pair it with a visualization and with Chicago transit stops. There's quite a bit of data that needs to be downloaded, and I find that the animation works better in Google Chrome. With those caveats:

Click to load

Each dot on the map is a CTA stop. Click a stop to calculate all the travel times from that stop. After the computation begins, click another bus stop to see the current shortest path. Colors expand outward as the algorithm explores the network of bus stops in the city, finding paths and updating them. Starting from my house at 8:00am on a weekday, where can I get to in the next hour?

The legend in the upper right shows how stops are colored -- from green to red as they get further away from the origin -- and the white bar below the legend shows the progress of the graph search process.

I think this is a neat illustration of Dijkstra's algorithm, but it's not intended to actually be a trip planner (for one thing, the data is from a few years ago.) A nice side effect of doing Dijkstra along with a visualization is that you end up with an access map -- a view into how much of the city you can reach in a given amount of time, computed by your browser in realtime.

More about the algorithm

Dijkstra's algorithm is a way to iteratively discover the shortest paths in a graph, given a starting point. In pseudocode, it looks like this:

Given a graph G with nodes V and edges E:

Let v = starting node,
    S = the unvisited nodes = V,
    D(w) = distance from v to w, initially infinity for all w != v. D(v) = 0,
    N(w) = neighbors of w = { x : (w, x) in E },
    length(x, y) = length of the edge between x and y.

while S is not empty:
    w = the node in S with the smallest distance from v (initially, v).
    for each x in N(w) where x is unvisited:
        set D(x) = min(D(x), D(w) + length(w,x))
    Mark w visited and remove w from S.

Conceptually, the algorithm is simple. We start knowing nothing, except that the closest node to v is v itself. We look at nodes right next to v, and update what we know. Then, we move to nodes close to those nodes, rinse, lather, and repeat. We keep going until we find we've explored the entire graph.

Each node on the map is a transit stop. A node's neighbors can be stops that are further along on a route, or bus stops that are "walking distance." I'm using a dumb approximation for the amount of time to walk somewhere -- we calculate the distance between two points, as the crow flies, and then say we can walk that at 2 mph. Stops are only walkable if they're within a mile of each other.

Technical notes

The data comes from the CTA's General Transit Feed Specification, available here. The only pre-processing is that stop_times.txt was restricted to only arrivals and departures between 8:00am and 9:00am.

This project is built with D3 and heap, a Javascript priority queue implementation. It builds on my d3-mapzoom plugin, and my experiments with D3/Dijkstra implementations here and here.

This project is on Github here.