Using pgRouting to produce a tour of all MLB stadiums

The Traveling Salesperson Problem (TSP) is a well known problem with many applications.
One such application could be visiting all the baseball stadiums in Major League Baseball.
These type of trips (or subsets of it) are the passion of many people, and there are even websites devoted to planning the routes (disclaimer: I am the author of that fine site).

I was looking for an application to illustrate the functionality of the pgRouting module in PostgreSQL, and due to my familiarity with the baseball road trip, this seemed like a good fit.  Well, sort of…..note that one twist in visiting a baseball stadium on a roadtrip is that there is not always a game happening and ostensibly you would want to visit a stadium only when there was one. However , for this illustration, we will just try and calculate the best route for visiting all the parks exactly once regardless of whether there is a game happening. (again , go to BasesRoaded.com to do it only when games are being played !)

So without further ado, let’s tackle this with pgRouting.

PRELIMINARIES:

Before we are going to calculate anything based on the distances between stadiums, we would normally need  get the distances between all the stadiums. What we are calculating here is what is known as a distance matrix. For each of the 30 MLB teams, it will contain the distance from its stadium to each of the other 29 stadiums.  However, in this implementation, we will have pgRouting calculate the distances so we will skip this and only get the latitudes and longitudes of the stadiums.

Since I already have these steps completed due to my site BasesRoaded.com,  I will use them here. But in general, the steps to get them was something like:
1. Stadiums: Prepare list of all 30 stadiums.
2. Geocode:  Geocode this list to get the associated latitude and longitude for each stadium.
3. Directions: Use a direction API like Google or Mapquest to calculate the distances between all the stadiums. Note that we will not be doing this step here, as we will let pgRouting calculate the Euclidean distances using the lat/longs of the stadiums.

CREATE TABLE stadium
(
id smallint NOT NULL,
name character varying(50) NOT NULL,
"long" numeric(9,6) NOT NULL,
lat numeric(9,6) NOT NULL,
)

pgRouting:

I had some issues that I assume were my failing, but in the end I was able to use pgRouting to produce a tour of all the MLB stadiums. I was running PgRouting 2.2.2 within PostgreSQL 9.5.  In particular, working within pgAdmin I was not able to successfully nest  the _pgr_makeDistanceMatrix  function within the pgr_tsp call using a nested SQL call. I think this may have been due to escaping (or not escaping) single quotes,  but I wasn’t able to
figure that out. So rather than producing a distance matrix via SQL and essentially piping it into pgr_tsp, I produced the distance matrix, inserted into a table and then cut and paste the value into a pgr_tsp call.

Create and store the distance matrix

The following SQL will insert exactly one row into dm_tbl. The first column of the insert will be an NxN distance matrix.

insert into dm_tbl
Select dmatrix,ids from (
SELECT dmatrix,ids from _pgr_makeDistanceMatrix('SELECT id::int4, long::float8 x,lat::float8 y FROM stadium order by id')
) dm1

Run pgr_tsp to produce the tour

The pgRouting documentation had the example below of running pgr_tsp with a hard coded distance matrix. I just cut and pasted my distance matrix from the table above and
overlaid the simpler version below.  I had a hard time doing this from within pgAdmin, likely due to the size of pasting a large distance matrix into the SQL window. So I created a version of the following with the full distance matrix in a file and executed from the psql command line.

SELECT seq, id FROM pgr_tsp('{{0,1,2,3},{1,0,4,5},{2,4,0,6},{3,5,6,0}}'::float8[],1);

SELECT seq, id FROM pgr_tsp('INSERT FULL DISTANCE MATRIX HERE'::float8[],1);

Here are the results from the pgr_tsp call. After this, we will map the route using QGIS.

RESULTS:

seq | name
—–+——————————
0 | Wrigley Field
1 | Guaranteed Rate Field
2 | Comerica Park
3 | Progressive Field
4 | Rogers Centre
5 | Fenway Park
6 | Citi Field
7 | Yankee Stadium
8 | Citizens Bank Park
9 | Oriole Park at Camden Yards
10 | Nationals Park
11 | PNC Park
12 | Great American Ball Park
13 | SunTrust Park
14 | Marlins Park
15 | Tropicana Field
16 | Minute Maid Park
17 | Globe Life Park in Arlington
18 | Chase Field
19 | PETCO Park
20 | Angels Stadium of Anaheim
21 | Dodger Stadium
22 | OAC Coliseum
23 | ATT Park
24 | Safeco Field
25 | Coors Field
26 | Kauffman Stadium
27 | Busch Stadium
28 | Target Field
29 | Miller Park

 

Here is a map of the route generated in QGIS. The routes shown are actual driving routes. Note that this is not a closed loop as it doesn’t return to the starting point of Chicago from the final point of Milwaukee (though it would be easy enough to do so).