Skip to content

Commit

Permalink
added Knuth miles data and notebook
Browse files Browse the repository at this point in the history
  • Loading branch information
chapmanbe committed Nov 22, 2017
1 parent 01f2fb4 commit efe9533
Show file tree
Hide file tree
Showing 2 changed files with 352 additions and 0 deletions.
352 changes: 352 additions & 0 deletions modules/m16_graphs/InClass/KnuthMiles.ipynb
Original file line number Diff line number Diff line change
@@ -0,0 +1,352 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# A Colorful example of using NetworkX\n",
"## From the NetworkX Documentation\n",
"* Visualize size and distances between cities\n",
"* Uses 'miles_dat.txt.gz'\n",
"* Stores city location and population as attributes of graph\n",
"* Nodes are city names"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import warnings; warnings.simplefilter('ignore')"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"%matplotlib inline"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"import matplotlib.pyplot as plt\n",
"import networkx as nx\n",
"import re\n",
"import sys\n",
"import gzip\n",
"import copy\n",
"DATADIR = os.path.join(\"..\",\"Resources\")\n",
"os.path.exists(DATADIR)\n",
"import random"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### An example using networkx.Graph().\n",
"\n",
"miles_graph() returns an undirected graph over the 128 US cities from\n",
"the datafile miles_dat.txt. The cities each have location and population\n",
"data. The edges are labeled with the distance betwen the two cities.\n",
"\n",
"This example is described in Section 1.1 in Knuth's book [1,2].\n",
"\n",
"References.\n",
"\n",
"\n",
"1. Donald E. Knuth,\n",
" \"The Stanford GraphBase: A Platform for Combinatorial Computing\",\n",
" ACM Press, New York, 1993.\n",
"1. [Knuth Stanford Graph Database](http://www-cs-faculty.stanford.edu/~knuth/sgb.html)\n",
"\n",
"\n",
"### Aric Hagberg ([email protected])\n",
"\n",
"Copyright (C) 2004-2006 by \n",
"Aric Hagberg <[email protected]> \n",
"Dan Schult <[email protected]> \n",
"Pieter Swart <[email protected]> \n",
"All rights reserved. \n",
"BSD license. "
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Description\n",
"\n",
"This notebook provides a nice illustration of a very basic **undirected** graphic: Cities are nodes in the graphs and connections between cities are edges. To keep the graph \"pretty\" we are not providing edges between all the cities.\n",
"\n",
"#### Drawing Graphs\n",
"\n",
"NetworkX has a lot of built in drawing capabilities based on [``matplotlib``](http://matplotlib.org/). Defaults probably won't be very pretty, but if you delve into the details you can customize the drawing a lot. \n",
"\n",
"#### What does our data look like?\n",
"\n",
"For each City we have:\n",
"\n",
"* The name of the city\n",
"* The coordinates of the city\n",
"* The population of the city\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Functions and Methods in ``draw_graph``\n",
"\n",
"* ``plt.figure``: This creates a matplotlib figure. In the function we are specifiying the size of the figure (in inches?)\n",
"* In graph theory the [degree of a node (vertex)](https://en.wikipedia.org/wiki/Degree_(graph_theory)) is the number of edges connected to that node. NetworkX provides a function to return the degree of a node."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"def draw_graph(H, plot_name=\"plot.png\"):\n",
" \"\"\"\n",
" \n",
" \"\"\"\n",
"\n",
" plt.figure(figsize=(12,12))\n",
" # with nodes colored by degree sized by population\n",
" # List comprehension\n",
" node_color=[float(H.degree(v)) for v in H]\n",
" nx.draw(H,H.position,\n",
" node_size=[H.population[v] for v in H],\n",
" node_color=node_color,alpha=0.4,\n",
" with_labels=True,\n",
" font_size=8)\n",
"\n",
" # scale the axes equally\n",
" plt.xlim(-5000,500)\n",
" plt.ylim(-2000,3500)\n",
"\n",
" plt.savefig(plot_name)\n",
" plt.show()\n",
" "
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### NetworkX methods and functions in ``miles_graph``\n",
"\n",
"Graphs are objects and so we can add attributes to them. In this function we add two dictionaries to the graph ``G``: ``position`` and ``population``.\n",
"\n",
"Each graph also has a dictionary named ``graph`` that attributes could have been added to.\n",
"\n",
"* ``g.add_edge``\n",
"* ``g.add_node``"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"demo = nx.Graph()\n",
"print(demo.graph)\n",
"demo.graph[\"population\"] = {}\n",
"demo.graph[\"position\"] = {}\n",
"print(demo.graph)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"def miles_graph(fname):\n",
" \"\"\" Return the cites example graph in fname\n",
" from the Stanford GraphBase.\n",
" \"\"\"\n",
" # open file miles_dat.txt.gz (or miles_dat.txt)\n",
" G=nx.Graph()\n",
" G.position={}\n",
" G.population={}\n",
"\n",
" cities=[]\n",
" with gzip.open(fname,'r') as fh:\n",
"\n",
"\n",
" for line in fh.readlines():\n",
" line = line.decode()\n",
" if line.startswith(\"*\"): # skip comments\n",
" continue\n",
"\n",
" numfind=re.compile(\"^\\d+\")\n",
"\n",
" if numfind.match(line): # this line is distances\n",
" dist=line.split()\n",
" for d in dist:\n",
" G.add_edge(city,cities[i],weight=int(d))\n",
" i=i+1\n",
" else: # this line is a city, position, population\n",
" i=1\n",
" (city,coordpop)=line.split(\"[\")\n",
" cities.insert(0,city)\n",
" (coord,pop)=coordpop.split(\"]\")\n",
" (y,x)=coord.split(\",\")\n",
"\n",
" G.add_node(city)\n",
" # assign position - flip x axis for matplotlib, shift origin\n",
" G.position[city]=(-int(x)+7500,int(y)-3000)\n",
" G.population[city]=float(pop)/1000.0\n",
" return G"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### NetworkX methods and functions in ``main``\n",
"\n",
"NetworkX has a number of functions that allow us to ask basic questions about the graph such as:\n",
"\n",
"* [``nx.number_of_nodes(g)``:](https://networkx.github.io/documentation/latest/reference/generated/networkx.DiGraph.number_of_nodes.html) Returns the number of nodes in the graph ``g``\n",
"* [``nx.number_of_edges(g)``:](https://networkx.github.io/documentation/latest/reference/generated/networkx.classes.function.number_of_edges.html?highlight=number_of_edges) Returns the number of edges in the graph ``g``\n",
"* [``g.edges(data=True)``:](https://networkx.github.io/documentation/latest/reference/generated/networkx.classes.function.edges.html?highlight=edges#networkx.classes.function.edges) This method of the graph ``g`` returns a list of tuples where each tuple describes an edge.\n",
" * If ``data=True`` then the tuple is a 3-tuple ``(u,v,d)`` where ``u`` and ``v`` are the nodes defining the endpoints of the edge and ``d`` is a **dictionary** of the attributes (data) associated with that edge.\n",
" * If ``data=True`` then the tuple is a 2-tuple ``(u,v)``\n",
"* [``g.remove_edge(u,v)``:](https://networkx.github.io/documentation/latest/reference/generated/networkx.Graph.remove_edge.html?highlight=remove_edge) Removes the edge between ``u`` and ``v`` in the graph. If that edge is not present, raises an exception."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"def main(fname, max_dist = 300, plot_name = \"knuth_miles\"):\n",
" \"\"\"\n",
" \n",
" \"\"\"\n",
" G=miles_graph(fname)\n",
"\n",
" print(\"City graph has %d nodes and %d edges\"\\\n",
" %(nx.number_of_nodes(G),nx.number_of_edges(G)))\n",
"\n",
"\n",
" # make new graph of cites, edge if less then 300 miles between them\n",
" print(type(G.edges()))\n",
" for (u,v,d) in G.edges(data=True):\n",
" if d['weight'] >= max_dist:\n",
" G.remove_edge(u,v)\n",
"\n",
" # draw with matplotlib/pylab\n",
"\n",
" draw_graph(G,plot_name = \"%s_%04d.png\"%(plot_name,max_dist))\n",
" return G"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"city_graph = main(os.path.join(DATADIR,'miles_dat.txt.gz'),max_dist=300)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Shortest Paths through USA Cities"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"cities = city_graph.nodes()"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"cities"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"origin = random.choice(cities)\n",
"dest = random.choice(cities)\n",
"print(\"%s -> %s\"%(origin, dest))\n",
"weighted_path = nx.shortest_path(city_graph, origin, dest, weight=\"weight\")\n",
"unweighted_path = nx.shortest_path(city_graph, origin, dest)\n",
"print(weighted_path)\n",
"print(\"*\"*42)\n",
"print(unweighted_path)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## How could we compute the actual distance along the path?"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.5.2"
}
},
"nbformat": 4,
"nbformat_minor": 1
}
Binary file added modules/m16_graphs/Resources/miles_dat.txt.gz
Binary file not shown.

0 comments on commit efe9533

Please sign in to comment.