/** This class contains utilities for representing a graph (in the sense of a data structure having nodes and edges) and finding shortest paths through it. Each node is an arbitrary data structure that is Hashable (that is, can be put into a set.) Thus, a node can be a string, an integer, an array, or other data types, whichever is convenient. Each node should be a unique identifier. A Graph can be simply constructed by constructing a Graph and adding edges (bidirectional or directed). The nodes/vertices will be automatically constructed. g = new Graph g.addBidiEdge["A", "B", 5] g.addBidiEdge["B", "C", 2] g.addDirectedEdge["A", "D", 2] THINK ABOUT: Some of these algorithms, like those for finding the shortest path are dependent on the weights of the edges. For example, Dijkstra's algorithm cannot be used on a graph with negative edge weights, in which case you would use Bellman-Ford's algorithm. If the edges all have the same weight, then a breadth-first search would be faster than Dijkstra's algorithm. If you want to find all of the shortest distances between all pairs of points, you might want to use Floyd's algorithm, which can work for negative-cost edges but not negative-cost cycles. This code could keep track if the weights meet these criteria and automatically dispatch to different algorithms for different graphs. THINK ABOUT: Should the Graph class contain a flag to track if a graph contains directed and/or bidirectional edges? Querying this rapidly might useful. THINK ABOUT: Should the Graph class contain a flag to track if the weights of edges of a graph are: 1.) all the same 2.) all positive 3.) all integers 4.) all floats? This would allow, say, the shortestPath functions to dispatch to the fastest appropriate algorithm or dispatch to a Java-compiled faster "helper" algorithm (as is done for Floyd's algorithm if all edge weights are integers.) See GraphTest.frink for samples of solvers for various problems. Some of the implementations of these algorithms come from Steven S. Skiena, _The Algorithm Design Manual_, Second Edition. References to Skiena are to this book. */ class Graph { /** A set of nodes defined in the graph. */ var nodes = new set /** A dict> of out GraphEdges from each node. */ var outEdges = new dict /** These are constants defining if a node is undiscovered, discovered, or processed. The numerical constants are defined in the order they should be processed so you can do greater-than, less-than, less-than-or equals, etc. */ /** A undiscovered node. */ class var STATE_UNDISCOVERED = 0 /** A discovered node. */ class var STATE_DISCOVERED = 1 /** A processed node. */ class var STATE_PROCESSED = 2 /** Default constructor. This creates an empty graph. */ new[] := { } /** Copy constructor. */ new[orig is Graph] := { this.nodes = deepCopy[nodes] this.outEdges = deepCopy[outEdges] } /** Construct a new Graph from an adjacency matrix. The adjacency matrix should be a square matrix of [from, to] indices where the value at each cell is the weight of going from "from" to "to". undefValue indicates the value that is in the array if there is no connection between the nodes. For example, some adjacency matrixes represent no connection as 0 while others represent it as -1, some will represent it as undef, etc. This will identify each node as an integer from 0 to width-1. See GraphTest.frink for an example of its usage. */ class fromDirectedAdjacencyMatrix[matrix, undefValue] := { d = matrix.dimensions[] dim = d@0 if length[d] != 2 or d@0 != d@1 { println["Error: Graph.fromAdjacencyMatrix: adjacency matrix is not square."] } graph = new Graph for fromIdx = 0 to dim-1 for toIdx = 0 to dim-1 { weight = matrix@fromIdx@toIdx if weight != undefValue graph.addDirectedEdge[fromIdx, toIdx, weight] } return graph } /** Construct a new Graph from an adjacency matrix. The adjacency matrix should be a square matrix of [from, to] indices where the value at each cell is the weight of going from "from" to "to". undefValue indicates the value that is in the array if there is no connection between the nodes. For example, some adjacency matrixes represent no connection as 0 while others represent it as -1, some will represent it as undef, etc. This will identify each node as an integer from 0 to width-1. See GraphTest.frink for an example of its usage. */ class fromBidiAdjacencyMatrix[matrix, undefValue] := { d = matrix.dimensions[] dim = d@0 if length[d] != 2 or d@0 != d@1 { println["Error: Graph.fromAdjacencyMatrix: adjacency matrix is not square."] } graph = new Graph for fromIdx = 0 to dim-1 for toIdx = fromIdx+1 to dim-1 { weight = matrix@fromIdx@toIdx if weight != undefValue graph.addBidiEdge[fromIdx, toIdx, weight] else { weight = matrix@toIdx@fromIdx if weight != undefValue graph.addBidiEdge[fromIdx, toIdx, weight] } } return graph } /** Construct a new graph from a 2-dimensional array of weights where each node indicates a numeric weight that is incurred when entering the node in the grid. For example, this can be used to solve 4-directional mazes. undefValue indicates the value that is in the array if there is no connection between the nodes. For example, some adjacency matrixes represent no connection as 0 while others represent it as -1, etc. Each node is a [row, col] array where the row and col are zero-based integers. So, the top-left node is the array [0, 0]. See GraphTest.frink for examples of its usage. */ class from4WayGrid[grid, undefValue] := { [rows, cols] = grid.dimensions[] graph = new Graph for row = 0 to rows-1 for col = 0 to cols-1 { weight = grid@row@col if weight != undefValue { if row > 0 graph.addDirectedEdge[[row-1, col], [row, col], weight] if col > 0 graph.addDirectedEdge[[row, col-1], [row, col], weight] if row < rows-1 graph.addDirectedEdge[[row+1, col], [row, col], weight] if col < cols-1 graph.addDirectedEdge[[row, col+1], [row, col], weight] } } return graph } /** Returns a set of out GraphEdges for the specified node. */ getOutEdges[node] := { return outEdges@node } /** Returns a set of all bidirectional edges in the graph. Each element of the set is an [node1, node2, weight] array. The nodes are ordered by the hashcode of the nodes since node identifiers are not required to be comparable to each other. */ getBidiEdges[] := { result = new set for node1 = nodes for edge = getOutEdges[node1] { if edge.isBidi[] { node2 = edge.getOutNode[] if hashCode[node1] <= hashCode[node2] result.put[ [node1, node2, edge.getWeight[]] ] else result.put[ [node2, node1, edge.getWeight[]] ] } } return result } /** Returns a set of the directed-only edges in the graph. This will not return bidirectional edges. Each element of the set is an [fromNode, toNode, weight] array. */ getDirectedEdges[] := { result = new set for node1 = nodes for edge = getOutEdges[node1] if edge.isDirected[] result.put[ [node1, edge.getOutNode[], edge.getWeight[]] ] return result } /** Returns an set of all edges in the graph. Each element of the set is an [inNode, outNode, weight] array. A bidirectional edge is represented as two edges, one in each direction. If you want to know if an edge is bidirectional, and only handle it once, see getAllEdgesWithFlags[] below. */ getAllEdges[] := { result = new set for node1 = nodes for edge = getOutEdges[node1] result.put[ [node1, edge.getOutNode[], edge.getWeight[]] ] return result } /** Returns an set of all edges in the graph. Each element of the set is an [inNode, outNode, weight, bidi] array. A bidirectional edge is represented as a single edge with the bidi flag set to true. A directed edge is represented as a single edge with the bidi flag set to false. */ getAllEdgesWithFlags[] := { result = new set for [node1, node2, weight] = getBidiEdges[] result.put[ [node1, node2, weight, true ] ] for [node1, node2, weight] = getDirectedEdges[] result.put[ [node1, node2, weight, false ] ] return result } /** Adds a node (possibly without any out edges) to this graph. */ addNode[node] := { nodes.put[node] } /** Gets the weight of the edge from fromNode to toNode or undef if it does not exist. */ getWeight[fromNode, toNode] := { for edge = outEdges@fromNode if edge.getOutNode[] == toNode return edge.getWeight[] return undef } /** Adds a bidirectional edge between node1 and node2 to this graph (and makes sure the nodes are added to the collection of nodes.) */ addBidiEdge[node1, node2, weight=1] := { addNode[node1] addNode[node2] outEdges.addToSet[node1, new GraphEdge[node2, weight, true]] outEdges.addToSet[node2, new GraphEdge[node1, weight, true]] } /** Add a directional (one-way) edge between node1 and node2 to this graph (and making sure the nodes are added to the collection of nodes.) */ addDirectedEdge[node1, node2, weight=1] := { addNode[node1] addNode[node2] outEdges.addToSet[node1, new GraphEdge[node2, weight, false]] } /** Returns a set of all the nodes in this graph. */ getNodes[] := { return nodes } /** Returns the number of nodes in the graph. */ getNodeCount[] := { return length[nodes] } /** Returns a set of nodes that are not connected to any other nodes in the graph. */ getDisconnectedNodes[] := { connected = new set for [node, edges] = outEdges { connected.put[node] for edge = edges connected.put[edge.getOutNode[]] } return setDifference[nodes, connected] } /** Invert this graph by creating a duplicate of this graph with directed edges pointing in the opposite direction. Bidirectional edges are duplicated and still pointing in both directions. */ invert[] := { g = new Graph for node = nodes { g.addNode[node] for edge = getOutEdges[node] { if edge.isBidi[] g.addBidiEdge[node, edge.getOutNode[], edge.getWeight[]] else // Flip edge g.addDirectedEdge[edge.getOutNode[], node, edge.getWeight[]] } } return g } /** This calculates all shortest paths for all pairs of nodes in the graph using Floyd's algorithm. This implements a solver for Floyd's algorithm, also called Floyd-Warshall which finds *all* the shortest distances through a graph between all pairs of nodes in a graph. It runs in O(n^3) time which might sound theoretically no better than n calls to Dijkstra's algorithm (which is O(n^2) and only finds the best path between two nodes) but these loops are extremely simple and tight, which makes it efficient. Floyd's algorithm is also much more efficient than n calls to Dijkstra's algorithm because of the highly-repeated nature of shortest paths through a graph. However, the algorithm as written only finds the *lengths* of the paths between nodes and not the paths themselves, which is sometimes all you need, which also makes it much faster than an implementation of n calls to Dijkstra's algorithm that also finds the paths. See The Algorithm Design Manual, Stephen Skiena, section 6.3.2. (but don't trust Skiena's implementation or advice on using MAXINT.) The actual implementation of this delegates to a highly-optimized Java version that assumes that edges have integer weights: https://futureboy.us/temp/Floyd.java The result value is a GraphAllDistances class (see below) which has accessor methods to find the distances between pairs of values or turn the results into a dictionary. If you only need all the distances from *one* node to all other reachable nodes (and not all pairs of nodes), use the allShortestDistances[startNode] method which takes a startNode argument and uses Dijkstra's algorithm. */ allShortestDistances[] := { // This calls a highly-optimized version of Floyd's algorithm in Java // that assumes weights are integers. It modifies the adjOut matrix // in place. [indexToNode, nodeToIndex, adjOut] = toIntAdjacencyMatrix[-1] callJava["frink.function.Floyd", "shortestDistances", [adjOut, -1]] return new GraphAllDistances[indexToNode, nodeToIndex, adjOut] } /** Makes a 2-dimensional Java array of integer weights representing an adjacency graph of [from][to] weights. */ toIntAdjacencyMatrix[disconnectedValue] := { i = 0 indexToNode = new dict nodeToIndex = new dict for n = nodes { indexToNode@i = n nodeToIndex@n = i i = i + 1 } adj = newJavaArray["int", [i,i], disconnectedValue] for [node1, node2, weight] = getBidiEdges[] { i1 = nodeToIndex@node1 i2 = nodeToIndex@node2 adj@i1@i2 = weight adj@i2@i1 = weight } for [node1, node2, weight] = getDirectedEdges[] { i1 = nodeToIndex@node1 i2 = nodeToIndex@node2 adj@i1@i2 = weight } return [indexToNode, nodeToIndex, adj] } /** This inverts the toIntAdjacencyMatrix call and returns a dictionary where the key is an array [node1,node2] and the value is the weight. */ class fromIntAdjacencyMatrix[indexToNode, nodeToIndex, adj, disconnectedValue] := { size = length[adj] dist = new dict for row = 0 to size-1 { rnode = indexToNode@row for col = 0 to size-1 { cnode = indexToNode@col val = adj@row@col if val != disconnectedValue dist@([rnode,cnode]) = val } } return dist } /** Calculate the shortest path from startNode to endNode using Dijkstra's algorithm. Weights of edges cannot be negative. If edge weights are negative, you will need to use findShortestPathBellmanFord. The result value is a GraphPath (see below) which contains the shortest path as a list of nodes and a shortest distance. If no path is found betwen startNode and endNode, this returns undef. */ findShortestPath[startNode, endNode] := { // Degenerate case, bail out quickly if startNode == endNode return GraphPath.makeDegeneratePath[startNode] // Sanity check to ensure that the graph actually *contains* the start // and end node. if ! nodes.contains[startNode] { println["Graph.findShortestPath: Graph does not contain start node " + startNode] return undef } if ! nodes.contains[endNode] { println["Graph.findShortestPath: Graph does not contain end node " + endNode] return undef } /** A set of nodes that have been visited. */ visited = new set /** A dictionary of distances from node1 to the specified node. The key is the specified node and the value is the distance to that node. */ distance = new dict /** This is an OrderedList list of the nodes that have already been processed and their distances from the start node. It is kept in order with the smallest distances at the end so they will get popped efficiently. Each element is an array [node, dist] pair. TODO: Implement a heap (perhaps a Fibonacci heap or binary heap) for faster insertion? (Update: This was tested and made no appreciable improvement over OrderedList.) */ shortestList = new OrderedList[{|a,b| b@1 <=> a@1}] /** This is a dictionary of previous nodes for each node. The key is the node and the value is the previous node that has the shortest path to this node. This is eventually used to find the shortest path by enumerating (backwards) from endNode to startNode. */ previousNodes = new dict // Initialize the distance to the start node to 0. distance@startNode = 0 // currNode represents the current node we are visiting. currNode = startNode VISITED: while ! visited.contains[currNode] // While current node is not visited { visited.put[currNode] // Mark the current node as visited // Find the shortest already-found distance from startNode to currNode currDist = distance@currNode for outEdge = outEdges@currNode { // The prospective node we're examining nextNode = outEdge.getOutNode[] // The weight from currNode to nextNode weight = outEdge.getWeight[] // The total distance from startNode to the next node potentialDist = currDist + weight // Look up shortest distance fron start to nextNode, if it exists nextDist = distance@nextNode // Is the other node unchecked (which means nextDist should be // treated as infinite) or this new distance shorter? if nextDist == undef or potentialDist < nextDist { // Update the shortest-possible distance to nextNode distance@nextNode = potentialDist previousNodes@nextNode = currNode shortestList.insert[ [nextNode, potentialDist] ] } } // We have reached the end node. We can bail out early. // THINK ABOUT: Should we continue calculating and cache all of the // results which will return all of the shortest distances // (in the variable distance) to all nodes from startNode? if currNode == endNode break VISITED // Never found and we have no more nodes to evaluate. if shortestList.isEmpty[] return undef // Pop the node with the next shortest distance currNode = (shortestList.pop[])@0 } return GraphPath.makeFromPredecessors[previousNodes, startNode, endNode, distance@endNode] } /** Calculate all the shortest distances (but not paths) from startNode to all the reachable nodes in the graph using Dijkstra's algorithm. Weights of edges cannot be negative. The algorithm only finds the *lengths* of the paths between nodes and not the paths themselves, which is sometimes all you need, The result value is a dict of pairs. If no path is found betwen startNode and endNode, the dictionary will not have a mapping and querying the dictionary for that node will return undef. If you need all the distances between *all* pairs of nodes, Floyd's algorithm with no arguments allShortestDistances[] will be faster than calling this n times. */ allShortestDistances[startNode] := { // Sanity check to ensure that the graph actually *contains* the start // node. if ! nodes.contains[startNode] { println["Graph.allShortestDistances: Graph does not contain start node " + startNode] return undef } /** A set of nodes that have been visited. */ visited = new set /** A dictionary of distances from node1 to the specified node. The key is the specified node and the value is the distance to that node. */ distance = new dict /** This is an OrderedList list of the nodes that have already been processed and their distances from the start node. It is kept in order with the smallest distances at the end so they will get popped efficiently. Each element is an array [node, dist] pair. TODO: Implement a heap (perhaps a Fibonacci heap or binary heap) for faster insertion? (Update: This was tested and made no appreciable improvement over OrderedList.) */ shortestList = new OrderedList[{|a,b| b@1 <=> a@1}] // Initialize the distance to the start node to 0. distance@startNode = 0 // currNode represents the current node we are visiting. currNode = startNode VISITED: while ! visited.contains[currNode] // While current node is not visited { visited.put[currNode] // Mark the current node as visited // Find the shortest already-found distance from startNode to currNode currDist = distance@currNode for outEdge = outEdges@currNode { // The prospective node we're examining nextNode = outEdge.getOutNode[] // The weight from currNode to nextNode weight = outEdge.getWeight[] // The total distance from startNode to the next node potentialDist = currDist + weight // Look up shortest distance fron start to nextNode, if it exists nextDist = distance@nextNode // Is the other node unchecked (which means nextDist should be // treated as infinite) or this new distance shorter? if nextDist == undef or potentialDist < nextDist { // Update the shortest-possible distance to nextNode distance@nextNode = potentialDist shortestList.insert[ [nextNode, potentialDist] ] } } // Never found and we have no more nodes to evaluate. if shortestList.isEmpty[] return distance // Pop the node with the next shortest distance currNode = (shortestList.pop[])@0 } return distance } /** This finds the shortest path through a graph using the Bellman-Ford algorithm. Unlike Dijkstra's algorithm, Bellman-Ford will work on graphs with negative weights but not negative-weight cycles. This is probably slower than Dijkstra's algorithm in most cases as it runs in time O(numVertices * numEdges). The result value is a GraphPath (see below) which contains the shortest path as a list of nodes and a shortest distance. If no path is found betwen startNode and endNode, or if there is a negative-weight cycle, this returns undef. */ findShortestPathBellmanFord[startNode, endNode] := { // Degenerate case, bail out quickly if startNode == endNode return GraphPath.makeDegeneratePath[startNode] // Sanity check to ensure that the graph actually *contains* the start // and end node. if ! nodes.contains[startNode] { println["Graph.findShortestPathBellmanFord: Graph does not contain start node " + startNode] return undef } if ! nodes.contains[endNode] { println["Graph.findShortestPathBellmanFord: Graph does not contain end node " + endNode] return undef } /** A dictionary of distances from node1 to the specified node. The key is the specified node and the value is the distance to that node. */ distance = new dict // Initialize the distance to the first node to 0. distance@startNode = 0 /** This is a dictionary of previous nodes for each node. The key is the node and the value is the previous node that has the shortest path to this node. This is eventually used to find the shortest path by enumerating (backwards) from endNode to startNode. */ previousNodes = new dict // Step 2: Relax all edges length[vertices] - 1 times. A simple // shortest path from startNode to any other vertex can // have at-most length[vertices] - 1 edges numVertices = length[nodes] for i = 1 to numVertices-1 { for currNode = nodes { // The distance from startNode to the current node currDist = distance@currNode for currEdge = outEdges@currNode { nextNode = currEdge.getOutNode[] weight = currEdge.getWeight[] if currDist != undef { nextDist = distance@nextNode // The total distance from startNode to the next node potentialDist = currDist + weight // Is the other node unchecked or this new distance shorter? if nextDist == undef or potentialDist < nextDist { // Update the shortest-possible distance to nextNode distance@nextNode = potentialDist previousNodes@nextNode = currNode } } } } } // Step 3: check for negative-weight cycles. The above // step guarantees shortest distances if graph doesn't // contain negative weight cycle. If we get a shorter // path, then there is a cycle. for currNode = nodes { // The distance from startNode to the current node currDist = distance@currNode for currEdge = outEdges@currNode { outNode = currEdge.getOutNode[] weight = currEdge.getWeight[] // Negative cycle found. nextDist = distance@outNode if currDist != undef and nextDist != undef and currDist + weight < nextDist return undef } } return GraphPath.makeFromPredecessors[previousNodes, startNode, endNode, distance@endNode] } /** This finds the shortest path through a graph using a breadth-first search. This only works on graphs with identical weights. If the edges do not have identical weights, you must use Dijkstra's algorithm or, if the graph contains negative weights, you must use Bellman-Ford's algorithm. This works in linear time on a graph with identical weights so it is faster than Dijkstra's algorithm. The result value is a GraphPath (see below) which contains the shortest path as a list of nodes and a shortest distance. If no path is found betwen startNode and endNode, this returns undef. TODO: Check to see that all edges have weight 1 and don't use this algorithm if any weights are different? See Skiena section 5.6 */ findShortestPathBreadthFirst[startNode, endNode] := { // Degenerate case, bail out quickly if startNode == endNode return GraphPath.makeDegeneratePath[startNode] // Sanity check to ensure that the graph actually *contains* the start // and end node. if ! nodes.contains[startNode] { println["Graph.findShortestPathBreadthFirst: Graph does not contain start node " + startNode] return undef } if ! nodes.contains[endNode] { println["Graph.findShortestPathBreadthFirst: Graph does not contain end node " + endNode] return undef } // A queue of nodes to visit. queue = new array // A dictionary containing the states of the nodes. undef should be // treated as STATE_UNDISCOVERED states = new dict queue.push[startNode] states@startNode = STATE_DISCOVERED /** This is a dictionary of previous nodes for each node. The key is the node and the value is the previous node that has the shortest path to this node. This is eventually used to find the shortest path by enumerating (backwards) from endNode to startNode. */ previousNodes = new dict while ! queue.isEmpty[] { currNode = queue.popFirst[] // Bail out early if we've found the end node. if currNode == endNode return GraphPath.makeFromPredecessors[previousNodes, startNode, endNode] states@currNode = STATE_PROCESSED for outEdge = outEdges@currNode { nextNode = outEdge.getOutNode[] nextState = states.get[nextNode, STATE_UNDISCOVERED] if nextState < STATE_DISCOVERED { queue.push[nextNode] states@nextNode = STATE_DISCOVERED previousNodes@nextNode = currNode } } } // No path found and out of nodes to evaluate. return undef } /** Calculate the shortest path from startNode to endNode using the A* (pronounced "A-star") algorithm. This is a heuristic search to which you must provide a heuristic function that estimates the distance to the goal. The function heuristicFunction is a function that takes a node and some optional arbitrary data and returns an estimate of the distance to endNode. For example, a heuristic function that finds the Manhattan distance between a specified node and a target node that is specified as [row,col] in the data can be called as: f = {|node,data| [row, col] = node [trow, tcol] = data return abs[trow-row] + abs[tcol-col] } gp = graph.findShortestPathAStar[ [0,0], [rows-1, cols-1], f, [rows-1,cols-1] ] The result value is a GraphPath (see below) which contains the shortest path as a list of nodes and a shortest distance. If no path is found betwen startNode and endNode, this returns undef. See: https://web.archive.org/web/20171022224528/https://www.policyalmanac.org/games/aStarTutorial.htm https://theory.stanford.edu/~amitp/GameProgramming/ImplementationNotes.html https://www.redblobgames.com/pathfinding/a-star/implementation.html Please note that there is VERY LITTLE SIMILARITY in the implementations listed in the last two links (which are written by the same guy?!) */ findShortestPathAStar[startNode, endNode, heuristicFunction, data=undef] := { // Degenerate case, bail out quickly if startNode == endNode return GraphPath.makeDegeneratePath[startNode] // Sanity check to ensure that the graph actually *contains* the start // and end node. if ! nodes.contains[startNode] { println["Graph.findShortestPathAStar: Graph does not contain start node " + startNode] return undef } if ! nodes.contains[endNode] { println["Graph.findShortestPathAStar: Graph does not contain end node " + endNode] return undef } /** For node n, fScore[n] := distance[n] + h(n). fScore[n] represents our current best guess as to how cheap a path could be from start to finish if it goes through n. */ fScore = new dict /** This is an OrderedList list of the nodes that have already been processed and their distances from the start node. It is kept in order with the smallest distances at the end so they will get popped efficiently. This uses the fScore dictionary to order the items. TODO: Implement a heap (perhaps a Fibonacci heap or binary heap) for faster insertion? */ shortestList = new OrderedList[{|a,b,fDict| fDict@b <=> fDict@a}, fScore] /** A dictionary of distances from startNode to the specified node. The key is the specified node and the value is the distance to that node. This is called gNode in some literature. */ distance = new dict /** A set of closed nodes. */ closed = new set /** This is a dictionary of previous nodes for each node. The key is the node and the value is the previous node that has the shortest path to this node. This is eventually used to find the shortest path by enumerating (backwards) from endNode to startNode. */ previousNodes = new dict // Initialize the distance to the first node to 0. distance@startNode = 0 fScore@startNode = heuristicFunction[startNode, data] shortestList.insert[ startNode ] while ! shortestList.isEmpty[] { currNode = shortestList.pop[] // Found solution, bail out. if currNode == endNode return GraphPath.makeFromPredecessors[previousNodes, startNode, endNode, distance@endNode] // Find the shortest already-found distance from startNode to currNode currDist = distance@currNode for outEdge = outEdges@currNode { // The prospective node we're examining nextNode = outEdge.getOutNode[] // The weight from currNode to nextNode weight = outEdge.getWeight[] // The total distance from startNode to the next node potentialDist = currDist + weight // Look up shortest distance fron start to nextNode, if it exists nextDist = distance@nextNode // Is the other node unchecked (which means nextDist should be // treated as infinite) or this new distance shorter? if nextDist == undef or potentialDist < nextDist { // Update the shortest-possible distance to nextNode distance@nextNode = potentialDist previousNodes@nextNode = currNode fScore@nextNode = potentialDist + heuristicFunction[nextNode, data] shortestList.insert[ nextNode ] } } } // If we reached here, destination node was not found return undef } /** Find a minimal spanning tree using Prim's algorithm. This is very similar to Dijkstra's algorithm. Only a couple of lines are changed. The result is a new Graph representing the minimum spanning tree. If the directed flag is true [default is false], only directed edges will be created, emanating from startNode. Otherwise, all edges will be bidirectional. This *will* work to find the maximum spanning tree by negating all of the weights. Most graph algorithms do not have this property. This can find the minimum product spanning tree, which is where we find the tree that represents the minimal *product* of weights. To do this, take the logarithm of all the weights. See Skiena section 6.1.1 and 6.3.1 */ minimumSpanningTree[startNode, directed=false] := { // Sanity check to ensure that the graph actually *contains* the start // and end node. if ! nodes.contains[startNode] { println["Graph.minimumSpanningTree: Graph does not contain start node " + startNode] return undef } // TODO: Handle degenerate cases with zero one node (copy this graph.) /** A set of nodes that have been visited. */ visited = new set /** A dictionary of distances from node1 to the specified node. The key is the specified node and the value is the distance to that node. */ distance = new dict /** This is an OrderedList list of the nodes that have already been processed and their distances from the start node. It is kept in order with the smallest distances at the end so they will get popped efficiently. Each element is an array [node, dist] pair. TODO: Implement a heap (perhaps a Fibonacci heap or binary heap) for faster insertion? */ shortestList = new OrderedList[{|a,b| b@1 <=> a@1}] /** This is a dictionary of previous nodes for each node. The key is the node and the value is the previous node that has the shortest path to this node. This is eventually used to find the shortest path by enumerating (backwards) from endNode to startNode. */ previousNodes = new dict // Initialize the distance to the first node to 0. distance@startNode = 0 // The current node currNode = startNode VISITED: while ! visited.contains[currNode] // While current node is not visited { visited.put[currNode] // Mark the current node as visited // Find the shortest already-found distance from startNode to currNode currDist = distance@currNode for outEdge = outEdges@currNode { // The prospective node we're examining nextNode = outEdge.getOutNode[] // The weight from currNode to nextNode weight = outEdge.getWeight[] // Look up shortest distance fron start to nextNode, if it exists nextDist = distance@nextNode // Is the other node unchecked (which means nextDist should be // treated as infinite) or this new distance shorter? if (nextDist == undef or weight < nextDist) and (! (visited.contains[nextNode])) { // Update the shortest-possible distance to nextNode distance@nextNode = weight previousNodes@nextNode = currNode shortestList.insert[ [nextNode, weight] ] } } // Pop the node with the next shortest distance, if there is one. // If there is not one, the algorithm SHOULD terminate at the top // while loop. if ! shortestList.isEmpty[] currNode = (shortestList.pop[])@0 } // Make a new Graph based on previousNodes return makeFromPredecessors[previousNodes, directed] } /** Find a minimal spanning forest using Kruskal's algorithm. It assumes that all edges are bidirectional. The result is a new Graph representing the minimum spanning forest. See Skiena section 6.1.2 and 6.1.3 THINK ABOUT: Some of the nodes from the original graph might be disconnected from this graph (or from each other.) Add them to the result as disconnected nodes? */ minimumSpanningForest[] := { union = new GraphUnionOfSets[this] edges = sort[getAllEdgesWithFlags[], byColumn[2]] // Sort by weight edgeCount = length[edges] result = new Graph for [n1, n2, weight] = edges { g1 = union.get[n1] g2 = union.get[n2] if ! union.inSameComponent[g1, g2] { // println[g1.node + " and " + g2.node + " in MST"] result.addBidiEdge[g1.node, g2.node, getWeight[g1.node, g2.node]] union.unionOfSets[g1, g2] } } // TODO: Look for unconnected nodes and add those to the forest. return result } /** Makes a new Graph from this Graph and a list of predecessors. This is used to represent the result of spanning-tree algorithms. The return value is a new Graph. */ makeFromPredecessors[previousNodes, directed=false] := { ret = new Graph for [node, pred] = previousNodes { weight = getWeight[pred, node] if directed ret.addDirectedEdge[pred, node, weight] else ret.addBidiEdge[pred, node, weight] } return ret } /** Creates a dot file of this graph that can be plotted or manipulated with the Graphviz set of tools. See https://www.graphviz.org/doc/info/lang.html params: [directed, extraData] * directed is a boolean flag which controls if edges are rendered as a directed or undirected (bidirectional.) If directed is true, this will draw directed edges. If directed is false, this will render undirected edges. * extraData is .dot code that will be placed directly into the file. For example, """fontname="Helvetica,Arial,sans-serif" node [fontname="Helvetica,Arial,sans-serif"] edge [fontname="Helvetica,Arial,sans-serif"]""" This can be written to a file with something like: w = new Writer["graph6.dot"] w.print[g.toDotFile[false]] w.close[] and then rendered to the desired format with a command-line like: dot -Tsvg graph6.dot > graph6.svg where "dot" can be another layout program like "neato" or "circo" or "twopi". See https://www.graphviz.org/docs/layouts/ */ toDotFile[directed, extraData=""] := { if directed res = "digraph" else res = "graph" res = res + "\n{\n" if extraData != "" res = res + extraData + "\n" // Render all edges for [node1, node2, weight, bidi] = getAllEdgesWithFlags[] { res = res + " \"$node1\" " + (directed ? "->" : "--") res = res + " \"$node2\"\n" } res = res + "}" return res } /** This performs a breadth-first traversal of a Graph. It takes a start node and an object that implements the GraphWatcher interface (defined below) which is notified when nodes and edges are visited. See Skiena section 5.6 */ breadthFirstTraverse[startNode, watcher is GraphWatcher] := { // Sanity check to ensure that the graph actually *contains* the start // and end node. if ! nodes.contains[startNode] { println["Graph.breadthFirstTraverse: Graph does not contain start node " + startNode] return undef } // A queue of nodes to visit. queue = new array // A dictionary containing the states of the nodes. undef should be // treated as STATE_UNDISCOVERED states = new dict queue.push[startNode] states@startNode = STATE_DISCOVERED /** This is a dictionary of previous nodes for each node. The key is the node and the value is the previous node that has the shortest path to this node. This is eventually used to find the shortest path by enumerating (backwards) from endNode to startNode. */ previousNodes = new dict while ! queue.isEmpty[] { currNode = queue.popFirst[] // do process node early behavior here watcher.nodeVisited[this, currNode] states@currNode = STATE_PROCESSED for outEdge = outEdges@currNode { nextNode = outEdge.getOutNode[] nextState = states.get[nextNode, STATE_UNDISCOVERED] if nextState < STATE_PROCESSED { // Do process edge behavior here watcher.edgeTraversed[this, currNode, nextNode, outEdge] } if nextState < STATE_DISCOVERED { queue.push[nextNode] states@nextNode = STATE_DISCOVERED previousNodes@nextNode = currNode } } // Do process node late behavior here. watcher.nodeVisitedLate[this, currNode] } // TODO: What should this return? return previousNodes } /** This performs a depth-first traversal of a Graph. It takes a start node and an object that implements the GraphWatcher interface (defined below) which is notified when nodes and edges are visited. This is a modification of Skiena section 5.8 to not be recursive and to fix apparent bugs in Skiena (e.g. nodes visited multiple times.) */ depthFirstTraverse[startNode, watcher is GraphWatcher] := { // Sanity check to ensure that the graph actually *contains* the start // and end node. if ! nodes.contains[startNode] { println["Graph.depthFirstTraverse: Graph does not contain start node " + startNode] return undef } // A stack of nodes to visit. stack = new array // A dictionary containing the states of the nodes. undef should be // treated as STATE_UNDISCOVERED states = new dict stack.push[startNode] /** This is a dictionary of previous nodes for each node. The key is the node and the value is the previous node that has the shortest path to this node. This is eventually used to find the shortest path by enumerating (backwards) from endNode to startNode. */ previousNodes = new dict while ! stack.isEmpty[] { currNode = stack.pop[] if states.get[currNode, STATE_UNDISCOVERED] < STATE_DISCOVERED { states@currNode = STATE_DISCOVERED // do process node early behavior here watcher.nodeVisited[this, currNode] for outEdge = outEdges@currNode { nextNode = outEdge.getOutNode[] nextState = states.get[nextNode, STATE_UNDISCOVERED] if nextState < STATE_DISCOVERED { previousNodes@nextNode = currNode // Do process edge behavior here watcher.edgeTraversed[this, currNode, nextNode, outEdge] stack.push[nextNode] } else if nextState < STATE_PROCESSED watcher.edgeTraversed[this, currNode, nextNode, outEdge] } states@currNode = STATE_PROCESSED // Do process node late behavior here. watcher.nodeVisitedLate[this, currNode] } } // TODO: What should this return? return previousNodes } /** Performs a topological sort using Kahn's algorithm. A topological sort returns an array of nodes such that for every directed edge u->v, node u comes before v in the ordering. This returns only one possible topological sort (there may be many.) This will return undef if the graph is not a directed acyclic graph. You can also think of this as a temporal ordering. If node u must be performed before node v, this makes sure that u occurs before v in the ordering. See: https://www.geeksforgeeks.org/topological-sorting-indegree-based-solution/ https://en.wikipedia.org/wiki/Topological_sorting */ topologicalSort[] := { indeg = indegrees[] // Dict of queue = new array // Make a queue and enqueue all nodes with indegree 0 for [node, deg] = indeg if deg == 0 queue.push[node] visited = 0 result = new array while ! queue.isEmpty[] { node = queue.pop[] result.push[node] // Iterate through all of the node's neighboring nodes and decrease // their indegree by 1 for outEdge = getOutEdges[node] { outnode = outEdge.getOutNode[] // Decrement the count. If indegree becomes 0, add it to queue indeg.increment[outnode, -1] if indeg@outnode == 0 queue.push[outnode] } visited = visited + 1 } if (visited != getNodeCount[]) { println["Graph.topologicalSort: Cycle detected in graph. visited was $visited, nodeCount was " + getNodeCount[] + " result would be " + result]; return undef; } return result } /** This returns an object (TopologicalSorter) that can be used to iterate through all possible topological sorts of this Graph. It can be used as the following: results = g.allTopologicalSorts[] while r = results.nextResult[] println[r] where r is an array of nodes in topological sort order. See: https://www.geeksforgeeks.org/all-topological-sorts-of-a-directed-acyclic-graph/ */ allTopologicalSorts[] := { return new TopologicalSorter[this] } /** Calculate the indegrees (that is, the number of edges pointing into each node.) This returns a dictionary of . */ indegrees[] := { indeg = new dict for node = nodes indeg@node = 0 for [outNode, edges] = outEdges for edge = edges indeg.increment[edge.getOutNode[]] return indeg } } /** This represents a directed edge in a graph and any associated data, including its weight. More data may eventually be added, such as an edge identifier, color, or whatever. */ class GraphEdge { /** A node that this is connected to. */ var outNode /** The weight of the node. This should be numeric. */ var weight /** A boolean flag indicating if this edge is bidirectional. */ var isBidi /** Create a new GraphEdge to node with the specified weight. */ new[node, weight, isBidirectional] := { outNode = node this.weight = weight isBidi = isBidirectional } /** Returns the out node of this edge. */ getOutNode[] := { return outNode } /** Returns the weight of this edge. */ getWeight[] := { return weight } /** Returns true if the edge is bidirectional, false if it is directed. */ isBidi[] := { return isBidi } /** Returns true if the edge is directed, false if it is bidirectional. */ isDirected[] := { return ! isBidi } } /** This represents a path through a graph. It contains a list of the nodes that make up the path and the total cost to follow the path. */ class GraphPath { /** An array of nodes between the nodes. */ var path = new array /** The distance between the start and end node. */ var distance /** Appends a node to the path. */ appendNode[node] := { path.push[node] } /** Prepends a node to the path. */ prependNode[node] := { path.pushFirst[node] } /** Gets the path as an array of nodes. */ getNodes[] := { return path } /** Gets the path as an array of nodes. */ getPath[] := { return path } /** Sets the total distance of the path */ setDistance[dist] := { distance = dist } /** Gets the total distance of the path. */ getDistance[] := { return distance } /** Makes a dengenerate path where the start and end node are the same and distance is zero. */ class makeDegeneratePath[node] := { gp = new GraphPath gp.setDistance[0] gp.appendNode[node] return gp } /** Makes a GraphPath from a predecessor dictionary going from startNode to endNode with the specified distance. */ class makeFromPredecessors[previousNodes, startNode, endNode, dist] := { gp = new GraphPath gp.doMakeFromPredecessors[previousNodes, startNode, endNode] gp.setDistance[dist] return gp } /** Makes a GraphPath from a predecessor dictionary going from startNode to endNode with no specified total distance. The distance will be created from the length of the node list. */ class makeFromPredecessors[previousNodes, startNode, endNode] := { gp = new GraphPath gp.doMakeFromPredecessors[previousNodes, startNode, endNode] gp.setDistance[length[gp.path] - 1] return gp } /** Internal function which performs the creation of the path from a predecessor list. Does not set the distance. THINK ABOUT: Instead of prepending nodes to an array, build an array or a stack and reverse it which will be slightly faster. */ doMakeFromPredecessors[previousNodes, startNode, endNode] := { currNode = endNode do { prependNode[currNode] currNode = previousNodes@currNode } while currNode != startNode prependNode[startNode] } } /** This class contains a set of sets of predecessor nodes. This is used in Kruskal's algorithm. See Skiena 6.1.3. Also, for other algorithms, see: https://en.wikipedia.org/wiki/Disjoint-set_data_structure This is implemented quite differently, though, as a dict. */ class GraphUnionOfSets { /** An dict of GraphSingleSets */ var dictOfSets /** Construct a union of sets from the specified graph. */ new[g is Graph] := { /** Initialize what is called the "union-set" in a lot of documentation. this is actually a dictionary of GraphSingleSet objects. */ nodes = g.getNodes[] dictOfSets = new dict for node = nodes dictOfSets@node = new GraphSingleSet[node] } /** Returns the GraphSingleSet corresponding to the specified node. */ get[node] := { return dictOfSets@node } /** Find the root of a predecessor list. It returns a GraphSingleSet. This is used in Kruskal's algorithm. In the typical documentation of Kruskal's algorithm, this is just called "find". There are other algorithms that modify the previousNodes list in-place to optimize subsequent lookups but this is not one of them. For those, see https://en.wikipedia.org/wiki/Disjoint-set_data_structure and Skiena section 6.1.3 */ findRoot[fromNode is GraphSingleSet] := { // print["Evaluating " + fromNode.node + "..."] root = fromNode parent = root.parent while parent != root { root = parent parent = root.parent } // println["root is " + root.node] return root } /** Makes a "union of sets" of two separate trees. This is often called "union-sets" in the Graph documentation. */ unionOfSets[node1 is GraphSingleSet, node2 is GraphSingleSet] := { r1 = findRoot[node1] r2 = findRoot[node2] if r1 == r2 // Already in the same connected graph. return /* Now attach the shorter set as a child of the larger set. */ if r1.size >= r2.size { r1.size = r1.size + r2.size r2.parent = r1 } else { r2.size = r1.size + r2.size r1.parent = r2 } } /** Returns true if the two nodes are in the same connected graph. This is often called "same-component" in the literature. */ inSameComponent[node1 is GraphSingleSet, node2 is GraphSingleSet] := { return findRoot[node1] == findRoot[node2] } } /** This is a helper data structure used by GraphUnionOfSets and Kruskal's algorithm. It represents aspects of a connected Graph that might be disconnected from other Graphs. See: https://en.wikipedia.org/wiki/Disjoint-set_data_structure */ class GraphSingleSet { /** The node represented by this set. */ var node /** The parent GraphSingleSet of this node. */ var parent /** The size of this whole tree. */ var size /** Construct a new GraphSingleSet for the specified node. */ new[node] := { this.node = node parent = this size = 1 } } /** This is a public interface for a set of functions that must be implemented by a class that wants to observe the traversal of a graph (say, breadth- first, depth-first, or, for a tree, pre-, post-, or in-order search.) It contains methods that will be called when a node about to be visited, when a node has just been visited, (you probably don't need to implement a handler for both of these) and when an edge is traversed. */ interface GraphWatcher { /** This method is called during the traverse of a graph as a node is about to be visited. You probably want to implement this for most graph traversals. This is called process_vertex_early in Skiena. */ nodeVisited[g is Graph, node] /** This method is called during the traverse of a graph after a node has just been visited. The structure of the Graph object may have been changed as a result. You probably do not need to implement this method to do anything. This is called process_vertex_late in Skiena. */ nodeVisitedLate[g is Graph, node] /** This method is called as an edge is about to be traversed. You may not need to implement this to do anything unless the edge has special properties (e.g. weights that you need to sum.) */ edgeTraversed[g is Graph, fromNode, toNode, edge is GraphEdge] } /** This is a sample class that logs the nodes processed by a traversal algorithm. It implements the GraphWatcher interface. */ class GraphNodeLogger implements GraphWatcher { /** This method is called during the traverse of a graph as a node is about to be visited. You probably want to implement this for most graph traversals. This is called process_vertex_early in Skiena. */ nodeVisited[g is Graph, node] := { println["Visited $node"] } /** This method is called during the traverse of a graph after a node has just been visited. The structure of the Graph object may have been changed as a result. You probably do not need to implement this method to do anything. This is called process_vertex_late in Skiena. */ nodeVisitedLate[g is Graph, node] := { } /** This method is called as an edge is about to be traversed. You may not need to implement this to do anything unless the edge has special properties (e.g. weights that you need to sum.) */ edgeTraversed[g is Graph, fromNode, toNode, edge is GraphEdge] := { } } /** This is a sample class that logs the edges traversed by a traversal algorithm. It implements the GraphWatcher interface. */ class GraphEdgeLogger implements GraphWatcher { /** This method is called during the traverse of a graph as a node is about to be visited. You probably want to implement this for most graph traversals. This is called process_vertex_early in Skiena. */ nodeVisited[g is Graph, node] := { } /** This method is called during the traverse of a graph after a node has just been visited. The structure of the Graph object may have been changed as a result. You probably do not need to implement this method to do anything. This is called process_vertex_late in Skiena. */ nodeVisitedLate[g is Graph, node] := { } /** This method is called as an edge is about to be traversed. You may not need to implement this to do anything unless the edge has special properties (e.g. weights that you need to sum.) */ edgeTraversed[g is Graph, fromNode, toNode, edge is GraphEdge] := { println["$fromNode -> $toNode"] } } /** This class keeps track of all distances between every pair of nodes in a graph and has accessor methods. It is generated by Floyd's algorithm in Graph.allShortestDistances[]. */ class GraphAllDistances { /** A dict from index number to node. */ var indexToNode /** A dict from node to index number. */ var nodeToIndex /** A adjacency matrix as a Java int[][]. */ var adj /** The int value that indicates that nodes are not connected. */ var disconnectedValue = -1 /** A 2-d int adjacency matrix. */ new[indexToNode, nodeToIndex, adj] := { this.indexToNode = indexToNode this.nodeToIndex = nodeToIndex this.adj = adj } /** Get the distance between node1 and node2. Returns undef if they are not connected. */ getDistance[node1, node2] := { if node1 == node2 return 0 i1 = nodeToIndex@node1 i2 = nodeToIndex@node2 if i1 != undef and i2 != undef return adj@i1@i2 else { println["GraphAllDistances.getDistance: Nodes not in graph! [$node1, $node2]"] return undef } } /** This returns a dictionary where the key is an array [node1,node2] and the value is the weight. */ toDict[] := { size = length[adj] dist = new dict for row = 0 to size-1 { rnode = indexToNode@row dist@([rnode,rnode]) = 0 // Add trivial 0 distance to self. COL: for col = 0 to size-1 { if col == row next COL; cnode = indexToNode@col val = adj@row@col if val != disconnectedValue dist@([rnode,cnode]) = val } } return dist } } /** This class is used to produce all topological sorts of a given graph. You can use it by calling the following methods on a Graph g. results = g.allTopologicalSorts[] while r = results.nextResult[] println[r] */ class TopologicalSorter { /** A set of results that have been solved. This is an array of nodes. */ var results = new array /** Remaining states as a stack of TopologicalSortState. */ var states = new array /** The graph. */ var graph /** Construct a new TopologicalSorter that wraps the specified Graph. */ new[g is Graph] := { graph = g indegrees = graph.indegrees[] addState[new TopologicalSortState[new set, new array, indegrees, this]] } /** Returns the next result, or undef if no more exist. */ nextResult[] := { if ! results.isEmpty[] return results.pop[] while ! states.isEmpty[] { doSort[states.pop[]] if ! results.isEmpty[] return results.pop[] } return undef } /** Internal function to do one iteration of topological sort. */ doSort[state is TopologicalSortState] := { done = false for node = graph.getNodes[] { if ((! (state.visited.contains[node])) and (state.indeg)@node == 0) { i1 = deepCopy[state.indeg] for outEdge = graph.getOutEdges[node] i1.increment[outEdge.getOutNode[], -1] newVis = state.visited.shallowCopy[] newVis.put[node] newRes = state.result.shallowCopy[] newRes.push[node] // println["state is " + state] addState[new TopologicalSortState[newVis, newRes, i1, this]] done = true } } if !done addResult[state.result] } /** Adds a new result, which is an array of nodes. */ addResult[result] := { results.push[result] } /** Adds a new state. */ addState[state is TopologicalSortState] := { states.push[state] } } /** This class encapsulates the state of a TopologicalSorter. */ class TopologicalSortState { var visited var result var indeg var sorter new[v, r, i, s is TopologicalSorter] := { visited = v result = r indeg = i sorter = s } }