About Graph
How to represent Graph in Python
Few programming languages provide direct support for graphs as a data type, and Python is no exception. However, graphs are easily built out of lists and dictionaries. For instance, here's a simple graph (I can't use drawings in these columns, so I write down the graph's arcs):
graph = {'A': ['B', 'C'],
'B': ['C', 'D'],
'C': ['D'],
'D': ['C'],
'E': ['F'],
'F': ['C']}
This is a dictionary whose keys are the nodes of the graph. For each key, the corresponding value is a list containing the nodes that are connected by a direct arc from this node. This is about as simple as it gets (even simpler, the nodes could be represented by numbers instead of names, but names are more convenient and can easily be made to carry more information, such as city names).
Let's write a simple function to determine a path between two nodes. It takes a graph and the start and end nodes as arguments. It will return a list of nodes (including the start and end nodes) comprising the path. When no path can be found, it returns None. The same node will not occur more than once on the path returned (i.e. it won't contain cycles). The algorithm uses an important technique called backtracking: it tries each possibility in turn until it finds a solution.
def find_path(graph, start, end, path=[]):
path = path + [start]
if start == end:
return path
if not graph.has_key(start):
return None
for node in graph[start]:
if node not in path:
newpath = find_path(graph, node, end, path)
if newpath: return newpath
return None
A simple Python Graph Implementation:
class Graph:
def __init__(self):
self.nodes = set()
self.edges = defaultdict(list)
self.distances = {}
def add_node(self, value):
self.nodes.add(value)
def add_edge(self, from_node, to_node, distance):
self.edges[from_node].append(to_node)
self.edges[to_node].append(from_node)
self.distances[(from_node, to_node)] = distance
General Graph Traversal Algorithm
The most important difference between recursive and iterative style traversal is that, for recursive traversal, while test on node (or vertex); while for iterative traversal, while test on stack/queue.
-
For graph, there are:
- iterative DFS/BFS, supported by stack/queue
- recursive DFS
-
For binary tree, there are:
- recursive DFS (including in-order, pre-order, post-order)
- iterative BFS, no need support from queue
- iterative DFS, need stack ???
- this iterative DFS's order is not necessary any of the in-order, pre-order and post-order
Depth-First Search and Breadth-First Search in Python
Example graph representation in Python (which is used in the following example):
graph = {'A': set(['B', 'C']),
'B': set(['A', 'D', 'E']),
'C': set(['A', 'F']),
'D': set(['B']),
'E': set(['B', 'F']),
'F': set(['C', 'E'])}
Iterative traversal:
Difference between tree and graph iterative traversal:
- Tree don't have cycle, so in BFS, we don't need to keep track which node is visited (but stack/queue is still needed) -- can we do tree iterative without using any additional data structure ???
- Tree have node with empty child, we need to check None condition when we add child into stack/queue
General graph BFS
6 Steps:
def bfs(graph, start):
visited, queue = set(), [start] # s1. If we want traversal order, can change this visited from set to list
# (so visited has two functions: keep track of visited vertices, and keep track of visiting order)
while queue: #s2
vertex = queue.pop(0) # s3, pop first one, i.e.queue (only difference btw BFS and DFS)
# at each while iteration, we only process one vertex (node) (this vertex)
if vertex not in visited: #s4
print vertex # just do some operation
visited.add(vertex) #s5, processed.
queue.extend(graph[vertex] - visited) # s6, set operation
# add all connected vertices into queue for future process
# if tree, we need to check if child is None
# this step is different for different Graph representation
return visited (maybe use a loop to add vertices)
bfs(graph, 'A') # {'B', 'C', 'A', 'F', 'D', 'E'}
General tree BFS
# not 100% sure if this is right
# the difference btw tree and graph is that tree don't have circle
# therefore we don't need to have visited set to keep track of visited nodes
def bfs(root):
if not root:
return
queue = [root] # s1. we can also have visited=[] to keep track of visiting order, but this is not necessary.
while queue: #s2
node = queue.pop(0) #s3
print node
if node.left: queue.append(node.left) #s4
if node.right: queue.append(node.right)
General graph DFS
# with stack:
def dfs(graph, start):
visited, stack = set(), [start]
while stack:
vertex = stack.pop() # pop last one, i.e. stack
if vertex not in visited:
visited.add(vertex)
stack.extend(graph[vertex] - visited)
return visited
dfs(graph, 'A') # {'E', 'D', 'F', 'A', 'C', 'B'}
General tree DFS:
def dfs(root):
if not root:
return
stack = [root]
while stack:
node = stack.pop()
print node.val
if node.right: stack.append(node.right)
# print right first, we get the same order as in-order. Otherwise we don't get any order. But anyways, still DFS
if node.left: stack.append(node.left)
Recursive traversal:
- the difference between tree and graph in terms of recursive traversal is,
General Graph DFS with recursion:
# using recursion:
# remember this way of defining function !!!!
def dfs(graph, start, visited=None):
# !!! trick: define visited=None, so we don't need to define an extra helper_dfs function !
if visited is None:
visited = set()
visited.add(start)
for next in graph[start] - visited: # process one vertex
# Only process these non-visited vertices.
# with other data structure, need to check if this vertex's neighborhoods are visited or not.
dfs(graph, next, visited)
return visited
dfs(graph, 'C') # {'E', 'D', 'F', 'A', 'C', 'B'}
General Tree DFS with recursion:
This is just in-order, pre-order, post-order tree traversal
pre-order:
def preorder(tree):
if tree:
print(tree.getRootVal())
preorder(tree.getLeftChild())
preorder(tree.getRightChild())
post-order
def postorder(tree):
if tree != None:
postorder(tree.getLeftChild())
postorder(tree.getRightChild())
print(tree.getRootVal())
in-order:
If we use DFS(stack), and first push right, then push left, we also get in-order. But we cannot get pre-order and post-order.
def inorder(tree):
if tree != None:
inorder(tree.getLeftChild())
print(tree.getRootVal())
inorder(tree.getRightChild())
We are able to tweak both of the previous implementations to return all possible paths between a start and goal vertex.
def dfs_paths(graph, start, goal):
stack = [(start, [start])]
while stack:
(vertex, path) = stack.pop()
for next in graph[vertex] - set(path):
if next == goal:
yield path + [next]
else:
stack.append((next, path + [next]))
list(dfs_paths(graph, 'A', 'F')) # [['A', 'C', 'F'], ['A', 'B', 'E', 'F']]
Tree specific BFS
I don't think the following code is necessary, no idea why need to track level
Level Order Tree Traversal
Without using helper data structure
// sudo code:
printLevelorder(tree)
for d = 1 to height(tree)
printGivenLevel(tree, d);
printGivenLevel(tree, level)
if tree is NULL then return;
if level is 1,
then print(tree->data);
else if level greater than 1,
then printGivenLevel(tree->left, level-1);
printGivenLevel(tree->right, level-1);
# python code
# Function to print level order traversal of tree
def printLevelOrder(root):
h = height(root)
for i in range(1, h+1):
printGivenLevel(root, i)
# Print nodes at a given level
def printGivenLevel(root , level):
if root is None:
return
if level == 1:
print "%d" %(root.data),
elif level > 1 :
printGivenLevel(root.left , level-1)
printGivenLevel(root.right , level-1)
def height(node):
if node is None:
return 0
else :
# Compute the height of each subtree
lheight = height(node.left)
rheight = height(node.right)
#Use the larger one
if lheight > rheight :
return lheight+1
else:
return rheight+1