Graph
Graph Representation
Graph Problems and Algorithms
In addition, we also have planar graphs, dynamic graphs, etc.
Traversal-based Algorithms Review
Breadth-first Search (BFS)
Depth-first Search (DFS)
Topological Sort
Graph Connectivity
for each vertex, if not visited, run BFS/DFS on it
Algorithms based on DFS
Bi-connectivity, articulation points, bridges (CLRS pp. 621-622)
Topological sort (CLRS pp. 612-614)
Strongly connected components (CLRS pp. 615-618)
Minimum Spanning Tree (MST)
undirected weighted graph
find n-1 edges that connected all vertices with minimum total weights
Light-edge property
The lightest edge in a cut must be in the MST
Prim's Algorithm
start from an arbitrary vertex, add the lightest connected edge to the tree, and repeat
this is a greedy algorithm
implementation can be array-based, binary heap, or Fibonacci heap (CLRS Section 19)
Kruskal’s Algorithm
scan edges by weight from lowest to highest, add the edge that doesn't introduce cycle, and repeat
this is a greedy algorithm
data structure: weighted edge list O(m log m) = O(m log n)
Single-Source Shortest Paths (SSSP)
undirected graph, or directed acyclic graph (DAG)
find a shortest path from a given source vertex u to each vertex v
Bellman-Ford Algorithm
dynamic programming algorithm O(nm)
can work for negative weight
Dijkstra's Algorithm
start from an arbitrary node, expand the neighbor with the lowest cost, and repeat
greedy algorithm (similar to Prim's)
for positive weight only
implementation: array-based O(n^2), binary heap O((m+n) log n), or Fibonacci heap (m + n log n)
All-Pairs Shortest Paths (APSP)
find a shortest path from u to v for every pair of vertices u and v
although we can solve this by running a single-source algorithm once from each vertex, we usually can solve it faster (e.g., by Floyd-Warshall algorithm)
Strongly Connected Components (SCC)
directed graph
strongly connected: if every vertex is reachable from every other vertex
find a partition such that each subgraph is strongly connected
Tarjan’s algorithm
is a DFS-based linear-time algorithm O(n + m)
kind of simple to implement, but counter-intuitive
hard to parallel
DFS is P-Complete (DFS is inherently sequential [Reif 1985])
Reachability-based algorithm
find SCC by reachability queries O(m log n)
what is the set of vertices that can be reached by a given vertex v
which vertices can reach v
the intersection of two sets form an SCC
can be parallelized (BGSS 2016, Yan Gu, code)
Bipartite Graph Matching
bipartite graph: vertex set V can be partitioned into two subsets V1 and V2 such that each edge has one endpoint in V1 and the other endpoint in V2
matching: a subset of the edges for which every vertex belongs to at most one of the edges
maximum matching: a matching with the maximum number of edges
perfect matching: a matching in which every vertex is matched
augmenting path
a structure such that reversing path direction can get one more matched edge
unmatched vertex --- matched vertex --- ... --- matched vertex --- unmatched vertex
how to find:
reverse the matched edges
start from an unmatched vertex, and try to find a path to another unmatched vertex
Hungarian algorithm
find the maximum matching for bipartite graphs in O(nm) time
find an augmenting path, reverse direction, get one more matched edge, and repeat
Maximum Flow (Flow Network)
directed graph, with non-negative weight/capacity
find the max-flow from the source s to the sink t
Ford-Fulkerson method
greedy algorithm
the notion of augmenting path
here we overload the notion of augmenting path on a general directed graph (instead of bipartite)
it is called augmenting path because once found, we can get additional flow from source to sink
method: keep finding augmenting paths in the residual graph
start with finding an arbitrary path from s to t
reverse it, resulting in a residual graph
find a new path from s to t on the residual graph, and reverse it to generate a new residual graph
repeat until no paths can be found, then the sum of capacities of all found paths is the max-flow
It is called "method" but not "algorithm" because how to find augmenting path is missing
depending on the actual method to find augmenting path, the algorithm efficiency can differ
Ways to find augmenting paths in a general directed graph
Dinic's Algorithm
new concepts: level graph
construct level graph: BFS from source to every node and compute hops
then find augmenting paths on level graphs
proposed by Dinic [1970]; fastest in practice O(n^2 m)
on bipartite graph, this algorithm is referred to as Hopcroft-Karp algorithm [1973]
Related Discussions on Flow Network
Min-cost Flow Problem
in a flow network, each edge also has a cost
the goal is to maximizing the flow while minimizing the cost
Relationship to Linear Programming
most combinatorial algorithms can be represented in a LP form (and solved by LP solver efficiently)
we can convert a network flow problem into a LP problem
by formulating edges into linear constraints
we can also convert a LP problem into a network flow problem
by adding a virtual node, for example
the flow property can promise the equality constraints
The dual of max matching: min vertex cover
vertex cover: a set of vertices that includes at least one endpoint of every edge
in any bipartite graph, maximum matching = minimum vertex cover [Konig and Egercary 1931]
The dual of max flow: min cut
cut: a subset of edges that disconnects the source and the sink
minimum cut: the cut with minimum sum of edge capacities
in a flow network, maximum flow = minimum cut [Menger 1927]
Summary
MST algorithms: Prim’s, Kruskal’s
SSSP algorithms: Dijkstra’s, Bellman-Ford
SCC algorithms: Tarjan’s algorithm, BGSS algorithm
Bipartite graph matching: Hungarian algorithm
Max flow: Ford-Fulkerson method
Example Code: Graph and BFS
Last updated