Comment on page

Graph

Graph Representation

Graph Problems and Algorithms

In addition, we also have planar graphs, dynamic graphs, etc.

Traversal-based Algorithms Review

  • Breadth-first Search (BFS)
  • Depth-first Search (DFS)
  • Topological Sort
  • Graph Connectivity
    • for each vertex, if not visited, run BFS/DFS on it
  • Algorithms based on DFS
    • Bi-connectivity, articulation points, bridges (CLRS pp. 621-622)
    • Topological sort (CLRS pp. 612-614)
    • Strongly connected components (CLRS pp. 615-618)

Minimum Spanning Tree (MST)

  • undirected weighted graph
  • find n-1 edges that connected all vertices with minimum total weights

Light-edge property

  • The lightest edge in a cut must be in the MST

Prim's Algorithm

  • start from an arbitrary vertex, add the lightest connected edge to the tree, and repeat
  • this is a greedy algorithm
  • implementation can be array-based, binary heap, or Fibonacci heap (CLRS Section 19)

Kruskal’s Algorithm

  • scan edges by weight from lowest to highest, add the edge that doesn't introduce cycle, and repeat
  • this is a greedy algorithm
  • data structure: weighted edge list O(m log m) = O(m log n)
// linked forest + path compression
find(x) { return label[x] = x==label[x] ? x : find(label[x]); }
union(x, y) { if (find(x) != find(y)) label[label[y]] = label[x]; }
test(x, y) { return (find(x) == find(y)); }

Single-Source Shortest Paths (SSSP)

  • undirected graph, or directed acyclic graph (DAG)
  • find a shortest path from a given source vertex u to each vertex v

Bellman-Ford Algorithm

  • dynamic programming algorithm O(nm)
  • can work for negative weight
  • Di,k=min{Di,k1min(j,i)E{Dj,k1+w(j,i)}D_{i,k} = \min \begin{cases} D_{i, k-1} \\ \min_{(j, i)\in E} \{D_{j, k-1} + w(j, i)\}\end{cases}
// Bellman-Ford Algorithm
for i=1 to n do
D[i]=MAX
D[s]=0
for k=1 to n-1 do
for each (i,j) in E do
if (D[i]+w(i,j)<D[j])
D[j]=D[i]+w(i,j), from[j]=i
// optional optimization: if no distance is updated, break

Dijkstra's Algorithm

  • start from an arbitrary node, expand the neighbor with the lowest cost, and repeat
  • greedy algorithm (similar to Prim's)
  • for positive weight only
  • implementation: array-based O(n^2), binary heap O((m+n) log n), or Fibonacci heap (m + n log n)

All-Pairs Shortest Paths (APSP)

  • find a shortest path from u to v for every pair of vertices u and v
  • although we can solve this by running a single-source algorithm once from each vertex, we usually can solve it faster (e.g., by Floyd-Warshall algorithm)
Relax operation: update (u, v) path by weight w if better (CLRS pp. 648)

Strongly Connected Components (SCC)

  • directed graph
  • strongly connected: if every vertex is reachable from every other vertex
  • find a partition such that each subgraph is strongly connected

Tarjan’s algorithm

  • is a DFS-based linear-time algorithm O(n + m)
  • kind of simple to implement, but counter-intuitive
  • hard to parallel
    • DFS is P-Complete (DFS is inherently sequential [Reif 1985])

Reachability-based algorithm

  • find SCC by reachability queries O(m log n)
    • what is the set of vertices that can be reached by a given vertex v
    • which vertices can reach v
    • the intersection of two sets form an SCC
  • can be parallelized (BGSS 2016, Yan Gu, code)

Bipartite Graph Matching

  • bipartite graph: vertex set V can be partitioned into two subsets V1 and V2 such that each edge has one endpoint in V1 and the other endpoint in V2
  • matching: a subset of the edges for which every vertex belongs to at most one of the edges
  • maximum matching: a matching with the maximum number of edges
  • perfect matching: a matching in which every vertex is matched

augmenting path

  • a structure such that reversing path direction can get one more matched edge
    • unmatched vertex --- matched vertex --- ... --- matched vertex --- unmatched vertex
  • how to find:
    • reverse the matched edges
    • start from an unmatched vertex, and try to find a path to another unmatched vertex

Hungarian algorithm

  • find the maximum matching for bipartite graphs in O(nm) time
  • find an augmenting path, reverse direction, get one more matched edge, and repeat
// a DFS-based implementation
// g[i][j] is the graph, p[j]=i stores the matched edge,
// and b[j] is a temp boolean array
bool find(i)
for j=1 to m do
if g[i][j] == 1 and b[j] == false
b[j] = true
if p[j] == -1 or find(p[j])
p[j]=i, return true
return false
Hungarian()
p[1..n] = -1
for i=1 to n do
b[1..n] = false, find(i)

Maximum Flow (Flow Network)

  • directed graph, with non-negative weight/capacity
  • find the max-flow from the source s to the sink t

Ford-Fulkerson method

  • greedy algorithm
  • the notion of augmenting path
    • here we overload the notion of augmenting path on a general directed graph (instead of bipartite)
    • it is called augmenting path because once found, we can get additional flow from source to sink
  • method: keep finding augmenting paths in the residual graph
    • start with finding an arbitrary path from s to t
    • reverse it, resulting in a residual graph
    • find a new path from s to t on the residual graph, and reverse it to generate a new residual graph
    • repeat until no paths can be found, then the sum of capacities of all found paths is the max-flow
  • It is called "method" but not "algorithm" because how to find augmenting path is missing
    • depending on the actual method to find augmenting path, the algorithm efficiency can differ

Ways to find augmenting paths in a general directed graph

Dinic's Algorithm

  • new concepts: level graph
    • construct level graph: BFS from source to every node and compute hops
    • then find augmenting paths on level graphs
  • proposed by Dinic [1970]; fastest in practice O(n^2 m)
  • on bipartite graph, this algorithm is referred to as Hopcroft-Karp algorithm [1973]

Min-cost Flow Problem

  • in a flow network, each edge also has a cost
  • the goal is to maximizing the flow while minimizing the cost

Relationship to Linear Programming

  • most combinatorial algorithms can be represented in a LP form (and solved by LP solver efficiently)
  • we can convert a network flow problem into a LP problem
    • by formulating edges into linear constraints
  • we can also convert a LP problem into a network flow problem
    • by adding a virtual node, for example
    • the flow property can promise the equality constraints

The dual of max matching: min vertex cover

  • vertex cover: a set of vertices that includes at least one endpoint of every edge
  • in any bipartite graph, maximum matching = minimum vertex cover [Konig and Egercary 1931]

The dual of max flow: min cut

  • cut: a subset of edges that disconnects the source and the sink
  • minimum cut: the cut with minimum sum of edge capacities
  • in a flow network, maximum flow = minimum cut [Menger 1927]

Summary

  • MST algorithms: Prim’s, Kruskal’s
  • SSSP algorithms: Dijkstra’s, Bellman-Ford
  • SCC algorithms: Tarjan’s algorithm, BGSS algorithm
  • Bipartite graph matching: Hungarian algorithm
  • Max flow: Ford-Fulkerson method

Example Code: Graph and BFS

#include <iostream>
#include <vector>
#include <array>
#include <list>
#include <queue>
#include <stack>
#include <set>
#include <map>
#include <unordered_set>
#include <unordered_map>
#include <cmath>
#include <functional>
class MyGraph {
private:
// indices of vertices may not be consecutive
std::set<int> vertices; // std::set, std::map -> O(log(n)) using red-black tree
std::map<int, std::list<int>> edgeLists; // std::unordered_set, unordered_map -> O(1) using hashing
// switching to unordered_set/map can speed up 600ms --> 400ms
public:
MyGraph(){};
~MyGraph(){};
void addEdge(int src, int dest);
void printGraph();
void BFS(int startVertex, int& endVertex, int& depth);
};
void MyGraph::addEdge(int src, int dest) {
vertices.insert(src);
vertices.insert(dest);
edgeLists[src].push_back(dest);
edgeLists[dest].push_back(src); // undirected graph
}
void MyGraph::printGraph(){
std::cout << "vertices:" << std::endl;
for (const auto& v : vertices)
std::cout << v << " ";
std::cout << std::endl;
std::cout << "edges in adjacency lists:" << std::endl;
for (const auto& v : vertices){
std::cout << v << ": ";
for (const auto& e : edgeLists[v])
std::cout << e << " ";
std::cout << std::endl;
}
std::cout << std::endl;
}
void MyGraph::BFS(int startVertex, int& endVertex, int& depth) {
std::unordered_map<int, bool> visited (vertices.size()); // default to false
std::unordered_map<int, int> distance (vertices.size());
std::vector<int> visited_list;
std::queue<int> queue;
visited[startVertex] = true;
distance[startVertex] = 0;
queue.push(startVertex);
while (!queue.empty()) {
int currVertex = queue.front();
queue.pop();
visited_list.push_back(currVertex);
for (const auto& adjVertex : edgeLists[currVertex])
if (!visited[adjVertex]) {
visited[adjVertex] = true;
distance[adjVertex] = distance[currVertex] + 1;
queue.push(adjVertex);
}
}
endVertex = visited_list.back();
depth = distance[endVertex];
}
int main(){
int n;
std::cin >> n;
MyGraph graph;
int i, j;
for (int k = 0; k < n-1; ++k){
std::cin >> i >> j;
graph.addEdge(i, j);
}
// run BFS twice to find the diameter of the graph
// can be proved by triangle inequality
int startVertex, endVertex, depth;
startVertex = i;
graph.BFS(startVertex, endVertex, depth);
startVertex = endVertex;
graph.BFS(startVertex, endVertex, depth);
std::cout << (depth+1)/2 << std::endl; // ceiling
return 0;
}