# Graph

## Graph Representation

## Graph Problems and Algorithms

In addition, we also have planar graphs, dynamic graphs, etc.

### Traversal-based Algorithms Review

Breadth-first Search (BFS)

Depth-first Search (DFS)

Topological Sort

Graph Connectivity

for each vertex, if not visited, run BFS/DFS on it

Algorithms based on DFS

Bi-connectivity, articulation points, bridges (CLRS pp. 621-622)

Topological sort (CLRS pp. 612-614)

Strongly connected components (CLRS pp. 615-618)

## Minimum Spanning Tree (MST)

undirected weighted graph

find n-1 edges that connected all vertices with minimum total weights

#### Light-edge property

The lightest edge in a cut must be in the MST

#### Prim's Algorithm

start from an arbitrary vertex, add the lightest connected edge to the tree, and repeat

this is a greedy algorithm

implementation can be array-based, binary heap, or Fibonacci heap (CLRS Section 19)

#### Kruskal’s Algorithm

scan edges by weight from lowest to highest, add the edge that doesn't introduce cycle, and repeat

this is a greedy algorithm

data structure: weighted edge list O(m log m) = O(m log n)

## Single-Source Shortest Paths (SSSP)

undirected graph, or directed acyclic graph (DAG)

find a shortest path from a given source vertex u to each vertex v

#### Bellman-Ford Algorithm

dynamic programming algorithm O(nm)

can work for negative weight

#### Dijkstra's Algorithm

start from an arbitrary node, expand the neighbor with the lowest cost, and repeat

greedy algorithm (similar to Prim's)

for positive weight only

implementation: array-based O(n^2), binary heap O((m+n) log n), or Fibonacci heap (m + n log n)

#### All-Pairs Shortest Paths (APSP)

find a shortest path from u to v for every pair of vertices u and v

although we can solve this by running a single-source algorithm once from each vertex, we usually can solve it faster (e.g., by Floyd-Warshall algorithm)

## Strongly Connected Components (SCC)

directed graph

strongly connected: if every vertex is reachable from every other vertex

find a partition such that each subgraph is strongly connected

#### Tarjan’s algorithm

is a DFS-based linear-time algorithm O(n + m)

kind of simple to implement, but counter-intuitive

hard to parallel

DFS is P-Complete (DFS is inherently sequential [Reif 1985])

#### Reachability-based algorithm

find SCC by reachability queries O(m log n)

what is the set of vertices that can be reached by a given vertex v

which vertices can reach v

the intersection of two sets form an SCC

can be parallelized (BGSS 2016, Yan Gu, code)

## Bipartite Graph Matching

**bipartite graph**: vertex set V can be partitioned into two subsets V1 and V2 such that each edge has one endpoint in V1 and the other endpoint in V2**matching**: a subset of the edges for which every vertex belongs to at most one of the edges**maximum matching**: a matching with the maximum number of edges**perfect matching**: a matching in which every vertex is matched

#### augmenting path

a structure such that reversing path direction can get one more matched edge

unmatched vertex --- matched vertex --- ... --- matched vertex --- unmatched vertex

how to find:

reverse the matched edges

start from an unmatched vertex, and try to find a path to another unmatched vertex

#### Hungarian algorithm

find the maximum matching for bipartite graphs in O(nm) time

find an augmenting path, reverse direction, get one more matched edge, and repeat

## Maximum Flow (Flow Network)

directed graph, with non-negative weight/capacity

find the max-flow from the source s to the sink t

#### Ford-Fulkerson method

greedy algorithm

the notion of augmenting path

here we overload the notion of augmenting path on a general directed graph (instead of bipartite)

it is called augmenting path because once found, we can get additional flow from source to sink

method: keep finding augmenting paths in the residual graph

start with finding an arbitrary path from s to t

reverse it, resulting in a residual graph

find a new path from s to t on the residual graph, and reverse it to generate a new residual graph

repeat until no paths can be found, then the sum of capacities of all found paths is the max-flow

It is called "method" but not "algorithm" because how to find augmenting path is missing

depending on the actual method to find augmenting path, the algorithm efficiency can differ

#### Ways to find augmenting paths in a general directed graph

#### Dinic's Algorithm

new concepts: level graph

construct level graph: BFS from source to every node and compute hops

then find augmenting paths on level graphs

proposed by Dinic [1970]; fastest in practice O(n^2 m)

on bipartite graph, this algorithm is referred to as Hopcroft-Karp algorithm [1973]

### Related Discussions on Flow Network

#### Min-cost Flow Problem

in a flow network, each edge also has a cost

the goal is to maximizing the flow while minimizing the cost

#### Relationship to Linear Programming

most combinatorial algorithms can be represented in a LP form (and solved by LP solver efficiently)

we can convert a network flow problem into a LP problem

by formulating edges into linear constraints

we can also convert a LP problem into a network flow problem

by adding a virtual node, for example

the flow property can promise the equality constraints

#### The dual of max matching: min vertex cover

vertex cover: a set of vertices that includes at least one endpoint of every edge

in any bipartite graph, maximum matching = minimum vertex cover [Konig and Egercary 1931]

#### The dual of max flow: min cut

cut: a subset of edges that disconnects the source and the sink

minimum cut: the cut with minimum sum of edge capacities

in a flow network, maximum flow = minimum cut [Menger 1927]

## Summary

MST algorithms: Prim’s, Kruskal’s

SSSP algorithms: Dijkstra’s, Bellman-Ford

SCC algorithms: Tarjan’s algorithm, BGSS algorithm

Bipartite graph matching: Hungarian algorithm

Max flow: Ford-Fulkerson method

## Example Code: Graph and BFS

Last updated