# Graph

## Graph Representation

![](/files/-Mcf6Ymr02o1OaYHVQBb)

## Graph Problems and Algorithms

![](/files/-Mcf9Fdocg84G5Q_Gg4n)

In addition, we also have planar graphs, dynamic graphs, etc.

### Traversal-based Algorithms Review

* Breadth-first Search (BFS)
* Depth-first Search (DFS)
* Topological Sort
* Graph Connectivity
  * for each vertex, if not visited, run BFS/DFS on it
* Algorithms based on DFS
  * Bi-connectivity, articulation points, bridges (CLRS pp. 621-622)
  * Topological sort (CLRS pp. 612-614)
  * Strongly connected components (CLRS pp. 615-618)

## Minimum Spanning Tree (MST)

* undirected weighted graph
* find n-1 edges that connected all vertices with minimum total weights

#### Light-edge property

* The lightest edge in a cut must be in the MST

#### Prim's Algorithm

* start from an arbitrary vertex, add the lightest connected edge to the tree, and repeat
* this is a greedy algorithm
* implementation can be array-based, binary heap, or Fibonacci heap (CLRS Section 19)

![](/files/-McfPe4u6vzUY6nM6hZe)

#### Kruskal’s Algorithm

* scan edges by weight from lowest to highest, add the edge that doesn't introduce cycle, and repeat
* this is a greedy algorithm
* data structure: weighted edge list O(m log m) = O(m log n)

![](/files/-McfQJy-kazTgwZy5ljc)

```c
// linked forest + path compression 
find(x)     { return label[x] = x==label[x] ? x : find(label[x]); }
union(x, y) { if (find(x) != find(y)) label[label[y]] = label[x]; }
test(x, y)  { return (find(x) == find(y)); }
```

## Single-Source Shortest Paths (SSSP)

* undirected graph, or directed acyclic graph (DAG)
* find a shortest path from a given source vertex u to each vertex v

#### Bellman-Ford Algorithm

* dynamic programming algorithm O(nm)
* can work for negative weight
* $$D\_{i,k} = \min \begin{cases} D\_{i, k-1} \ \min\_{(j, i)\in E} {D\_{j, k-1} + w(j, i)}\end{cases}$$&#x20;

```cpp
// Bellman-Ford Algorithm
for i=1 to n do
  D[i]=MAX
D[s]=0
for k=1 to n-1 do
  for each (i,j) in E do
    if (D[i]+w(i,j)<D[j])
      D[j]=D[i]+w(i,j), from[j]=i
// optional optimization: if no distance is updated, break
```

#### Dijkstra's Algorithm

* start from an arbitrary node, expand the neighbor with the lowest cost, and repeat
* greedy algorithm (similar to Prim's)
* for positive weight only
* implementation: array-based O(n^2), binary heap O((m+n) log n), or Fibonacci heap (m + n log n)

#### All-Pairs Shortest Paths (APSP)

* find a shortest path from u to v for every pair of vertices u and v
* although we can solve this by running a single-source algorithm once from each vertex, we usually can solve it faster (e.g., by Floyd-Warshall algorithm)

![Relax operation: update (u, v) path by weight w if better (CLRS pp. 648)](/files/-Mck-umC7n33WCPmo4AM)

## Strongly Connected Components (SCC)

* directed graph
* strongly connected: if every vertex is reachable from every other vertex
* find a partition such that each subgraph is strongly connected

#### Tarjan’s algorithm

* is a DFS-based linear-time algorithm O(n + m)
* kind of simple to implement, but counter-intuitive
* hard to parallel
  * DFS is P-Complete (DFS is inherently sequential \[Reif 1985])

#### Reachability-based algorithm

* find SCC by reachability queries O(m log n)
  * what is the set of vertices that can be reached by a given vertex v
  * which vertices can reach v
  * the intersection of two sets form an SCC
* can be parallelized (BGSS 2016, Yan Gu, [code](https://github.com/ParAlg/gbbs/tree/master/benchmarks/StronglyConnectedComponents/BGSS16-Filtering))
* [benchmark suite for graph algorithms](https://github.com/ParAlg/gbbs)

![](/files/-McmcyejMVYpEGfocY4p)

## Bipartite Graph Matching

* **bipartite graph**: vertex set V can be partitioned into two subsets V1 and V2 such that each edge has one endpoint in V1 and the other endpoint in V2
* **matching**: a subset of the edges for which every vertex belongs to at most one of the edges
* **maximum matching**: a matching with the maximum number of edges
* **perfect matching**: a matching in which every vertex is matched

#### augmenting path

* a structure such that reversing path direction can get one more matched edge
  * unmatched vertex --- matched vertex --- ... --- matched vertex --- unmatched vertex&#x20;
* how to find:
  * reverse the matched edges
  * start from an unmatched vertex, and try to find a path to another unmatched vertex

#### Hungarian algorithm

* find the maximum matching for bipartite graphs in O(nm) time
* find an augmenting path, reverse direction, get one more matched edge, and repeat

```cpp
// a DFS-based implementation
// g[i][j] is the graph, p[j]=i stores the matched edge, 
// and b[j] is a temp boolean array
bool find(i)
  for j=1 to m do
    if g[i][j] == 1 and b[j] == false
      b[j] = true
      if p[j] == -1 or find(p[j])
        p[j]=i, return true
return false
Hungarian()
  p[1..n] = -1
  for i=1 to n do
    b[1..n] = false, find(i)
```

![](/files/-Mcmpb2Mav3E0wFx-4PH)

## Maximum Flow (Flow Network)

* directed graph, with non-negative weight/capacity
* find the max-flow from the source s to the sink t

#### Ford-Fulkerson method

* greedy algorithm
* the notion of augmenting path
  * here we overload the notion of augmenting path on a general directed graph (instead of bipartite)
  * it is called augmenting path because once found, we can get additional flow from source to sink
* method: keep finding augmenting paths in the residual graph
  * start with finding an arbitrary path from s to t&#x20;
  * reverse it, resulting in a residual graph
  * find a new path from s to t on the residual graph, and reverse it to generate a new residual graph
  * repeat until no paths can be found, then the sum of capacities of all found paths is the max-flow
* It is called "method" but not "algorithm" because how to find augmenting path is missing
  * depending on the actual method to find augmenting path, the algorithm efficiency can differ

![](/files/-Mcmrq1nJ_3D0o2WSNR4)

![](/files/-McmtQwq4s9oHgu4G1KI)

#### Ways to find augmenting paths in a general directed graph

![](/files/-McmpCkINEd2RC7Uavsq)

#### Dinic's Algorithm

* new concepts: level graph
  * construct level graph: BFS from source to every node and compute hops
  * then find augmenting paths on level graphs
* proposed by Dinic \[1970]; fastest in practice O(n^2 m)
* on bipartite graph, this algorithm is referred to as Hopcroft-Karp algorithm \[1973]

### Related Discussions on Flow Network

#### Min-cost Flow Problem

* in a flow network, each edge also has a cost
* the goal is to maximizing the flow while minimizing the cost

#### Relationship to Linear Programming

* most combinatorial algorithms can be represented in a LP form (and solved by LP solver efficiently)
* we can convert a network flow problem into a LP problem
  * by formulating edges into linear constraints
* we can also convert a LP problem into a network flow problem
  * by adding a virtual node, for example
  * the flow property can promise the equality constraints

#### The dual of max matching: min vertex cover

* vertex cover: a set of vertices that includes at least one endpoint of every edge
* in any bipartite graph, maximum matching = minimum vertex cover  \[Konig and Egercary 1931]

#### The dual of max flow: min cut

* cut: a subset of edges that disconnects the source and the sink
* minimum cut: the cut with minimum sum of edge capacities
* in a flow network, maximum flow = minimum cut  \[Menger 1927]

## Summary

* MST algorithms: Prim’s, Kruskal’s
* SSSP algorithms: Dijkstra’s, Bellman-Ford
* SCC algorithms: Tarjan’s algorithm, BGSS algorithm
* Bipartite graph matching: Hungarian algorithm
* Max flow: Ford-Fulkerson method

## Example Code: Graph and BFS

```cpp
#include <iostream>
#include <vector>
#include <array>
#include <list>
#include <queue>
#include <stack>
#include <set>
#include <map>
#include <unordered_set>
#include <unordered_map>
#include <cmath>
#include <functional>


class MyGraph { 
  private:
    // indices of vertices may not be consecutive
    std::set<int> vertices;                  // std::set, std::map -> O(log(n)) using red-black tree
    std::map<int, std::list<int>> edgeLists; // std::unordered_set, unordered_map -> O(1) using hashing
                                             // switching to unordered_set/map can speed up 600ms --> 400ms
  public:
    MyGraph(){};
    ~MyGraph(){};
    void addEdge(int src, int dest);
    void printGraph();
    void BFS(int startVertex, int& endVertex, int& depth);
};

void MyGraph::addEdge(int src, int dest) {
  vertices.insert(src);
  vertices.insert(dest);
  edgeLists[src].push_back(dest);
  edgeLists[dest].push_back(src);  // undirected graph 
}

void MyGraph::printGraph(){
  std::cout << "vertices:" << std::endl;
  for (const auto& v : vertices)
    std::cout << v << " ";
  std::cout << std::endl;
  std::cout << "edges in adjacency lists:" << std::endl;
  for (const auto& v : vertices){
    std::cout << v << ": ";
    for (const auto& e : edgeLists[v])
      std::cout << e << " "; 
    std::cout << std::endl;
  }
  std::cout << std::endl;
}

void MyGraph::BFS(int startVertex, int& endVertex, int& depth) {
  std::unordered_map<int, bool> visited (vertices.size());  // default to false
  std::unordered_map<int, int> distance (vertices.size());
  std::vector<int> visited_list;
  std::queue<int> queue;
  visited[startVertex] = true;
  distance[startVertex] = 0;
  queue.push(startVertex);
  while (!queue.empty()) {
    int currVertex = queue.front();
    queue.pop();
    visited_list.push_back(currVertex);
    for (const auto& adjVertex : edgeLists[currVertex])
      if (!visited[adjVertex]) {
        visited[adjVertex] = true;
        distance[adjVertex] = distance[currVertex] + 1;
        queue.push(adjVertex);
      }
  }
  endVertex = visited_list.back();
  depth = distance[endVertex];
}

int main(){
  int n;
  std::cin >> n; 
  MyGraph graph;
  int i, j;
  for (int k = 0; k < n-1; ++k){
    std::cin >> i >> j;
    graph.addEdge(i, j);
  }

  // run BFS twice to find the diameter of the graph
  // can be proved by triangle inequality
  int startVertex, endVertex, depth;
  startVertex = i;
  graph.BFS(startVertex, endVertex, depth);

  startVertex = endVertex;
  graph.BFS(startVertex, endVertex, depth);

  std::cout << (depth+1)/2 << std::endl;  // ceiling

  return 0;
}

```


---

# Agent Instructions: Querying This Documentation

If you need additional information that is not directly available in this page, you can query the documentation dynamically by asking a question.

Perform an HTTP GET request on the current page URL with the `ask` query parameter:

```
GET https://wiki.hanzheteng.com/algorithm/cs/graph.md?ask=<question>
```

The question should be specific, self-contained, and written in natural language.
The response will contain a direct answer to the question and relevant excerpts and sources from the documentation.

Use this mechanism when the answer is not explicitly present in the current page, you need clarification or additional context, or you want to retrieve related documentation sections.
