Gt Ga Notes Gr
Strongly Connected Components (GR1)
Here is a recap on DFS (for a undirected graph)
1
2
3
4
5
6
7
8
9
DFS(G):
input G=(V,E) in adjacency list representation
output: vertices labeled by connected components
cc = 0
for all v in V, visited(v) = False, prev(v) = NULL
for all v in V:
if not visited(v):
cc++
Explore(v)
Now lets define Explore
:
1
2
3
4
5
6
7
Explore(z):
ccnum(z) = cc
visited(z) = True
for all (z,w) in E:
if not visited(w):
Explore(w)
prev(w) = z
The ccnum
is the connected component number of
The overall running time is
What if our graph is now directed? We can still use DFS, but add pre or postorder numbers and remove the counters.
Notice hte difference nad adding of the clock
variable:
1
2
3
4
5
6
7
DFS(G):
clock = 1
for all v in V, visited(v) = False, prev(v) = NULL
for all v in V:
if not visited(v):
cc++
Explore(v)
Now lets define Explore
:
1
2
3
4
5
6
7
8
Explore(z):
pre(z) = clock;clock++
visited(z) = True
for all (z,w) in E:
if not visited(w):
Explore(w)
prev(w) = z
post(z) = clock; clock++
Here is an example:
Assuming we start at B, this is how our DFS looks like define as Node(pre,post):
Blue edge reflects that it is “used” during DFS but not used because the node has been visited.
There are various type of edges, for a given edge
- Treeedge such as
- where
, because of how you recurse back up during DFS
- where
- Back edges
- Edges that goes back up.
- Forward edges
- Same as the tree edges
- Cross edges
Notice that only for the back edges it is different in terms of the post order and behaves differently from the other edges.
Cycles
A graph
Proof:
Given
For the other direction, it is obvious, Consider the back edge
Toplogical sorting
Topologically sorting a DAG (directed acyclic graph that has no cycles): order vertices so that all edges go from lower
So, to do this, we can order vertices by decreasing post order number. Note that for this case, since we have
What are the valid Topological order? :
Note, for instance, XYUWZ is not a valid topological order! (Basically Z must come before W) - Because there’s a directed edge from Z to W, Z must come before W in any valid topological ordering of this graph.
Analogy: Imagine you’re assembling a toy. Piece Y is needed for both piece Z and piece W. But, piece Z is also needed for piece W. You must attach Y to Z first, then Z to W. Even though Y is needed for W, you don’t attach it directly to W.
DAG Structure
- Source vertex: no incoming edges
- Highest post order
- Sink vertex = no outgoing edges
- Lowest post order
By now, you are probably thinking, what does all this has to do with strongly connected component (SCC)? We will show that it is possible to do so with two DFS search.
Connectivity in DAG
Vertices
So, SCC defined as strongly connected component, is the maximal set of strongly connected vertices.
Example:
How many strongly connected components (SCC) does the graph has? 5
-
A
is a SCC by itself since it can reach many other nodes but no other nodes can reach A -
{H,I,J,K,L}
Can reach each other {C,F,G}
{B,E}
{D}
We can simplify the above graph to the following meta graph:
Notice that this meta graph is a DAG and it is always the case. This should be obvious because if two strongly connected components are involved in a cycle, then they will be combined to form a bigger SCC.
So, every directed graph is a DAG of it’s strongly connected components. You can take any graph, break it up into SCC and then topologically sort this SCC so that all edges go left to right.
Motivation
There are many ways we can do this, such as start with sinking vertices or source vertices. But, instead we can find the sink SCC, so, we find SCC S, output it, and remove it and repeat it. It turns out, Sinks SCC are easier to work with!
Recall we take any Explore(v)
, and we run explore from any of of the vertices in {H,I,J,K,L}
, we will explore these vertices and not any other because it is a sink SCC.
What if we find a vertex in the source component? You ended up exploring the whole graph, too bad! So, how can we be smart about this and find a vertex that lies in a sink component? Because if we can do so (somehow magically select a vertex in the sink SCC), then we are guaranteed to find all the nodes in the sink SCC.
So, how can we find such a vertex?
Recall that in a DAG, the vertex with the lowest postorder number is a sink.
In a directed directed G, can we use the same property? Does the property for a general graph, such that v with the lowest post order always lie in a sink SCC? HA of course its not true, do you think it will be that easy?
Notice that B has the lowest post order (3) but it belongs to the SCC {A,B}
.
What about the other way around? Does v with the highest post order always lie in a source SCC? Turns out, this is true! How can we make use of this? Simple, just reverse it! So the source SCC of the reverse graph is the sink SCC!
So, for directed
Example of SCC
lets consider the same graph but now we reverse it, and we start at node C
We start from c from the reverse graph {C, G, F, B, A, E}
, then we proceed to {D}
before going to {L}
. you may notice that the choosing of which vertex might be important, suppose you pick at any vertex at {H,I,J,K,L}
, it will be able to reach all vertices except {D}
so that starting vertex will still have the highest post number.
Then, we sort it and get this following order: L, K, J, I, H, D, C, B, E, A, G, F
. So now, we run DFS from the original graph starting at
- At first step, we start at
, so, we will reach{L,K,J,I,H}
label as 1 and strike them out. - We then visit
, and reach{D}
, label as 2 and strike them out - We do the same for
, reach{C,F,G}
label as 3 and strike them out - Then do the same for
{B,E}
label as 4 - and finally
{A}
label as 5
So, the L, K, J, I, H, D, C, B, E, A, G, F
is mapped to {1,1,1,1,1,2,3,4,4,5,3,3}
. Another interesting observation of the new labels {1,2,3,4,5}
, we have the following graph:
Notice that this metagraph, they go from {1,1,1,1,1,2,3,4,4,5,3,3}
also outputs the topological order in reverse order! So we can take any graph, run two iterations of DFS, finds its SCC, and structure these SCC in topological order.
SCC algorithm
1
2
3
4
5
6
SCC(G):
input: directed G=(V,E) in adjacency list
1. Construct G^R
2. Run DFS on G^R
3. Order V by decreasing post order number
4. Run undirected connected components alg on G based on the post order number
Proof of claim:
Given two SCC
The first case is if we start from
The second case is if we start
BFS & Dijkstras
DFS : connectivity
BFS:
input:
output: for all
Dijkstra’s:
input:
output:
Note, Dijkstra uses the min-heap (known as priority queue) that takes
2-Satisfiability (GR2)
Boolean formula:
variables with literals where- We use
for the and condition and for the or condition.
CNF
Now, we define CNF (conjunctive normal form):
Clause: OR of several literals
Notice that for F to be true, that means for each condition we need at least one literal to be true.
SAT
Input: formula f in CNF with
output: assignment (assign T or F to each variable) satisfying if one exists, NO if none exists.
Example:
And an example that will work is
K-SAT
For K sat, the input is formula f in CNF with
- SAT is NP-complete
- K-SAT is NP complete
- Poly-time algorithm using SCC for 2-SAT
For example consider the following input f for 2-SAT:
We want to simplify unit-clause which is a clause with 1 literal such as
- Take a unit clause say literal
- Satisfy it (set
) - Remove clauses containing
and drop - let
be the resulting formula
For example:
So, the original
SAT-graph
Take
vertices corresponding to edges corresponding to 2 “implications” per clause
Consider the following example:
- Notice that if we set
, and likewise
In general given
If we observe the graph, we notice that there is a path from
In general:
- If for some
, are in the same SCC, then is not satisfiable. - If for some
, are in different SCC, then is satisfiable.
2-SAT Algo
- Take source scc
and set - Take sink scc
and set - When we done this, then we can remove all these literals from the graph!
This works because of a key fact: if
1
2
3
4
5
6
2SAT(F):
1. Construct graph G for f
2. Take a sink SCC S
- Set S = T ( and bar(S) = F)
- remove S, bar(S)
- repeat until empty
Proof:
The first claim is we show that path
Take path
Recall that
Using this claim, we can show 2 more things:
If
It remains to show that
We take a sink SCC S, for
MST (GR3)
For the minimum spanning tree, we are going to go through the krusal algorithm but mainly focus on the correctness of it. Note that krusal algorithm is a greedy algorithm which makes use of the cut property. This property is also useful in proving prim’s algorithm.
MST algorithm
Given: undirected
Goal: find minimum size, connected subgraph of minimum weight. This connected subgraph is known as the spanning tree (refer this as T). So we want
1
2
3
4
5
6
7
8
Kruskals(G):
input: undirected G = (V,E) with weights w(e)
1. Sort E by increasing weight
2. Set X = {null}
3. For e=(v,w) in E (in increasing order)
if X U e does not have a cycle:
X = X U e (U here denotes union)
4. Return X
Runtime analysis:
- Step one takes
time where- This was a little confusing to me and why the lecture said
. It turns out that the max of is for a fully connected graph,so .
- This was a little confusing to me and why the lecture said
- For this we can make use of union-find data structure.
- Let
be the component containing in - Let
be the component containing in - We check if
(by them having different representative), then add to . - We then apply union to both of them.
- Let
- The union-find data structure takes
time- Since we are doing it for all edges, then
.
- Since we are doing it for all edges, then
Cut property
To prove the correctness, we first need to define the cut property:
In other words, the cut of a graph is a set of edges which partition the vertices into two sets. In later part, we will look at problems such as minimum/maximum cut to partition the graphs into two components.
The core of the proof is:
- Use induction, and assume that
where for a MST . The claim is when we add an edge from , we form another MST .
Proof outline
So, we need to consider two cases, if
- If
, our job is done, as there is nothing to show. - If
(such as the diagram above), then we modify in order ot add edge and construct a new MST
Next, we show that
- Remember if a tree with size
has edges then it must be connected.
- Actually, it turns out that
, otherwise it would contradict the fact that is a MST.
Prim’s algorithm
MST algorithm is akin tio Dijkstra’s algorithm, and use the cut property to prove correctness of Prim’s algorithm.
The prim’s algorithm selects the root vertex in the beginning and then traverses from vertex to vertex adjacently. On the other hand, Krushal’s algorithm helps in generating the minimum spanning tree, initiating from the smallest weighted edge.