Algorithms-Advanced-Union-Find-notes

Lazy Unions

The Union-Find Data Structure
FIND: Given $x \in X$, return the name of x’s group.
UNION: Given x & y, merge groups containing them.
Previous solution (for Kruskal’s MST algorithm):
$-$ Each $x \in X$ points directly to the “leader” of its group.
$-$ O(1) FIND [just return x’s leader]
$-$ $O(n \log n)$ total work for n UNIONs [when 2 groups merge,
the smaller group inherits the leader of the larger one]
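The eager scheme above can be sketched roughly as follows. This is a minimal illustration, not the notes' own code: the class name `EagerUnionFind` and the `members` lists are assumptions added here so that the smaller group can be relabeled.

```python
class EagerUnionFind:
    """Eager union-find: every object points directly at its group's leader."""

    def __init__(self, n):
        self.leader = list(range(n))            # leader[x] = name of x's group
        self.members = [[x] for x in range(n)]  # members[l] = objects led by l (assumed helper)

    def find(self, x):
        return self.leader[x]                   # O(1): a single lookup

    def union(self, x, y):
        a, b = self.leader[x], self.leader[y]
        if a == b:
            return                              # already in the same group
        if len(self.members[a]) < len(self.members[b]):
            a, b = b, a                         # a is now the leader of the larger group
        for z in self.members[b]:               # smaller group inherits leader a
            self.leader[z] = a
        self.members[a].extend(self.members[b])
        self.members[b] = []


uf = EagerUnionFind(6)
uf.union(0, 1)
uf.union(2, 3)
uf.union(1, 3)
print(uf.find(3) == uf.find(0), uf.find(4) == uf.find(0))  # True False
```

An object's leader changes only when its group is the smaller one, so its group at least doubles each time; that is where the $O(n \log n)$ total comes from.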
Lazy Unions
New idea: Update only one pointer each merge.
In array representation:
(where $A[i]$ = name of $i$'s parent)
How to Merge
In general: When two groups merge in a UNION, make one group’s leader
[root of the tree] a child of the other one.
Pro: UNION reduces to 2 FINDs [r1 = FIND(x), r2 = FIND(y)] and $O(1)$ extra work [link r1, r2 together].
Con: To recover the leader of an object, need to follow a path of parent pointers [not just one].
$\Rightarrow$ Not clear if FIND still takes $O(1)$ time.

Union-Find (Union by Rank)
The Lazy Union Implementation
New implementation:
Each object $x \in X$ has a parent field.
Invariant: Parent pointers induce a collection of directed trees on X.
(x is a root $\Leftrightarrow$ parent[x] = x)
Initially: For all x, parent[x] = x;
FIND(x): Traverse parent pointers from x until you hit the root.
UNION(x,y): $s_1$ = FIND(x); $s_2$ = FIND(y); reset the parent of one of $s_1, s_2$ to be the other.
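A minimal sketch of this lazy implementation, assuming objects are numbered 0..n-1 and that UNION always hangs $s_1$ under $s_2$ (the notes leave that choice open). The class name `LazyUnionFind` is illustrative only.

```python
class LazyUnionFind:
    """Lazy union-find: one parent pointer changes per UNION."""

    def __init__(self, n):
        self.parent = list(range(n))   # parent[x] == x  <=>  x is a root

    def find(self, x):
        while self.parent[x] != x:     # follow parent pointers up to the root
            x = self.parent[x]
        return x

    def union(self, x, y):
        s1, s2 = self.find(x), self.find(y)
        if s1 != s2:
            self.parent[s1] = s2       # arbitrary choice: s1 becomes a child of s2


uf = LazyUnionFind(4)
uf.union(0, 1)
uf.union(1, 2)
uf.union(2, 3)
print(uf.parent)    # [1, 2, 3, 3]
print(uf.find(0))   # 3
```

With this unlucky sequence the tree degenerates into the chain 0 → 1 → 2 → 3, so FIND(0) follows three pointers. This is the sense in which FIND is no longer obviously $O(1)$.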
Union by rank
Ranks: For each $x\in X$ , maintain field rank[x].
[In general rank[x] = 1+ (max rank of x’s children)]
Invariant (for now): For all $x \in X$, rank[x] = maximum number of hops from some leaf to x.
[Initially, rank[x] = 0 for all $x \in X$ ]
To avoid scraggly trees: given x & y,
$-$ $s_1$ = FIND(x), $s_2$ = FIND(y)
$-$ If rank[$s_1$] > rank[$s_2$], then set parent[$s_2$] to $s_1$;
else set parent[$s_1$] to $s_2$.
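The rule above can be sketched as follows. The class name `RankedUnionFind` is an assumption, and the sketch includes the tie rule stated elsewhere in these notes: when the two roots have equal rank, the new root's rank goes up by 1.

```python
class RankedUnionFind:
    """Lazy union-find with union by rank (no path compression)."""

    def __init__(self, n):
        self.parent = list(range(n))
        self.rank = [0] * n                  # initially rank[x] = 0 for all x

    def find(self, x):
        while self.parent[x] != x:
            x = self.parent[x]
        return x

    def union(self, x, y):
        s1, s2 = self.find(x), self.find(y)
        if s1 == s2:
            return
        if self.rank[s1] > self.rank[s2]:
            self.parent[s2] = s1             # lower-rank root goes under higher-rank root
        else:
            self.parent[s1] = s2
            if self.rank[s1] == self.rank[s2]:
                self.rank[s2] += 1           # tie: the new root's rank increases by 1


uf = RankedUnionFind(8)
for a, b in [(0, 1), (2, 3), (4, 5), (6, 7), (1, 3), (5, 7), (3, 7)]:
    uf.union(a, b)
print(max(uf.rank))  # 3
```

In this example every UNION is a tie, which is the fastest way to grow ranks, and the maximum rank still reaches only 3 for n = 8.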

Properties of Ranks
Recall: Lazy Unions.
Invariant (for now): rank[x] = max # of hops from a leaf to x.
[Note: $\max_x \text{rank}[x] \approx$ worst-case running time of FIND.]
Union by Rank: Make old root with smaller rank child of the root with larger rank.
[Choose new root arbitrarily in case of a tie, and add 1 to its rank.]
Immediate from Invariant/Rank Maintenance:
(1) For all objects x, rank[x] only goes up over time.
(2) Only ranks of roots can go up.
[Once x is a non-root, rank[x] is frozen forevermore.]
(3) Ranks strictly increase along a path to the root.
Rank Lemma
Rank Lemma: Consider an arbitrary sequence of UNION (+ FIND)
operations. For every $r \in \{0, 1, 2, \ldots\}$, there are at most $\frac{n}{2^r}$ objects with rank $r$.
Corollary: Max rank is always $\leq \log_2 n$.
Corollary: Worst-case running time of FIND, UNION is $O(\log n)$.
Proof of Rank Lemma:
Claim 1: If x, y have the same rank $r$ , then their subtrees are disjoint.
Claim 2: The subtree of a rank-r object has size $\geq 2^r$ .
[Note Claim 1 + Claim 2 imply the Rank Lemma].
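Why the two claims suffice can be spelled out in one line. As a sketch, write $k_r$ for the number of objects with rank $r$ and subtree(x) for x together with its descendants (both notations are introduced here, not taken from the lecture). By Claim 1 the subtrees of the rank-$r$ objects are pairwise disjoint, and by Claim 2 each contains at least $2^r$ of the $n$ objects, so

$$ k_r \cdot 2^r \;\le\; \sum_{x:\ \text{rank}[x] = r} |\text{subtree}(x)| \;\le\; n \quad\Longrightarrow\quad k_r \le \frac{n}{2^r}. $$

The max-rank corollary follows the same way: if some object has rank $r$, then $1 \le k_r \le \frac{n}{2^r}$, hence $2^r \le n$ and $r \le \log_2 n$.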

Path Compression
Idea: Why bother traversing a leaf-root path multiple times?
Path compression: After FIND(x), install shortcuts (i.e., revise parent pointers)
to x’s root all along the x $\rightarrow$ root path.
Con: Constant-factor overhead to FIND
Pro: Speeds up subsequent FINDs.
On Ranks
Important: Maintain all rank fields EXACTLY as without path compression.
$-$ Rank initially all 0.
$-$ In UNION, new root = old root with bigger rank.
$-$ When merging two roots of common rank $r$, reset the new root’s rank to $r+1$.
Bad news: Now rank[x] is only an upper bound on the maximum number of hops on a path from a leaf to x.
Good news: Rank Lemma still holds ($\leq \frac{n}{2^r}$ objects with rank r).
Also: Still always have rank[parent[x]] > rank[x] for all non-roots x.
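A minimal sketch combining union by rank with path compression. The class name `UnionFind` and the two-pass form of FIND are choices made here for illustration (a recursive FIND would work equally well). Ranks are maintained exactly as without compression: FIND never touches them.

```python
class UnionFind:
    """Union by rank + path compression."""

    def __init__(self, n):
        self.parent = list(range(n))
        self.rank = [0] * n

    def find(self, x):
        root = x
        while self.parent[root] != root:     # pass 1: locate the root
            root = self.parent[root]
        while x != root:                     # pass 2: point every node on the path at the root
            nxt = self.parent[x]
            self.parent[x] = root
            x = nxt
        return root                          # rank fields are left untouched

    def union(self, x, y):
        s1, s2 = self.find(x), self.find(y)
        if s1 == s2:
            return
        if self.rank[s1] > self.rank[s2]:
            self.parent[s2] = s1
        else:
            self.parent[s1] = s2
            if self.rank[s1] == self.rank[s2]:
                self.rank[s2] += 1


uf = UnionFind(8)
for a, b in [(0, 1), (2, 3), (4, 5), (6, 7), (1, 3), (5, 7), (3, 7)]:
    uf.union(a, b)
print(uf.parent)    # [1, 3, 3, 7, 5, 7, 7, 7]
uf.find(0)
print(uf.parent)    # [7, 7, 3, 7, 5, 7, 7, 7]
print(uf.rank[7])   # 3
```

After FIND(0) the root 7 still has rank 3, but the longest leaf-to-root path now has only 2 hops (2 → 3 → 7 or 4 → 5 → 7). That illustrates why rank becomes only an upper bound once paths are compressed.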
Hopcroft-Ullman Theorem
Theorem: With Union by Rank and path compression, m UNION + FIND operations take $O(m \log^* n)$ time, where $\log^* n$ = the number of times you need to apply log to n before the result is $\leq 1$.
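To get a feel for how slowly $\log^* n$ grows, the definition can be evaluated directly. This sketch assumes the logarithm is base 2 (the theorem statement above does not fix a base); the function name `log_star` is illustrative.

```python
import math


def log_star(n):
    """Number of times log2 must be applied to n before the result is <= 1."""
    count = 0
    while n > 1:
        n = math.log2(n)   # assumption: base-2 logarithm
        count += 1
    return count


print([log_star(n) for n in (1, 2, 4, 16, 65536)])  # [0, 1, 2, 3, 4]
```

Each step up in $\log^*$ requires exponentiating the previous threshold ($2 \to 4 \to 16 \to 65536 \to 2^{65536}$), so for any input size that arises in practice the factor is at most 5.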
