Union-Find Data Structure

API and Client

A crucial first step in developing a data structure is to define its public API. For the Union-Find data type, we need to specify the operations it will support.

Public API

public class UF {
    // Initializes a union-find data structure with N objects
    // i.e (0 to N-1)
    UF(int N)
    
    // Adds a connection between p and q
    void union(int p, int q)
    
    // Returns the component identifier for p (0 to N-1)
    int find(int p)
    
    // Returns true if p and q are in the same component
    boolean connected(int p, int q)
}

A 1-line implementation of connected() can be achieved by checking if find(p) is equal to find(q).

Example Client

Here is a sample client that uses the UF data type to solve a dynamic connectivity problem. It reads an integer N and a sequence of pairs of integers, and prints the pairs that are not yet connected.

public static void main(String[] args) {
    int N = StdIn.readInt();
    UF uf = new UF(N);
    
    while (!StdIn.isEmpty()) {
        int p = StdIn.readInt();
        int q = StdIn.readInt();
        
        if (!uf.connected(p, q)) {
            uf.union(p, q);
            StdOut.println(p + " " + q);
        }
    }
}

This client can be tested with a file like tinyUF.txt:

The Idea of Connected Components

The core idea behind the Union-Find data structure is to keep track of a set of objects partitioned into a number of disjoint (non-overlapping) sets, which we call connected components.

The “is connected to” relation has three key properties:

Reflexive: p is connected to p.
Symmetric: If p is connected to q, then q is connected to p.
Transitive: If p is connected to q and q is connected to r, then p is connected to r.

A connected component is a set of objects where every object in the set is connected to every other object in the set.

Overview

Union−Find is a data structure that supports dynamic connectivity problems. This implementation explores:

Quick Find
Quick Union
Weighted Quick Union
Weighted Quick Union with Path Compression

We’ll apply these concepts to solve the percolation problem from physical chemistry.

Quick Find (Eager Approach)

In this approach, p and q are considered connected if and only if id[p] is equal to id[q].

Data Structure: An integer array id[] of size N.
Interpretation: p and q are in the same component if they have the same ID.

Quick Union (Lazy Approach)

This method is an improvement over Quick Find, designed to make the union operation faster.

Data Structure: An integer array id[] of size N.
Interpretation: id[i] is the parent of i. The root of a component is an element i where id[i] == i.
Find: To find the root of p, we chase the parent pointers until we reach a root.
Union: To merge components containing p and q, we set the id of p’s root to the id of q’s root.

Improvement 1: Weighted Quick-Union

To avoid the problem of tall trees in Quick Union, we use a weighted approach.

Strategy: Always link the root of the smaller tree to the root of the larger tree.
Data Structure: We maintain an extra array sz[i] to count the number of objects in the tree rooted at i.

Implementation:

int i = root(p);
int j = root(q);
if (i == j) return;
if (sz[i] < sz[j]) { id[i] = j; sz[j] += sz[i]; } 
else               { id[j] = i; sz[i] += sz[j]; }

Proposition: The depth of any node x is at most lg N.

Improvement 2: Path Compression

After finding the root of p, we can flatten the tree by making every other node in the path from p to the root point to its grandparent. This nearly flattens the tree.

Analysis of Tree Depth

The maximum depth of any node x is at most lg N (where lg is log base 2)

Proof

When tree T1 containing x merges with T2:
- The size of tree containing x at least doubles
- This occurs because x’s tree is always the smaller one
Depth only increases by 1 when merging with same-size tree

Depth Growth Pattern

Tree Size	Maximum Depth
1	0
2	1
4	2
8	3
16	4
32	5
N	lg(N)

Performance

Weighted Quick Union with Path Compression is linearithmic: O(N + M lg N)

N = number of objects
M = number of union-find operations

For detailed proof, see: Princeton’s Algorithms Course

Posted By Vaibhav Deokar

In Algorithms

On Jul 2, 2025

Tags Union-Find, Data Structures, Connectivity, Java

Previous Getting Started with Algorithms, Part I

Next Percolation