Finding the lowest common ancestor (LCA) using Euler tour and sparse table
Algorithms #algorithms #trees #lcaEuler tour
During an Euler tour, each node is added to the tour as it is visited (either moving from parent to child or returning from child to parent). The tour starts at the root node and reaches back to it after visiting all the vertices.
Euler tour requires
2*N-1
vertices to store the tour.
Lowest common ancestor (LCA)
The lowest common ancestor (LCA) of two nodes in a tree is the lowest (i.e., deepest) node that is a common ancestor of both nodes.
In other words, it is the node that is furthest from the root of the tree and still has both nodes as descendants.
One common approach to solving this problem is to use the Euler tour of the tree.
By storing the indices of the first and last occurrence of each node in the Euler tour,
it is possible to determine whether a given node is an ancestor of another node and to
find the LCA of two nodes in O(1)
time.
Algorithm
The LCA problem can be reduced to Range Minimum Query (RMQ) problem. With RMQ, LCA can be found in constant
time O(1)
after doing some preprocessing (creating sparse table) which takes O(n lon n)
time.
Preprocessing time can be improved to O(n) with Farach-Colton and Bender optimization.
Steps
The algorithm can be split into two steps
Step 1
- Ensure all nodes are uniquely indexed e.g. nodes can be assigned id from
0
toN-1
- Start by creating euler tour of the tree
- While creating euler tour, maintain a separate array/list to keep track of the depth of each node (on every visit) in the tour
Step 2
- Create sparse table from depth array for RMQ. It should return the
index
of minimum value - Pick indices of the nodes from euler tour whose LCA needs to be found
- To get the LCA of the nodes, find the minimum value in the depth array between these indices. The returned value is the index of the LCA in the Euler tour
The steps are illustrated below
Illustration Step 1. Find euler tour and track depth for each node in the tour
Illustration Step 2. Find LCA of two nodes
We want to find LCA of (3, 2)
in the tree give above. The euler tour and depth is already tracked
in the table above.
Example
- Based on the tree given above, this script attempts to find the LCA of
(12, 14)
,(6, 8)
&(12, 2)
Check out the following articles for details on RMQ using sparse tables and array representation of a binary tree.
import kotlin.math.floor
import kotlin.math.log
import kotlin.math.max
import kotlin.math.min
//sampleStart
fun main() {
// click + to see full script
// a binary tree represented as an array
val tree = intArrayOf(4, 9, 2, 10, 0, 6, 3, 13, 12, 5, 14, 7, 1, 8, 11).toTypedArray()
val (tourNodesDepth, eulerTour, lastIndexOfEulerNode) = createEulerTourAndDepthArray(tree)
val spareTable = SpareTable(tourNodesDepth)
// get lowest common ancestor
fun getLca(node1: Int, node2: Int): Int {
val index = spareTable.queryMinIndex(
lastIndexOfEulerNode[node1]!!,
lastIndexOfEulerNode[node2]!!
)
return eulerTour[index]
}
println("lca 12,14 = ${getLca(12, 14)}")
println("lca 6,8 = ${getLca(6, 8)}")
println("lca 12,2 = ${getLca(0, 6)}")
}
//sampleEnd
data class Result(
val tourNodesDepth: Array<Int>,
val eulerTour: ArrayList<Int>,
val lastIndexOfEulerNode: Map<Int, Int>
) {
// todo: override equals and hashcode
}
fun createEulerTourAndDepthArray(tree: Array<Int>): Result {
var depthIndex = 0
var eulerIndex = 0
// no. of nodes in euler tour = (2 * no. of nodes) - 1
val noOfEulerNodes = tree.size.shl(1).minus(1)
val tourNodesDepth = Array(noOfEulerNodes) { 0 }
val eulerTour = ArrayList<Int>(noOfEulerNodes)
/**
* keep track of last index of a node appeared in euler tour. e.g. if
* tour produces [0, 1, 2, 0, 1], the index of node 1 stored in this
* map will be 4.
*/
val lastIndexOfEulerNode = mutableMapOf<Int, Int>()
fun dfs(startNodeIndex: Int, level: Int) {
// leftChildIndex = (startNodeIndex * 2) + 1
val leftChildIndex = startNodeIndex.shl(1).plus(1)
// rightChildIndex = (startNodeIndex * 2) + 2
val rightChildIndex = startNodeIndex.shl(1).plus(2)
tourNodesDepth[depthIndex++] = level
eulerTour.add(tree[startNodeIndex])
lastIndexOfEulerNode[tree[startNodeIndex]] = eulerIndex++
if (tree.getOrNull(leftChildIndex) != null) {
dfs(leftChildIndex, level + 1)
eulerTour.add(tree[startNodeIndex])
tourNodesDepth[depthIndex++] = level
lastIndexOfEulerNode[tree[startNodeIndex]] = eulerIndex++
}
if (tree.getOrNull(rightChildIndex) != null) {
dfs(rightChildIndex, level + 1)
eulerTour.add(tree[startNodeIndex])
tourNodesDepth[depthIndex++] = level
lastIndexOfEulerNode[tree[startNodeIndex]] = eulerIndex++
}
}
dfs(0, 0)
return Result(
tourNodesDepth = tourNodesDepth,
eulerTour = eulerTour,
lastIndexOfEulerNode = lastIndexOfEulerNode
)
}
class SpareTable(private val arr: Array<Int>) {
private val table: Array<Array<Int>>
private val indexTable: Array<Array<Int>>
init {
val size = arr.size
val rows = floor(log(size.toDouble(), 2.0)).toInt().plus(1)
table = Array(rows) { Array(arr.size) { 0 } }
indexTable = Array(rows) { Array(arr.size) { it } }
System.arraycopy(arr, 0, table[0], 0, arr.size)
for (row in 1 until rows) {
var colum = 0
while (colum.plus(1.shl(row)) <= size) {
val left = table[row - 1][colum]
val right = table[row - 1][colum.plus(1.shl(row - 1))]
table[row][colum] = min(left, right)
if (table[row][colum] == left) {
indexTable[row][colum] = indexTable[row - 1][colum]
} else {
indexTable[row][colum] = indexTable[row - 1][colum.plus(1.shl(row - 1))]
}
colum++
}
}
}
fun queryMinIndex(a: Int, b: Int): Int {
val left = min(a, b)
val right = max(a, b)
val i = floor(log((right.minus(left).plus(1)).toDouble(), 2.0)).toInt()
val rightIndex = right.minus(1.shl(i)).plus(1)
return if (table[i][left] < table[i][rightIndex]) {
indexTable[i][left]
} else {
indexTable[i][rightIndex]
}
}
}
Conclusion
There are also several other algorithms for finding the LCA of two nodes in a tree,
such as the binary lifting
algorithm and the Tarjan's off-line
lowest common ancestors
algorithm. These algorithms have different time and space complexities,
but they generally have a time complexity of O(log n)
or O(1)
after a pre-processing step.
Usage
The LCA problem is a fundamental problem in graph theory and has many applications, such as in computing the distance between nodes in a tree or in constructing efficient data structures for storing and querying trees.
One example of a data structure that uses the LCA problem is the binary indexed tree (BIT),
also known as the Fenwick tree. A BIT is a data structure that allows efficient updates and
queries of a dynamic array. The basic idea is to represent the array as a tree, where each
node represents the sum of a range of values in the array. The LCA problem is used to
compute the range of values that need to be updated or queried at each step,
allowing the BIT to perform updates and queries in O(log n)
time, where n is the size
of the array.
Another example of a data structure that uses the LCA problem is the lowest common
ancestor binary search tree (LCA-BST), also known as the level ancestor data structure.
An LCA-BST is a binary search tree that is augmented with additional information to allow
efficient LCA queries. Specifically, each node in the tree stores a pointer to its kth ancestor,
where k is a power of two. The LCA problem is used to compute the kth ancestor of a node,
allowing the LCA-BST to perform LCA queries in O(log n)
time and space, where n is the number
of nodes in the tree.