Heap data structure, heap sort , heapify and priority queues
Data-structures #heap #sorting #data-structuresBinary Tree
Binary tree is a type of tree
where each node can have at most 2 children.
Perfect or Full binary tree
Full binary tree is a type of
binary tree
where each parent node has either two or no children
Or
A full binary tree contains the maximum number of nodes for the height of that tree i.e. adding another node will increase the height of the tree
Keeping in mind that the height of a binary tree is equal to 2h+1-1, a complete binary with
height h=3
should contain 7
nodes. (see example B below)
In the following examples, tree A
& B
are full binary trees but tree C
& D
are not (find out why?)
Complete binary tree
A complete binary tree is a binary tree in which every level, except possibly the last, is completely filled, and all nodes in the last level are as far left as possible - wikipedia
A complete binary tree needs to satisfy these two properties
- A complete binary tree is a
full binary tree
up to the levelh-1
- On last level it can have between
1
to2h
nodes but nodes are filled fromleft
to right.
In the following examples A
& B
are complete binary trees because they satisfy the above two properties.
C
& D
are not complete binary trees because C
does not satisfy the 2nd property while D
doesn’t satisfy any of the above-mentioned properties.
Height of a complete binary tree will always be minimum that is
log n
Array representation of binary tree
If a binary tree is represented by array then for any node present at index i
, it’s left child will be at (i * 2) + 1
, right will be at (i * 2) + 2
and prent will be at ⌊(i - 1) / 2⌋
.
node = i
left-node = (i * 2) + 1
right-node = (i * 2) + 2
parent-node = ⌊(i - 1) / 2⌋ // floor value
Example 1.
When a
complete binary tree
is represented by an array (as in this example) then it willalways
have consecutivenon-null
values on the left side of the array but it can havenulls
at the end of array
Example 2.
Note that even the left nodes of
2
and7
are not in the tree but still there should be an index reserved for these nodes. For example, if3
(right child of2
) is stored at index4
then it will become left child of node7
which is wrong.
Heap
Heap is a complete binary tree
in which elements can be arranged in two ways
- If elements are arranged in such a way that every node has a value
greater
than all of its descendants then it’s calledmax-heap
In max heap, root element will always have the
maximum
value - If elements are arranged in such a way that every node has a value
smaller
than all of its descendants then it’s calledmin-heap
In min heap, root element will always have the
smallest
value
- Max-heap example
- Min-heap example
Heap representation in code
Heap can be best implemented using array representation of binary tree which we have covered in this article. For the rest of this article, we will be using array representation of binary tree to represent, insert, delete items in heap (see example below).
Heap insertion (max-heap)
Keeping heap properties in mind, there are two conditions to be fulfilled when adding a new element to heap.
- After adding the new element, heap should still be a
complete binary tree
- After inserting new element, it should satisfy the condition that either all descendants of a node are
smaller
orgreater
than current value of that node (depending on the type of heap max or min respectively).
There are two steps for adding new element to heap.
-
As we have seen previously, when a
complete binary tree
is represented by an array then all the elements on theleft
will have consecutivenon-null
values, it means adding a new element at theend
of the array will still keep thetree
incomplete binary tree
shape. So, in step one, we add the new element to theend
of the array. -
After adding the element, traverse the tree backward to the root from new element and swap the values if the parent value is less than current value of the node in case of
max-heap
and vice versa formin-heap
. Stop swapping values if there are no values to swap or when you reach the root of the tree.
We will add a new element (
20
) to the heap in the above example
Step 1 - adding new element to heap
Changes are highlighted red. In first step, we added 20
to the end of the array i.e. in the start of next level in binary tree.
Step 2 - adjusting the tree to heap
In step 2, we will traverse the tree back to root from newly inserted element and will swap value as required i.e. pusing smaller values down to tree in case of max-heap
- Repeat the step, compare
20
to parent11
- Repeat the step, compare
20
to parent13
That’s it. At this point, 20
is at the root of tree and no more swaps are required.
Verify now, the node with maximum value is at the root of the tree (max-heap) and the tree is also a
complete binary tree
.
Deleting element from heap
In a heap, only root element can be removed and based on the type of heap, maximum or minimum value element is deleted.
As with insertion, keep in mind that after deletion the heap properties should remain intact i.e. the binary tree should be complete and every node will have greater value than all of its descendants in case of max heap and smaller value for min heap.
The deletion can be performed in two steps
-
Get the value of root node and then move the last element of the array to root. This will make sure that
complete binary tree
property of the heap will remain intact. -
From the root element, traverse the tree to the bottom. At every step, compare the child node values and pick the greater value, if the value picked is greater than current node value (in case of max heap) the swap it. Repeat the steps until there are not values to swap.
Heap deletion example
From our current example of heap, we will remove the newly inserted value 20
Step 1 - replace the root with last element of array
Step 2 - traverse the tree from top to bottom to bring greater values to the top (max-heap)
- Compare child nodes of
1
to pick the greater value and replace1
with it (13
in this case). Formin-heap
, pick the minimum value
- Repeat the steps
- In next step,
3
and0
are compared and3
is picked but11
is already greater than3
so we stop swapping here.
Now, verify the heap properties
Time complexity of heap insertion and deletion
As heap is represented by a complete binary tree and max height of a complete
binary tree will always be log (n)
, insertion and deletion from heap will take log (n)
time.
For example, in case of max-heap
insertion, if an element with maximum
value is inserted at the bottom of tree then it will take log (n)
(equal to height of tree) swaps to move it to the root. Same can be concluded for deletion, if an element is removed from the top then maximum log (n)
swaps can be made before tree is adjusted to heap.
In best case, if an element is removed from max-heap
and next element is the 2nd largest then we don’t need to make any swaps, in this case
time complexity will be O(1)
. Same conclusion can be derived from min-heap
Operation | Best Case | Worst Case |
---|---|---|
Insertion | O(1) | log(n) |
Max/Min value | O(1) | log(n) |
Implementation of heap in kotlin
- click
+
to see full implementation
import java.util.LinkedList
//sampleStart
fun main() {
fun addItemsToHeap(heap: Heap<Int>) {
heap.apply {
insert(10)
insert(20)
insert(40)
insert(3)
insert(100)
}
}
val maxHeap = Heap<Int>(Comparator { o1: Int, o2: Int ->
o1 - o2 // normal order, if o1 is smaller, return -ive
})
addItemsToHeap(maxHeap)
println("maxHeap: $maxHeap")
println("delete from maxHeap: ${maxHeap.delete()}")
println("after deletion: $maxHeap")
val minHeap = Heap<Int>(Comparator { o1: Int, o2: Int ->
o2 - o1 // reverse order, if o2 is smaller, return -ive
})
println()
addItemsToHeap(minHeap)
println("minHeap: $minHeap")
println("delete from minHeap: ${minHeap.delete()}")
println("after deletion: $minHeap")
}
//sampleEnd
/**
* Heap data structure, based on the comparator it can be min or max heap
*/
class Heap<T>(private val comparator: Comparator<T>) {
private val tree = LinkedList<T>()
fun insert(value: T) {
// add new value to the end of tree list
tree.add(value)
// current index of the element from where we start traversing backwards
// element was inserted at the end so this is last element index
var currentIndex = tree.size.dec()
// parent of current index node
var parentIndex = getParentIndex(currentIndex)
// loop until we reach the root node, parent of root is -1
while (parentIndex != -1) {
// parent value
val parent = tree[parentIndex]
// current value
val current = tree[currentIndex]
val comparison = comparator.compare(parent, current)
// if parent is less than current value, swap!
if (comparison < 0) {
swap(parentIndex, currentIndex)
} else {
// otherwise, no swaps are required. exit the loop
break
}
// set currentIndex to parent index
currentIndex = parentIndex
// set parent index of current index
parentIndex = getParentIndex(currentIndex)
}
}
/**
* Delete the root node (max or min element in the heap) and return it.
*/
fun delete(): T? {
if (tree.isEmpty()) {
return null // nothing to delete
}
val root = tree[getRootIndex()]
val lastElement = tree.removeLast()
if (tree.isEmpty()) {
return root
}
tree[getRootIndex()] = lastElement
// pull greater value to top (this can be minimum value based on comparator)
// starting node
var startParentIndex = getRootIndex()
// left child index of starting node
var leftChildIndex = getLeftIndex(startParentIndex)
// right child index of starting node
var rightChildIndex = getRightIndex(startParentIndex)
while (true) {
val leftChild = tree.getOrNull(leftChildIndex)
val rightChild = tree.getOrNull(rightChildIndex)
var newParentIndex = -1
// if there are no more nodes to explore, quit!
if (leftChild == null && rightChild == null) {
break
}
// if one of child nodes is null, then it will be right node b/c
// heap is represented by complete binary tree. and it will never happen
// in a complete binary tree that left node is null while there is a right node.
if (rightChild == null) {
if (comparator.compare(tree[startParentIndex], leftChild) < 0) {
// set the parent index for next iteration and swap values
newParentIndex = leftChildIndex
swap(startParentIndex, leftChildIndex)
}
} else {
// both nodes are present, compare them and pick the index for greater value node
val greaterChildNodeIndex = if (comparator.compare(leftChild, rightChild) < 0) {
rightChildIndex
} else {
leftChildIndex
}
// compare greater value node with current start node and swap values if
// child node is greater than parent
if (comparator.compare(tree[startParentIndex], tree[greaterChildNodeIndex]) < 0) {
// set the parent index for next iteration and swap values
newParentIndex = greaterChildNodeIndex
swap(startParentIndex, greaterChildNodeIndex)
}
}
if (newParentIndex != -1) {
// new parent index was set it means there was a swap
// continue with child nodes
startParentIndex = newParentIndex
leftChildIndex = getLeftIndex(startParentIndex)
rightChildIndex = getRightIndex(startParentIndex)
} else {
// there was no parent index set which means no swap happened.
// break the loop
break
}
}
return root
}
/**
* Return the top most element of heap - this is the element that will be deleted
* if delete is invoked
*/
fun peek(): T? = tree.getOrNull(getRootIndex())
/**
* @param node tree node
* @return parent index of the given node
*/
private fun getParentIndex(node: Int): Int = node.minus(1).floorDiv(2)
/**
* @param node tree node
* @return left child index of the given node
*/
private fun getLeftIndex(node: Int): Int = node.times(2).plus(1)
/**
* @param node tree node
* @return right child index of the given node
*/
private fun getRightIndex(node: Int): Int = node.times(2).plus(2)
/**
* Root will always be at index 1
*/
private fun getRootIndex(): Int = 0
private fun swap(indexA: Int, indexB: Int) {
val tmp = tree[indexA]
tree[indexA] = tree[indexB]
tree[indexB] = tmp
}
override fun toString(): String {
return tree.toString()
}
}
Priority queue (PriorityQueue)
PriorityQueue<E>
in java/kotlin is the implementation of priority heap. By default, the queued elements are ordered according to their natural ordering but a comparator
can be passed to constrcutor to reverse the sorting order.
import java.util.*
fun main() {
val minHeap = PriorityQueue<Int>() // min heap
minHeap.add(30)
minHeap.add(10)
println("min value in heap: ${minHeap.poll()}") // 10
val maxHeap = PriorityQueue<Int>(Collections.reverseOrder()) // max heap
maxHeap.add(30)
maxHeap.add(10)
println("max value in heap: ${maxHeap.poll()}") // 30
}
Heap sort
In heap sort, all values are stored in a heap and removed one by one. Depending the type of heap (min/max) the removed elements
will be in sorted order. For max
heap, the elements will be sorted in descending order as at any point in time the heap root will contain
the maximum value as its root. Similarly, for min
heap, the elements will be sorted in natural order.
- Ascending order example
import java.util.*
fun main() {
val sortedList = mutableListOf<Int>()
val minHeap = PriorityQueue<Int>()
minHeap.add(30)
minHeap.add(10)
minHeap.add(40)
while (minHeap.isNotEmpty()) {
sortedList.add(minHeap.poll())
}
println(sortedList) // [10, 30, 40]
}
- Descending order example
import java.util.*
fun main() {
val sortedList = mutableListOf<Int>()
val maxHeap = PriorityQueue<Int>(Collections.reverseOrder())
maxHeap.add(30)
maxHeap.add(10)
maxHeap.add(40)
while (maxHeap.isNotEmpty()) {
sortedList.add(maxHeap.poll())
}
println(sortedList) // [40, 30, 10]
}
Time complexity
Heap sort runs in nlog(n)
time. For a heap of n
elments, insertion will take nlog(n)
time while remove the items will also
take nlog(n)
time. So, the total time for heap sort is equal to nlog(n) + nlog(n) = 2nlog(n)
which is O(nlogn)
Heapify
Heapify is the process of creating a heap data structure from a binary tree represented using an array. An existing binary tree can be converted into heap using following steps
- Start traversing the tree from bottom up.
- For each node, compare children and swap the greater value with parent in case of
max
heap andmin
value in case ofmin
heap - If a value was swapped, pick the child from which the value went to parent and repeat the child comparison steps for it until there are no swaps or children left
- Stop the process when root of the tree is reached