Heap data structure, heap sort , heapify and priority queues

Binary Tree

Binary tree is a type of tree where each node can have at most 2 children.

Perfect or Full binary tree

Full binary tree is a type of binary tree where each parent node has either two or no children

Or

A full binary tree contains the maximum number of nodes for the height of that tree i.e. adding another node will increase the height of the tree

Keeping in mind that the height of a binary tree is equal to 2h+1-1, a complete binary with height h=3 should contain 7 nodes. (see example B below)

In the following examples, tree A & Bare full binary trees but tree C & D are not (find out why?)

Complete binary tree

A complete binary tree is a binary tree in which every level, except possibly the last, is completely filled, and all nodes in the last level are as far left as possible - wikipedia

A complete binary tree needs to satisfy these two properties

  1. A complete binary tree is a full binary tree up to the level h-1
  2. On last level it can have between 1 to 2h nodes but nodes are filled from left to right.

In the following examples A & B are complete binary trees because they satisfy the above two properties.

C & D are not complete binary trees because C does not satisfy the 2nd property while D doesn’t satisfy any of the above-mentioned properties.

Height of a complete binary tree will always be minimum that is log n

Array representation of binary tree

If a binary tree is represented by array then for any node present at index i, it’s left child will be at (i * 2) + 1, right will be at (i * 2) + 2 and prent will be at ⌊(i - 1) / 2⌋.

    node = i
    left-node = (i * 2) + 1
    right-node = (i * 2) + 2
    parent-node = ⌊(i - 1) / 2⌋ // floor value

Example 1.

When a complete binary tree is represented by an array (as in this example) then it will always have consecutive non-null values on the left side of the array but it can have nulls at the end of array

Example 2.

Note that even the left nodes of 2 and 7 are not in the tree but still there should be an index reserved for these nodes. For example, if 3 (right child of 2) is stored at index 4 then it will become left child of node 7 which is wrong.

Heap

Heap is a complete binary tree in which elements can be arranged in two ways

  1. If elements are arranged in such a way that every node has a value greater than all of its descendants then it’s called max-heap

    In max heap, root element will always have the maximum value

  2. If elements are arranged in such a way that every node has a value smaller than all of its descendants then it’s called min-heap

    In min heap, root element will always have the smallest value

  • Max-heap example
  • Min-heap example

Heap representation in code

Heap can be best implemented using array representation of binary tree which we have covered in this article. For the rest of this article, we will be using array representation of binary tree to represent, insert, delete items in heap (see example below).

Heap insertion (max-heap)

Keeping heap properties in mind, there are two conditions to be fulfilled when adding a new element to heap.

  1. After adding the new element, heap should still be a complete binary tree
  2. After inserting new element, it should satisfy the condition that either all descendants of a node are smaller or greater than current value of that node (depending on the type of heap max or min respectively).

There are two steps for adding new element to heap.

  1. As we have seen previously, when a complete binary tree is represented by an array then all the elements on the left will have consecutive non-null values, it means adding a new element at the end of the array will still keep the tree in complete binary tree shape. So, in step one, we add the new element to the end of the array.

  2. After adding the element, traverse the tree backward to the root from new element and swap the values if the parent value is less than current value of the node in case of max-heap and vice versa for min-heap. Stop swapping values if there are no values to swap or when you reach the root of the tree.

We will add a new element (20) to the heap in the above example

Step 1 - adding new element to heap

Changes are highlighted red. In first step, we added 20 to the end of the array i.e. in the start of next level in binary tree.

Step 2 - adjusting the tree to heap

In step 2, we will traverse the tree back to root from newly inserted element and will swap value as required i.e. pusing smaller values down to tree in case of max-heap

  • Repeat the step, compare 20 to parent 11
  • Repeat the step, compare 20 to parent 13

That’s it. At this point, 20 is at the root of tree and no more swaps are required.

Verify now, the node with maximum value is at the root of the tree (max-heap) and the tree is also a complete binary tree.

Deleting element from heap

In a heap, only root element can be removed and based on the type of heap, maximum or minimum value element is deleted.

As with insertion, keep in mind that after deletion the heap properties should remain intact i.e. the binary tree should be complete and every node will have greater value than all of its descendants in case of max heap and smaller value for min heap.

The deletion can be performed in two steps

  1. Get the value of root node and then move the last element of the array to root. This will make sure that complete binary tree property of the heap will remain intact.

  2. From the root element, traverse the tree to the bottom. At every step, compare the child node values and pick the greater value, if the value picked is greater than current node value (in case of max heap) the swap it. Repeat the steps until there are not values to swap.

Heap deletion example

From our current example of heap, we will remove the newly inserted value 20

Step 1 - replace the root with last element of array

Step 2 - traverse the tree from top to bottom to bring greater values to the top (max-heap)

  • Compare child nodes of 1 to pick the greater value and replace 1 with it (13 in this case). For min-heap, pick the minimum value
  • Repeat the steps
  • In next step, 3 and 0 are compared and 3 is picked but 11 is already greater than 3 so we stop swapping here.

Now, verify the heap properties

Time complexity of heap insertion and deletion

As heap is represented by a complete binary tree and max height of a complete binary tree will always be log (n), insertion and deletion from heap will take log (n) time.

For example, in case of max-heap insertion, if an element with maximum value is inserted at the bottom of tree then it will take log (n) (equal to height of tree) swaps to move it to the root. Same can be concluded for deletion, if an element is removed from the top then maximum log (n) swaps can be made before tree is adjusted to heap.

In best case, if an element is removed from max-heap and next element is the 2nd largest then we don’t need to make any swaps, in this case time complexity will be O(1). Same conclusion can be derived from min-heap

Operation Best Case Worst Case
Insertion O(1) log(n)
Max/Min value O(1) log(n)

Implementation of heap in kotlin

  • click + to see full implementation
import java.util.LinkedList

 //sampleStart
fun main() {

    fun addItemsToHeap(heap: Heap<Int>) {
        heap.apply {
            insert(10)
            insert(20)
            insert(40)
            insert(3)
            insert(100)
        }
    }

    val maxHeap = Heap<Int>(Comparator { o1: Int, o2: Int ->
        o1 - o2 // normal order, if o1 is smaller, return -ive
    })

    addItemsToHeap(maxHeap)
    println("maxHeap: $maxHeap")
    println("delete from maxHeap: ${maxHeap.delete()}")
    println("after deletion: $maxHeap")

    val minHeap = Heap<Int>(Comparator { o1: Int, o2: Int ->
        o2 - o1 // reverse order, if o2 is smaller, return -ive
    })

    println()
    addItemsToHeap(minHeap)
    println("minHeap: $minHeap")
    println("delete from minHeap: ${minHeap.delete()}")
    println("after deletion: $minHeap")
}
//sampleEnd



/**
 * Heap data structure, based on the comparator it can be min or max heap
 */
class Heap<T>(private val comparator: Comparator<T>) {
    private val tree = LinkedList<T>()

    fun insert(value: T) {
        // add new value to the end of tree list
        tree.add(value)

        // current index of the element from where we start traversing backwards
        // element was inserted at the end so this is last element index
        var currentIndex = tree.size.dec()

        // parent of current index node
        var parentIndex = getParentIndex(currentIndex)

        // loop until we reach the root node, parent of root is -1
        while (parentIndex != -1) {
            // parent value
            val parent = tree[parentIndex]
            // current value
            val current = tree[currentIndex]

            val comparison = comparator.compare(parent, current)

            // if parent is less than current value, swap!
            if (comparison < 0) {
                swap(parentIndex, currentIndex)
            } else {
                // otherwise, no swaps are required. exit the loop
                break
            }

            // set currentIndex to parent index
            currentIndex = parentIndex
            // set parent index of current index
            parentIndex = getParentIndex(currentIndex)
        }
    }

    /**
     * Delete the root node (max or min element in the heap) and return it.
     */
    fun delete(): T? {
        if (tree.isEmpty()) {
            return null // nothing to delete
        }
        val root = tree[getRootIndex()]

        val lastElement = tree.removeLast()

        if (tree.isEmpty()) {
            return root
        }

        tree[getRootIndex()] = lastElement

        // pull greater value to top (this can be minimum value based on comparator)

        // starting node
        var startParentIndex = getRootIndex()

        // left child index of starting node
        var leftChildIndex = getLeftIndex(startParentIndex)

        // right child index of starting node
        var rightChildIndex = getRightIndex(startParentIndex)

        while (true) {

            val leftChild = tree.getOrNull(leftChildIndex)
            val rightChild = tree.getOrNull(rightChildIndex)
            var newParentIndex = -1


            // if there are no more nodes to explore, quit!
            if (leftChild == null && rightChild == null) {
                break
            }

            // if one of child nodes is null, then it will be right node b/c
            // heap is represented by complete binary tree. and it will never happen
            // in a complete binary tree that left node is null while there is a right node.
            if (rightChild == null) {
                if (comparator.compare(tree[startParentIndex], leftChild) < 0) {
                    // set the parent index for next iteration and swap values
                    newParentIndex = leftChildIndex
                    swap(startParentIndex, leftChildIndex)
                }
            } else {
                // both nodes are present, compare them and pick the index for greater value node
                val greaterChildNodeIndex = if (comparator.compare(leftChild, rightChild) < 0) {
                    rightChildIndex
                } else {
                    leftChildIndex
                }

                // compare greater value node with current start node and swap values if
                // child node is greater than parent
                if (comparator.compare(tree[startParentIndex], tree[greaterChildNodeIndex]) < 0) {
                    // set the parent index for next iteration and swap values
                    newParentIndex = greaterChildNodeIndex
                    swap(startParentIndex, greaterChildNodeIndex)
                }
            }

            if (newParentIndex != -1) {
                // new parent index was set it means there was a swap
                // continue with child nodes
                startParentIndex = newParentIndex
                leftChildIndex = getLeftIndex(startParentIndex)
                rightChildIndex = getRightIndex(startParentIndex)
            } else {
                // there was no parent index set which means no swap happened.
                // break the loop
                break
            }
        }
        return root
    }

    /**
     * Return the top most element of heap - this is the element that will be deleted
     * if delete is invoked
     */
    fun peek(): T? = tree.getOrNull(getRootIndex())

    /**
     * @param node tree node
     * @return parent index of the given node
     */
    private fun getParentIndex(node: Int): Int = node.minus(1).floorDiv(2)

    /**
     * @param node tree node
     * @return left child index of the given node
     */
    private fun getLeftIndex(node: Int): Int = node.times(2).plus(1)

    /**
     * @param node tree node
     * @return right child index of the given node
     */
    private fun getRightIndex(node: Int): Int = node.times(2).plus(2)

    /**
     * Root will always be at index 1
     */
    private fun getRootIndex(): Int = 0

    private fun swap(indexA: Int, indexB: Int) {
        val tmp = tree[indexA]
        tree[indexA] = tree[indexB]
        tree[indexB] = tmp
    }

    override fun toString(): String {
        return tree.toString()
    }
}

Priority queue (PriorityQueue)

PriorityQueue<E> in java/kotlin is the implementation of priority heap. By default, the queued elements are ordered according to their natural ordering but a comparator can be passed to constrcutor to reverse the sorting order.

import java.util.*

fun main() {
    val minHeap = PriorityQueue<Int>() // min heap
    minHeap.add(30)
    minHeap.add(10)
    println("min value in heap: ${minHeap.poll()}") // 10

    val maxHeap = PriorityQueue<Int>(Collections.reverseOrder()) // max heap
    maxHeap.add(30)
    maxHeap.add(10)
    println("max value in heap: ${maxHeap.poll()}") // 30
}

Heap sort

In heap sort, all values are stored in a heap and removed one by one. Depending the type of heap (min/max) the removed elements will be in sorted order. For max heap, the elements will be sorted in descending order as at any point in time the heap root will contain the maximum value as its root. Similarly, for min heap, the elements will be sorted in natural order.

  • Ascending order example
import java.util.*

fun main() {
    val sortedList = mutableListOf<Int>()
    val minHeap = PriorityQueue<Int>()
    minHeap.add(30)
    minHeap.add(10)
    minHeap.add(40)
    while (minHeap.isNotEmpty()) {
        sortedList.add(minHeap.poll())
    }
    println(sortedList) // [10, 30, 40]
}
  • Descending order example
import java.util.*

fun main() {
    val sortedList = mutableListOf<Int>()
    val maxHeap = PriorityQueue<Int>(Collections.reverseOrder())
    maxHeap.add(30)
    maxHeap.add(10)
    maxHeap.add(40)
    while (maxHeap.isNotEmpty()) {
        sortedList.add(maxHeap.poll())
    }
    println(sortedList) // [40, 30, 10]
}

Time complexity

Heap sort runs in nlog(n) time. For a heap of n elments, insertion will take nlog(n) time while remove the items will also take nlog(n) time. So, the total time for heap sort is equal to nlog(n) + nlog(n) = 2nlog(n) which is O(nlogn)

Heapify

Heapify is the process of creating a heap data structure from a binary tree represented using an array. An existing binary tree can be converted into heap using following steps

  1. Start traversing the tree from bottom up.
  2. For each node, compare children and swap the greater value with parent in case of max heap and min value in case of min heap
  3. If a value was swapped, pick the child from which the value went to parent and repeat the child comparison steps for it until there are no swaps or children left
  4. Stop the process when root of the tree is reached

top