Top-K Elements problems are a common category in coding interviews where you need to find the ‘k’ largest or smallest elements from a given collection. Heaps (Priority Queues) are often the most efficient data structure for solving these types of problems.

Core Idea:

The central idea is to maintain a heap of size ‘k’ that stores the ‘k’ elements encountered so far that satisfy the problem’s criteria (e.g., k largest, k smallest, k most frequent).

  1. Min-Heap for Top K Largest: To find the K largest elements, use a Min-Heap.
    • Iterate through the input elements.
    • If the heap size is less than ‘k’, add the current element to the heap.
    • If the heap size is ‘k’ and the current element is larger than the heap’s minimum element (the root of the Min-Heap), remove the root and insert the current element.
    • After iterating through all elements, the Min-Heap will contain the K largest elements.
  2. Max-Heap for Top K Smallest: To find the K smallest elements, use a Max-Heap.
    • Iterate through the input elements.
    • If the heap size is less than ‘k’, add the current element to the heap.
    • If the heap size is ‘k’ and the current element is smaller than the heap’s maximum element (the root of the Max-Heap), remove the root and insert the current element.
    • After iterating through all elements, the Max-Heap will contain the K smallest elements.

Common Scenarios & How it Works:

  1. Kth Largest/Smallest Element in an Array (e.g., LeetCode #215):
    • For the Kth largest, build a Min-Heap of size K. The root will be the Kth largest.
    • For the Kth smallest, build a Max-Heap of size K. The root will be the Kth smallest.
  2. Top K Frequent Elements (e.g., LeetCode #347):
    • First, count the frequency of each element using a hash map.
    • Then, use a Min-Heap to store pairs of (frequency, element), ordered by frequency. Keep the heap size up to K.
  3. Sort Characters By Frequency (e.g., LeetCode #451):
    • Count character frequencies.
    • Use a Max-Heap to store (frequency, character) pairs, ordered by frequency (descending).
  4. K Closest Points to Origin (e.g., LeetCode #973):
    • Calculate the distance of each point from the origin.
    • Use a Max-Heap to store (distance, point) pairs. Keep the K points with the smallest distances. If a new point’s distance is smaller than the max distance in the heap, replace the max.
  5. Find K Pairs with Smallest Sums (e.g., LeetCode #373):
    • This often involves iterating through possible pairs and using a Max-Heap to keep track of the K pairs with the smallest sums encountered so far.
  6. Merge K Sorted Lists (e.g., LeetCode #23):
    • Use a Min-Heap to store one element from each of the K lists. The heap will be ordered by the element values.
    • Repeatedly extract the minimum element from the heap (which is the overall minimum among the current heads of the lists), add it to the result, and add the next element from the same list to the heap.

Advantages:

  • Efficiency: For a collection of N elements, finding Top-K elements using a heap typically takes O(N log K) time, which is more efficient than sorting the entire collection (O(N log N)) if K is much smaller than N.
  • Space Efficiency: Requires O(K) space to store the elements in the heap.
  • Streaming Data: Well-suited for scenarios where elements arrive one by one (streaming data), as the heap can be updated dynamically.

When to Use:

  • When you need to find the ‘k’ largest or smallest elements from a collection without needing to sort the entire collection.
  • Problems involving finding the ‘k’ most frequent or ‘k’ least frequent items.
  • When dealing with ‘k’ closest or ‘k’ furthest items based on some metric.
  • Merging multiple sorted streams or lists.
  • When the input size is very large, and sorting it entirely is impractical.

Example LeetCode Problems:

  1. (#215) Kth Largest Element in an Array
  2. (#347) Top K Frequent Elements
  3. (#973) K Closest Points to Origin
  4. (#23) Merge k Sorted Lists
  5. (#703) Kth Largest Element in a Stream
  6. (#692) Top K Frequent Words
  7. (#451) Sort Characters By Frequency
  8. (#373) Find K Pairs with Smallest Sums
  9. (#1046) Last Stone Weight (Uses a Max-Heap to simulate the process)
  10. (#295) Find Median from Data Stream (Uses two heaps: a Max-Heap for the lower half and a Min-Heap for the upper half)