Before learning what a *heap* is, it helps to first feel the problem that a heap was invented to solve. Once you see the pain, the solution makes a lot more sense.
Sometimes a program needs a data store (a place to keep data) that is smart in one specific way: at every moment, it should know the smallest or largest item it is holding. And it should do two things very fast:
If both of these are quick even when there are millions of items, life is good.
Imagine an emergency ward with only a few doctors but many patients. Each patient is given a priority number based on how serious their case is. A bigger number means a more urgent patient.
In our example the waiting patients have priorities:
15, 10, 22, 12, 50, 55
Here is the rule:
So when doc1 becomes free, the patient with priority 55 should be picked. Then when another doctor is free, 50 is next, and so on. If a new patient with priority 60 walks in, they should jump ahead of everyone, because 60 is now the largest.
This "always serve the most urgent first" idea is called a priority queue. It is a queue where position depends on priority, not on who arrived first.
A natural first idea is a linked list. A linked list is a chain of boxes called *nodes*. Each node holds one patient's data (name and priority) and a next pointer that points to the following node. The very first node is reached through a pointer called head, and the last node's next points to null (meaning "end of the chain").
So our patients become:
head -> 15 -> 10 -> 22 -> 12 -> 50 -> 55 -> null
When a new patient with priority 60 arrives, we do not need to find a special spot. We just stick the new node at the front (the head) of the list and point it to the old head:
head -> 60 -> 15 -> 10 -> 22 -> 12 -> 50 -> 55 -> null
This takes the same tiny amount of time no matter how long the list is. We call this a constant-time operation (often written as O(1)) — the cost does not grow with the number of patients. So far, so good.
Now a doctor becomes free and we must pull out the patient with the highest priority. The problem is that the list is not sorted. The biggest value could be anywhere.
So we have to walk through the whole chain, one node at a time, keeping track of the biggest value seen so far. Step by step, using a current pointer and a max variable:
15. So far max = 15.10. Still max = 15.22. Now max = 22.12? Still max = 22.50? Now max = 50.55? Now max = 55.null. The walk is over, max = 55.Only after touching every single node do we know the answer is 55. Then we remove that node. This full walk through the list is called a linear scan, and its cost grows with the number of patients (written as O(n), where n is how many items there are).
For a handful of patients, a linear scan is fine. But the operation we do most often — find the max and take it out — is also the slow one. Now scale the idea up. Picture a country-wide emergency service like 911, with hundreds of thousands of active cases. Scanning the whole list every single time a responder frees up would be painfully slow.
The textbook sums up two clear weaknesses of the linked-list approach:
next (and sometimes previous) pointers just to stay linked together.We want a data structure that gives us the best of both worlds:
A plain linked list gives us fast insert but slow extract. Keeping the list fully sorted would give fast extract but slow insert. Neither is great. What we really want is a structure that stays *just organized enough* to always know its top item, without paying to keep everything perfectly sorted.
That "magical" structure exists. It is called a heap, and it is exactly what the rest of this course will build. For now, the key takeaway is the problem itself: we need fast inserts and fast access to the largest (or smallest) element at the same time.