Before learning about a hash table, it helps to first see a problem that programmers run into all the time. Once you feel the pain of this problem, you will understand exactly why hash tables were invented.
When we write programs, we often need to connect two pieces of information together. For example, think about a classroom. Every student has:
Ram or Neha — this is a piece of text (a *string*).14551 — this is a whole number (an *integer*).We want to link each name to its roll number:
Ram → 14551Neha → 14555Prakhar → 14567Shikhar → 14564This kind of "this goes with that" connection is called a mapping. The name is the thing we look up — we call it the key. The roll number is the answer we get back — we call it the value.
So the real question is: *how do we store these key–value pairs in our program?*
An *array* is just a row of boxes that hold data, where each box has a number called an index (0, 1, 2, 3, and so on).
One natural idea is to use two arrays side by side:
names array that holds the strings.roll num array that holds the integers.We line them up so that the same index in both arrays belongs to the same student:
index: 0 1 2 3
names: Ram Neha Prakhar Shikhar
roll num: 14551 14555 14567 14564
Because Shikhar sits at index 3 in names, his roll number also sits at index 3 in roll num, which is 14564. The matching index is the glue that holds the pair together.
This is easy to set up. But the real test of any storage idea is: *how easily can we get our data back out?*
Suppose someone asks: "What is Shikhar's roll number?"
We know his name, but we do not know his index. So the computer has no choice but to start at the beginning and check the names one by one:
0 → Ram. Not Shikhar. Keep going.1 → Neha. Not Shikhar. Keep going.2 → Prakhar. Not Shikhar. Keep going.3 → Shikhar. Found him! His index is 3.roll num[3] → the answer is 14564.Checking every box from the start until you find the right one is called a linear scan (or *linear search*). It works, but notice the problem: to find one student, we may have to look at *all* of them.
With only 4 students this feels fine. But imagine the roll numbers of every student, in every class, in every school of a whole city — millions of names. Searching one by one through millions of boxes every single time is painfully slow. This is the heart of the problem.
Storing the mapping in two arrays is intuitive, but it has two serious limitations:
N items, the work grows in step with N. Double the students, and the search can take twice as long.We could fix the second problem by using a dynamic array (one that can grow as needed). But that does *nothing* for the first problem — searching by name would still be a slow linear scan.
So we are left wishing for something better. What we really want is a data structure that lets us jump straight to a student's roll number from their name — instantly, without scanning — and that can also grow as more data arrives.
That magical-sounding data structure exists. It is called a hash table, and solving this exact problem is what it was built for.