Hash Table

A hash table is a data structure that implements an associative array by using a hash function to compute indices for storing and retrieving key-value pairs with optimal average-case time complexity.

Hash Table

A hash table (also known as a hash map) is a fundamental data structure that enables efficient storage and retrieval of key-value pairs. It works by transforming keys into array indices using a hash function, achieving constant-time O(1) average case performance for insertions, deletions, and lookups.

Core Concepts

Hash Function

The heart of a hash table is its hash function, which:

  • Converts keys of any size into fixed-size integers
  • Distributes values uniformly across the available space
  • Generates the same output consistently for identical inputs
  • Minimizes collision between different inputs

Collision Resolution

When multiple keys hash to the same index, collision resolution becomes necessary. Common strategies include:

  1. Chaining: Each array slot contains a linked list of elements
  2. Open Addressing: Probing for next available slot using methods like:
    • Linear probing
    • Quadratic probing
    • Double hashing

Performance Characteristics

Time Complexity

  • Average case: O(1) for insert, delete, search
  • Worst case: O(n) when many collisions occur
  • Space Complexity: O(n) where n is the number of stored items

Load Factor

The load factor (α) = n/m, where:

  • n = number of stored elements
  • m = size of the hash table
  • Typically maintained below 0.7 for optimal performance

Applications

Hash tables are extensively used in:

Implementation Considerations

Resizing

  • Dynamic Array is needed when load factor exceeds threshold
  • Usually doubles table size when capacity is reached
  • Requires rehashing all existing elements

Key Requirements

Keys must be:

  • Immutable while in the hash table
  • Implement consistent equals and hashCode methods
  • Distribute well across the hash space

Common Operations

INSERT(key, value):
    index = hash(key) % tableSize
    store (key,value) at index (handling collisions)

SEARCH(key):
    index = hash(key) % tableSize
    return value at index (following collision chain if necessary)

DELETE(key):
    index = hash(key) % tableSize
    remove (key,value) at index (handling collision chain)

Trade-offs

Advantages

  • Constant-time average case operations
  • Flexible key types
  • Space-efficient for sparse data

Disadvantages

  • No ordering of keys
  • Performance degrades with poor hash functions
  • Extra space needed for collision handling

Hash tables represent a crucial Algorithm tool in computer science, balancing theoretical elegance with practical utility in countless applications.