Open Addressing
A hash table collision resolution technique where elements that hash to an occupied slot are placed in alternative locations through systematic probing.
Open Addressing
Open addressing is a fundamental collision resolution method used in hash tables where, upon encountering a collision, the algorithm searches for the next available empty slot in the table according to a predetermined probing sequence.
Core Principles
The essential characteristic of open addressing is that all elements are stored directly within the hash table array, unlike chaining approaches that use external data structures. This leads to:
- Better memory locality
- More efficient cache utilization
- Simpler memory management
- No additional pointer overhead
Probing Strategies
Linear Probing
The simplest form of open addressing, where the probe sequence is:
h(k,i) = (h'(k) + i) mod m
where:
- h'(k) is the initial hash value
- i is the probe number
- m is the table size
While simple to implement, linear probing suffers from primary clustering, where occupied slots tend to form contiguous blocks.
Quadratic Probing
Uses a quadratic function for the probe sequence:
h(k,i) = (h'(k) + c₁i + c₂i²) mod m
This reduces clustering but may not probe all table positions.
Double Hashing
Uses a secondary hash function to determine the probe step:
h(k,i) = (h₁(k) + i·h₂(k)) mod m
Provides better distribution than linear or quadratic probing.
Performance Characteristics
The performance of open addressing heavily depends on the load factor α:
- Search time: O(1/(1-α)) average case
- Insertion: O(1/(1-α)) average case
- Deletion: O(1/(1-α)) average case
Implementation Considerations
Deletion
Deletion in open addressing requires special handling to maintain the probe sequences. Common approaches include:
- Tombstone - marking slots as deleted rather than empty
- Immediate reorganization of affected elements
Load Factor Management
To maintain efficiency:
- Keep load factor below 0.7
- Dynamic resizing when load factor exceeds threshold
- Implement proper growth strategies
Advantages and Disadvantages
Advantages
- Better cache performance
- No extra memory for linking structures
- Predictable memory usage
Disadvantages
- Performance degrades significantly at high load factors
- Complex deletion handling
- Susceptible to clustering
Applications
Open addressing is particularly well-suited for:
- Memory-constrained environments
- Systems requiring predictable memory usage
- Cache-friendly applications
- Embedded systems where memory allocation must be controlled
Related Techniques
- Robin Hood Hashing - A variation that minimizes probe sequence length variance
- Cuckoo Hashing - A more sophisticated approach using multiple hash functions
- Linear Probing - The fundamental probing technique
The choice between open addressing and alternative collision resolution methods often depends on specific application requirements, including memory constraints, expected load factors, and performance characteristics under the intended usage patterns.