Hash in Java
You can override the hashCode()
method for your class. By default, this is typically implemented by converting the internal address of the object into an integer, but this implementation technique is not required by the Java. There are two basic rules when implementing hashCode()
:
- You must override
hashCode
in every class that overridesequals
. - Equal objects must have equal hash codes.
Collections such as HashMap and HashSet rely on the hash code of an object. The following is the worst possible implementation:
public class Book {
private String title;
private double price;
...
@Override // never do this
public int hashCode() {
return 42;
}
}
In this case, every object hashes to the same bucket, and hash tables degenerate to linked lists.
Hash code
Java has already provided built-in feasible implementations for common data types, and it is recommended using the corresponding static methods. For example, Float.hashCode(3.14f)
.
Furthermore, if all fields in your class have their well-defined hash code implementations, then you can simply use the hash()
method from Objects
, which generates a hash code for a sequence of input values.
@Override
public int hashCode() {
return Objects.hash(title, price);
}
To design an array index, we shall also mark off the sign bit to turn the 32-bit number into a 31-bit non-negative integer. Take the modular hashing for instance:
private int hash(Object x) {
return (x.hashCode() & 0x7fffffff) % m;
}
The implementation in JDK
private int hash(Object x) {
int h = x.hashCode();
h ^= (h >>> 16);
return h & (m - 1);
}
The above is how HashMap
computes the hash in JDK. Suppose m is the power of 2, then modular operation can be replaced by a bit shift operation (h & (m - 1) == h % m
).
A good hash function
In summary, we have three primary requirements in implementing a good hash function for a given data type:
- It should be consistent: equal keys must produce the same hash value
- It should be efficient to compute
- It should uniformly distribute the keys
Satisfying these requirements simultaneously is a job for experts. Java Programmers can assume hashCode()
does the job.