How Java generates hashCode
Below is the source of hashCode() in the JDK 8 Object base class. hashCode is a native method, so the part worth reading here is its Javadoc comment.
/**
* Returns a hash code value for the object. This method is
* supported for the benefit of hash tables such as those provided by
* {@link java.util.HashMap}.
* <p>
* The general contract of {@code hashCode} is:
* <ul>
* <li>Whenever it is invoked on the same object more than once during
* an execution of a Java application, the {@code hashCode} method
* must consistently return the same integer, provided no information
* used in {@code equals} comparisons on the object is modified.
* This integer need not remain consistent from one execution of an
* application to another execution of the same application.
* <li>If two objects are equal according to the {@code equals(Object)}
* method, then calling the {@code hashCode} method on each of
* the two objects must produce the same integer result.
* <li>It is <em>not</em> required that if two objects are unequal
* according to the {@link java.lang.Object#equals(java.lang.Object)}
* method, then calling the {@code hashCode} method on each of the
* two objects must produce distinct integer results. However, the
* programmer should be aware that producing distinct integer results
* for unequal objects may improve the performance of hash tables.
* </ul>
* <p>
* As much as is reasonably practical, the hashCode method defined by
* class {@code Object} does return distinct integers for distinct
* objects. (This is typically implemented by converting the internal
* address of the object into an integer, but this implementation
* technique is not required by the
* Java™ programming language.)
*
* @return a hash code value for this object.
* @see java.lang.Object#equals(java.lang.Object)
* @see java.lang.System#identityHashCode
*/
public native int hashCode();
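hashCode(), which is defined in the Object base class, returns a hash value for the object. The Javadoc says this value is typically derived from the object's internal address, but it does not require that. Hash-based collections such as HashMap, Hashtable, and HashSet use this value.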
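The comment places the following requirements (the contract) on hashCode():
- Within a single Java process, calling the method repeatedly on the same object must return the same value, provided that no information used in equals() comparisons has been modified.
- If two objects are equal according to equals(), their hashCode() values must be the same.
- If two objects are not equal according to equals(), their hashCode() values are not required to differ.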
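The three requirements above can be illustrated with a minimal sketch. The Point class is a hypothetical example, not something from the JDK. It builds hashCode() from the same fields that equals() compares, so equal objects always hash identically.

```java
import java.util.Objects;

// Hypothetical value class, used only to illustrate the hashCode contract.
final class Point {
    private final int x;
    private final int y;

    Point(int x, int y) {
        this.x = x;
        this.y = y;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof Point)) return false;
        Point other = (Point) o;
        return x == other.x && y == other.y;
    }

    // Built from exactly the fields that equals() compares.
    @Override
    public int hashCode() {
        return Objects.hash(x, y);
    }
}

class ContractDemo {
    public static void main(String[] args) {
        Point a = new Point(1, 2);
        Point b = new Point(1, 2);
        // Requirement 1: repeated calls on the same object agree.
        System.out.println(a.hashCode() == a.hashCode());
        // Requirement 2: equal objects have equal hash codes.
        System.out.println(a.equals(b) && a.hashCode() == b.hashCode());
        // Requirement 3: unequal objects may still collide ("Aa" and "BB" both hash to 2112).
        System.out.println(!"Aa".equals("BB") && "Aa".hashCode() == "BB".hashCode());
    }
}
```

All three lines print true. The last one shows that a collision between unequal objects is legal, only potentially slower for hash tables.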
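A few questions
If the value returned by hashCode is related to the object's memory address, and a garbage collection (GC) in the JVM can move the object to a different address, does the return value of hashCode change?
No.
The first time hashCode() is called, the JVM computes an int value (possibly from the memory address, depending on the generation strategy) and stores it in the object header. Later calls simply read the stored value. This applies only when hashCode has not been overridden. An overridden hashCode is recomputed on every call, which is why an overriding implementation must make sure its return value does not change arbitrarily.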
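A small sketch of the behaviour described above: it reads the identity hash of an object, requests a GC, and reads the hash again. The class and method names are hypothetical. System.gc() is only a request, so the object may or may not actually move, but the hash must stay the same either way.

```java
class IdentityHashDemo {
    // Returns true if the identity hash of a fresh object is unchanged after a GC request.
    static boolean stableAcrossGc() {
        Object o = new Object();
        int before = System.identityHashCode(o);
        System.gc(); // only a hint; the JVM decides whether to collect or move anything
        int after = System.identityHashCode(o);
        return before == after;
    }

    public static void main(String[] args) {
        Object o = new Object();
        // For a class that does not override hashCode(), both calls return the same value.
        System.out.println(o.hashCode() == System.identityHashCode(o));
        System.out.println(stableAcrossGc());
    }
}
```

System.identityHashCode returns the value Object.hashCode() would give even when a class has overridden hashCode(), which makes it convenient for observing the cached value.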
The figure below is taken from Zhou Zhiming's book 《深入理解Java虚拟机》 (Understanding the Java Virtual Machine).
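Why equals and hashCode must be overridden together
The main reason is how collection classes such as HashMap use the two methods: hashCode() locates the bucket, and equals() confirms the match inside it. If the two methods are overridden carelessly, these collections may stop working correctly. Any override of hashCode() and equals() must strictly follow the contract described in the source comment.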
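To make the failure concrete, here is a minimal sketch with two hypothetical key classes. BadKey overrides equals() only, so two equal keys almost certainly land in different buckets. GoodKey overrides both methods.

```java
import java.util.HashMap;
import java.util.Map;

class KeyContractDemo {
    // Hypothetical key that overrides equals() but NOT hashCode().
    static final class BadKey {
        final String id;
        BadKey(String id) { this.id = id; }

        @Override
        public boolean equals(Object o) {
            return o instanceof BadKey && ((BadKey) o).id.equals(id);
        }
        // hashCode() is inherited from Object, so equal keys usually hash differently.
    }

    // Hypothetical key that overrides both methods consistently.
    static final class GoodKey {
        final String id;
        GoodKey(String id) { this.id = id; }

        @Override
        public boolean equals(Object o) {
            return o instanceof GoodKey && ((GoodKey) o).id.equals(id);
        }

        @Override
        public int hashCode() {
            return id.hashCode();
        }
    }

    public static void main(String[] args) {
        Map<BadKey, String> bad = new HashMap<>();
        bad.put(new BadKey("k"), "v");
        // Almost always null: the lookup key is equal but has a different identity hash.
        System.out.println(bad.get(new BadKey("k")));

        Map<GoodKey, String> good = new HashMap<>();
        good.put(new GoodKey("k"), "v");
        System.out.println(good.get(new GoodKey("k"))); // finds "v"
    }
}
```

The BadKey lookup is described as "almost always" failing because two distinct objects could, in principle, receive the same identity hash.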
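The hashCode implementation in OpenJDK
Because of the Oracle JDK's licensing, its native methods implemented in C/C++ are compiled into DLL library files that we cannot read directly, so we download OpenJDK and read the source there.
In OpenJDK, the implementation behind hashCode() appears to be in synchronizer.cpp, at hotspot/src/share/vm/runtime/synchronizer.cpp.
Part of the code: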
static inline intptr_t get_next_hash(Thread * Self, oop obj) {
  intptr_t value = 0 ;
  if (hashCode == 0) {
     // This form uses an unguarded global Park-Miller RNG,
     // so it's possible for two threads to race and generate the same RNG.
     // On MP system we'll have lots of RW access to a global, so the
     // mechanism induces lots of coherency traffic.
     value = os::random() ;
  } else
  if (hashCode == 1) {
     // This variation has the property of being stable (idempotent)
     // between STW operations. This can be useful in some of the 1-0
     // synchronization schemes.
     intptr_t addrBits = cast_from_oop<intptr_t>(obj) >> 3 ;
     value = addrBits ^ (addrBits >> 5) ^ GVars.stwRandom ;
  } else
  if (hashCode == 2) {
     value = 1 ;            // for sensitivity testing
  } else
  if (hashCode == 3) {
     value = ++GVars.hcSequence ;
  } else
  if (hashCode == 4) {
     value = cast_from_oop<intptr_t>(obj) ;
  } else {
     // Marsaglia's xor-shift scheme with thread-specific state
     // This is probably the best overall implementation -- we'll
     // likely make this the default in future releases.
     unsigned t = Self->_hashStateX ;
     t ^= (t << 11) ;
     Self->_hashStateX = Self->_hashStateY ;
     Self->_hashStateY = Self->_hashStateZ ;
     Self->_hashStateZ = Self->_hashStateW ;
     unsigned v = Self->_hashStateW ;
     v = (v ^ (v >> 19)) ^ (t ^ (t >> 8)) ;
     Self->_hashStateW = v ;
     value = v ;
  }

  value &= markOopDesc::hash_mask;
  if (value == 0) value = 0xBAD ;
  assert (value != markOopDesc::no_hash, "invariant") ;
  TEVENT (hashCode: GENERATE) ;
  return value;
}
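For readers more comfortable with Java than C++, the final branch (the Marsaglia xor-shift scheme) can be sketched as below. This is an illustrative port, not HotSpot code. The class name, the seed constants, and the 31-bit HASH_MASK are assumptions made for the example. The C++ `unsigned` arithmetic is reproduced with Java int and the unsigned shift `>>>`.

```java
class XorShiftHashSketch {
    // Generator state, mirroring _hashStateX.._hashStateW in the C++ code.
    // The constants below are illustrative seeds chosen for this sketch.
    private int x;
    private int y = 842502087;
    private int z = 0x8767;
    private int w = 273326509;

    // Assumption: 31 usable hash bits, standing in for markOopDesc::hash_mask.
    private static final int HASH_MASK = 0x7FFFFFFF;

    XorShiftHashSketch(int seed) {
        this.x = seed;
    }

    // One step of the xor-shift generator, followed by the same masking
    // and zero check that get_next_hash applies at the end.
    int nextHash() {
        int t = x;
        t ^= (t << 11);
        x = y;
        y = z;
        z = w;
        int v = w;
        v = (v ^ (v >>> 19)) ^ (t ^ (t >>> 8));
        w = v;
        int value = v & HASH_MASK;
        if (value == 0) value = 0xBAD; // zero is avoided, as in the original code
        return value;
    }

    public static void main(String[] args) {
        XorShiftHashSketch gen = new XorShiftHashSketch((int) System.nanoTime());
        for (int i = 0; i < 5; i++) {
            System.out.println(Integer.toHexString(gen.nextHash()));
        }
    }
}
```

In HotSpot the state lives in the Thread object, so each thread advances its own generator without any shared writes. One instance of this class per thread (for example held in a ThreadLocal) would play the same role.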
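When starting a Java process, the -XX:hashCode= option selects the strategy used to generate the value returned by hashCode. In this version the default is the last branch. The default strategy is normally what you want; readers who are interested can download the OpenJDK source and read it for themselves.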
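A small probe program makes the effect of the option visible. The class name HashStrategyProbe is hypothetical; it only allocates a few objects and prints their identity hash codes.

```java
class HashStrategyProbe {
    // Collects the identity hash codes of n freshly allocated objects.
    static int[] sample(int n) {
        int[] out = new int[n];
        for (int i = 0; i < n; i++) {
            out[i] = System.identityHashCode(new Object());
        }
        return out;
    }

    public static void main(String[] args) {
        for (int h : sample(5)) {
            System.out.println(h);
        }
    }
}
```

Going by the source above, starting it as `java -XX:hashCode=2 HashStrategyProbe` should print 1 for every object, and `-XX:hashCode=3` should print an increasing sequence. On some later JDK releases the flag may be treated as experimental and additionally need `-XX:+UnlockExperimentalVMOptions`; that is an assumption to check against the JDK in use.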
Go to the OpenJDK website