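Notes on the hashCode() source in Java

How Java generates a hashCode

The JDK 8 source of hashCode() in the base class Object is shown below. hashCode() is a native method, so the part worth reading here is its Javadoc comment.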

/**
* Returns a hash code value for the object. This method is
* supported for the benefit of hash tables such as those provided by
* {@link java.util.HashMap}.
* <p>
* The general contract of {@code hashCode} is:
* <ul>
* <li>Whenever it is invoked on the same object more than once during
*     an execution of a Java application, the {@code hashCode} method
*     must consistently return the same integer, provided no information
*     used in {@code equals} comparisons on the object is modified.
*     This integer need not remain consistent from one execution of an
*     application to another execution of the same application.
* <li>If two objects are equal according to the {@code equals(Object)}
*     method, then calling the {@code hashCode} method on each of
*     the two objects must produce the same integer result.
* <li>It is <em>not</em> required that if two objects are unequal
*     according to the {@link java.lang.Object#equals(java.lang.Object)}
*     method, then calling the {@code hashCode} method on each of the
*     two objects must produce distinct integer results. However, the
*     programmer should be aware that producing distinct integer results
*     for unequal objects may improve the performance of hash tables.
* </ul>
* <p>
* As much as is reasonably practical, the hashCode method defined by
* class {@code Object} does return distinct integers for distinct
* objects. (This is typically implemented by converting the internal
* address of the object into an integer, but this implementation
* technique is not required by the
* Java&trade; programming language.)
*
* @return a hash code value for this object.
* @see     java.lang.Object#equals(java.lang.Object)
* @see     java.lang.System#identityHashCode
*/
public native int hashCode();

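hashCode() returns a hash value for the object. The Javadoc notes that this is typically derived from the object's internal address, but the language does not require that. The value is used by hash-based collections such as HashMap, Hashtable, and HashSet. The method is defined in the base class Object.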

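The comment sets out the following requirements (the contract) for hashCode():

  1. Within a single run of a Java application, calling the method repeatedly on the same object must return the same value, as long as no information used in equals() comparisons has been modified.

  2. If two objects are equal according to equals(), their hashCode() results must be the same.

  3. If two objects are not equal according to equals(), their hashCode() results are not required to differ, although distinct values may improve hash table performance.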
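As a small illustration of points 2 and 3, the sketch below relies only on java.lang.String, whose hashCode() formula is documented in its Javadoc. The class name ContractDemo and its method names are made up for this example.

```java
// Hypothetical demo class; only documented java.lang.String behaviour is used.
class ContractDemo {
    // Rule 2: objects that are equal must report the same hash code.
    static boolean equalObjectsShareHash() {
        String a = new String("hello");
        String b = new String("hello"); // a distinct object with equal content
        return a != b && a.equals(b) && a.hashCode() == b.hashCode();
    }

    // Rule 3: objects that are not equal are still allowed to collide.
    static boolean unequalObjectsMayCollide() {
        return !"Aa".equals("BB") && "Aa".hashCode() == "BB".hashCode();
    }

    public static void main(String[] args) {
        System.out.println(equalObjectsShareHash());    // true
        System.out.println(unequalObjectsMayCollide()); // true: both strings hash to 2112
    }
}
```

The collision follows from String's formula: 'A'*31 + 'a' = 65*31 + 97 = 2112 and 'B'*31 + 'B' = 66*31 + 66 = 2112.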

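A few questions
If the value returned by hashCode() is related to the object's memory address, and a garbage collection (GC) in the JVM can move the object to a different address, does the value returned by hashCode() change?

No.

On the first call to hashCode(), the JVM generates an int value for the object and stores it in the object header. Later calls simply read the stored value, so it no longer matters where the object lives in memory. (In HotSpot the value is not necessarily computed from the address at all; that depends on the generation strategy chosen in get_next_hash.) This applies only to a hashCode() that has not been overridden. An overridden hashCode() is computed on every call, which is why an override must not return values that change arbitrarily.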
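The stability of the stored value can be observed with System.identityHashCode, which the Javadoc above lists under @see. This is only a minimal sketch: the class name IdentityHashDemo is made up, and System.gc() is merely a request, so the object may or may not actually be moved.

```java
// Hypothetical demo class: checks that the identity hash survives a GC request.
class IdentityHashDemo {
    static boolean stableAcrossGc() {
        Object o = new Object();
        int before = System.identityHashCode(o); // first call generates and stores the hash
        System.gc();                             // a request only; the JVM may ignore it
        int after = System.identityHashCode(o);  // later calls read the stored value
        // For a plain Object, hashCode() is the identity hash.
        return before == after && o.hashCode() == before;
    }

    public static void main(String[] args) {
        System.out.println(stableAcrossGc()); // true
    }
}
```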

Figure (image not available), taken from Zhou Zhiming's 《深入理解Java虚拟机》 (Understanding the Java Virtual Machine in Depth).

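Why equals() and hashCode() must be overridden together

Mainly because of how HashMap and other hash-based collections use the two methods: they use hashCode() to pick a bucket and then equals() to confirm a match inside it. If the two methods are overridden carelessly or inconsistently, these collections may stop working correctly. Any override of hashCode() and equals() must strictly follow the contract stated in the source comment.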
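A minimal sketch of what goes wrong, assuming two made-up key classes (BrokenKey and GoodKey are not from any library): one overrides only equals(), the other overrides both methods consistently.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical demo: BrokenKey and GoodKey exist only for this illustration.
class EqualsHashCodeDemo {
    // Overrides equals() only, so equal keys usually get different identity hashes.
    static final class BrokenKey {
        final int id;
        BrokenKey(int id) { this.id = id; }
        @Override public boolean equals(Object o) {
            return o instanceof BrokenKey && ((BrokenKey) o).id == id;
        }
        // hashCode() is deliberately NOT overridden, which violates the contract.
    }

    // Overrides both methods using the same field, as the contract requires.
    static final class GoodKey {
        final int id;
        GoodKey(int id) { this.id = id; }
        @Override public boolean equals(Object o) {
            return o instanceof GoodKey && ((GoodKey) o).id == id;
        }
        @Override public int hashCode() { return Integer.hashCode(id); }
    }

    static boolean goodKeyFound() {
        Set<GoodKey> set = new HashSet<>();
        set.add(new GoodKey(1));
        return set.contains(new GoodKey(1)); // same bucket, then equals() matches
    }

    static boolean brokenKeyFound() {
        Set<BrokenKey> set = new HashSet<>();
        set.add(new BrokenKey(1));
        // Almost always false: the lookup key has its own identity hash,
        // so equals() is usually never reached.
        return set.contains(new BrokenKey(1));
    }

    public static void main(String[] args) {
        System.out.println("GoodKey found:   " + goodKeyFound());
        System.out.println("BrokenKey found: " + brokenKeyFound());
    }
}
```

GoodKey is always found. BrokenKey is found only in the unlikely case that the two objects happen to receive the same identity hash.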

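The hashCode implementation in OpenJDK

Oracle JDK ships its native methods, which are implemented in C/C++, only as compiled libraries (DLL files on Windows), so their source cannot be read directly. The source can instead be downloaded from the official OpenJDK site.

In OpenJDK the hash value is generated in synchronizer.cpp, at hotspot/src/share/vm/runtime/synchronizer.cpp. The native Object.hashCode() is bound to JVM_IHashCode, which calls ObjectSynchronizer::FastHashCode; that function calls get_next_hash() when an object does not yet have a hash.

An excerpt: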

static inline intptr_t get_next_hash(Thread * Self, oop obj) {
  intptr_t value = 0 ;
  if (hashCode == 0) {
     // This form uses an unguarded global Park-Miller RNG,
     // so it's possible for two threads to race and generate the same RNG.
     // On MP system we'll have lots of RW access to a global, so the
     // mechanism induces lots of coherency traffic.
     value = os::random() ;
  } else
  if (hashCode == 1) {
     // This variation has the property of being stable (idempotent)
     // between STW operations. This can be useful in some of the 1-0
     // synchronization schemes.
     intptr_t addrBits = cast_from_oop<intptr_t>(obj) >> 3 ;
     value = addrBits ^ (addrBits >> 5) ^ GVars.stwRandom ;
  } else
  if (hashCode == 2) {
     value = 1 ;            // for sensitivity testing
  } else
  if (hashCode == 3) {
     value = ++GVars.hcSequence ;
  } else
  if (hashCode == 4) {
     value = cast_from_oop<intptr_t>(obj) ;
  } else {
     // Marsaglia's xor-shift scheme with thread-specific state
     // This is probably the best overall implementation -- we'll
     // likely make this the default in future releases.
     unsigned t = Self->_hashStateX ;
     t ^= (t << 11) ;
     Self->_hashStateX = Self->_hashStateY ;
     Self->_hashStateY = Self->_hashStateZ ;
     Self->_hashStateZ = Self->_hashStateW ;
     unsigned v = Self->_hashStateW ;
     v = (v ^ (v >> 19)) ^ (t ^ (t >> 8)) ;
     Self->_hashStateW = v ;
     value = v ;
  }

  value &= markOopDesc::hash_mask;
  if (value == 0) value = 0xBAD ;
  assert (value != markOopDesc::no_hash, "invariant") ;
  TEVENT (hashCode: GENERATE) ;
  return value;
}

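When starting a Java process, the -XX:hashCode= option selects which of these strategies generates the value returned by hashCode(). In JDK 8 the default is the last branch, the xor-shift scheme with per-thread state, so by default the value is not computed from the object's address; only strategies 1 and 4 use the address. The default is normally what you want. Interested readers can download the OpenJDK source and read the rest for themselves.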
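To make the default branch easier to follow, here is a rough Java transcription of the xor-shift step from the excerpt above. It is a sketch under stated assumptions, not HotSpot's code: the class name XorShiftHashSketch is made up, the seed values in main are arbitrary illustration values, and the 31-bit mask is an assumption standing in for markOopDesc::hash_mask on a 64-bit VM.

```java
// Hypothetical Java port of the last branch of get_next_hash (illustration only).
class XorShiftHashSketch {
    // Assumption: 31 hash bits, standing in for markOopDesc::hash_mask on 64-bit VMs.
    static final int HASH_MASK = 0x7FFFFFFF;

    // Stand-ins for the per-thread fields _hashStateX/Y/Z/W.
    private int x, y, z, w;

    XorShiftHashSketch(int x, int y, int z, int w) {
        this.x = x; this.y = y; this.z = z; this.w = w;
    }

    int nextHash() {
        int t = x;
        t ^= (t << 11);
        x = y;
        y = z;
        z = w;
        int v = w;
        // >>> is Java's unsigned right shift, matching C++ shifts on 'unsigned'.
        v = (v ^ (v >>> 19)) ^ (t ^ (t >>> 8));
        w = v;
        int value = v & HASH_MASK;    // keep only the bits that fit in the header
        if (value == 0) value = 0xBAD; // zero is reserved to mean "no hash yet"
        return value;
    }

    public static void main(String[] args) {
        // Arbitrary seeds chosen for the example, not HotSpot's initial state.
        XorShiftHashSketch gen = new XorShiftHashSketch(123456789, 362436069, 521288629, 88675123);
        for (int i = 0; i < 3; i++) {
            System.out.println(Integer.toHexString(gen.nextHash()));
        }
    }
}
```

Because the state is kept per thread, threads never contend on a shared generator, which is the drawback the excerpt's comment describes for strategy 0.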

On the OpenJDK site, choose the browse option to download the source; on Windows, the zip format is fine.
