重写equals 方法一定要重写hashcode 吗?

equals & hashCode 方法

java 是oop的,而Object 对象是所有java对象的父类,其中equals 和hashCode就是这个对象的两个方法,可见其的重要性。

hashCode到底是什么?

在《白话 Synchronized》https://www.jianshu.com/p/dd87f60ff37c 一文中,笔者提到过hashcode 其本质是一个随机数。

hash contains the identity hash value: largest value is
//    31 bits, see os::random()

我们查看os::random 方法发现

long os::random() {
  /* standard, well-known linear congruential random gener�ator with
   * next_rand = (16807*seed) mod (2**31-1)
   * see
   * (1) "Random Number Generators: Good Ones Are Hard to Find",
   *      S.K. Park and K.W. Miller, Communications of the ACM 31:10 (Oct 1988),
   * (2) "Two Fast Implementations of the 'Minimal Standard' Random
   *     Number Generator", David G. Carta, Comm. ACM 33, 1 (Jan 1990), pp. 87-88.
  */
 

LCG 随机算法: https://en.wikipedia.org/wiki/Linear_congruential_generator

hashCode() 的取值

JNIEXPORT jint JNICALL
Java_java_lang_System_identityHashCode(JNIEnv *env, jobject this, jobject x)
{
    return JVM_IHashCode(env, x);
}


JVM_ENTRY(jint, JVM_IHashCode(JNIEnv* env, jobject handle))
  JVMWrapper("JVM_IHashCode");
  // as implemented in the classic virtual machine; return 0 if object is NULL
  return handle == NULL ? 0 : ObjectSynchronizer::FastHashCode (THREAD, JNIHandles::resolve_non_null(handle)) ;
JVM_END

intptr_t ObjectSynchronizer::FastHashCode (Thread * Self, oop obj) {
//偏向锁
  if (UseBiasedLocking) {
    // NOTE: many places throughout the JVM do not expect a safepoint
    // to be taken here, in particular most operations on perm gen
    // objects. However, we only ever bias Java instances and all of
    // the call sites of identity_hash that might revoke biases have
    // been checked to make sure they can handle a safepoint. The
    // added check of the bias pattern is to avoid useless calls to
    // thread-local storage.
    if (obj->mark()->has_bias_pattern()) {
      // Box and unbox the raw reference just in case we cause a STW safepoint.
      Handle hobj (Self, obj) ;
      // Relaxing assertion for bug 6320749.
      assert (Universe::verify_in_progress() ||
              !SafepointSynchronize::is_at_safepoint(),
             "biases should not be seen by VM thread here");
      BiasedLocking::revoke_and_rebias(hobj, false, JavaThread::current());
      obj = hobj() ;
      assert(!obj->mark()->has_bias_pattern(), "biases should be revoked by now");
    }
  }

  // hashCode() is a heap mutator ...
  // Relaxing assertion for bug 6320749.
  assert (Universe::verify_in_progress() ||
          !SafepointSynchronize::is_at_safepoint(), "invariant") ;
  assert (Universe::verify_in_progress() ||
          Self->is_Java_thread() , "invariant") ;
  assert (Universe::verify_in_progress() ||
         ((JavaThread *)Self)->thread_state() != _thread_blocked, "invariant") ;

  ObjectMonitor* monitor = NULL;
  markOop temp, test;
  intptr_t hash;
  markOop mark = ReadStableMark (obj);

  // object should remain ineligible for biased locking
  assert (!mark->has_bias_pattern(), "invariant") ;

  if (mark->is_neutral()) {
    hash = mark->hash();              // this is a normal header
    //正常对象
    if (hash) {                       // if it has hash, just return it
      return hash;
    }
    hash = get_next_hash(Self, obj);  // allocate a new hash code
    temp = mark->copy_set_hash(hash); // merge the hash code into header
    // use (machine word version) atomic operation to install the hash
    test = (markOop) Atomic::cmpxchg_ptr(temp, obj->mark_addr(), mark);
    if (test == mark) {
      return hash;
    }
    // If atomic operation failed, we must inflate the header
    // into heavy weight monitor. We could add more code here
    // for fast path, but it does not worth the complexity.
    //重量级锁
  } else if (mark->has_monitor()) {
    monitor = mark->monitor();
    temp = monitor->header();
    assert (temp->is_neutral(), "invariant") ;
    hash = temp->hash();
    if (hash) {
      return hash;
    }
    // Skip to the following code to reduce code size
  } else if (Self->is_lock_owned((address)mark->locker())) {
 //轻量级锁
    temp = mark->displaced_mark_helper(); // this is a lightweight monitor owned
    assert (temp->is_neutral(), "invariant") ;
    hash = temp->hash();              // by current thread, check if the displaced
    if (hash) {                       // header contains hash code
      return hash;
    }
    // WARNING:
    //   The displaced header is strictly immutable.
    // It can NOT be changed in ANY cases. So we have
    // to inflate the header into heavyweight monitor
    // even the current thread owns the lock. The reason
    // is the BasicLock (stack slot) will be asynchronously
    // read by other threads during the inflate() function.
    // Any change to stack may not propagate to other threads
    // correctly.
  }

  // Inflate the monitor to set hash code
  monitor = ObjectSynchronizer::inflate(Self, obj);
  // Load displaced header and check it has hash code
  mark = monitor->header();
  assert (mark->is_neutral(), "invariant") ;
  hash = mark->hash();
  if (hash == 0) {
    hash = get_next_hash(Self, obj);
    temp = mark->copy_set_hash(hash); // merge hash code into header
    assert (temp->is_neutral(), "invariant") ;
    test = (markOop) Atomic::cmpxchg_ptr(temp, monitor, mark);
    if (test != mark) {
      // The only update to the header in the monitor (outside GC)
      // is install the hash code. If someone add new usage of
      // displaced header, please update this code
      hash = test->hash();
      assert (test->is_neutral(), "invariant") ;
      assert (hash != 0, "Trivial unexpected object/monitor header usage.");
    }
  }
  // We finally get the hash
  return hash;
}

因为对象的word - mark 会随着锁升级,存储位置会发生改变,具体改变规则见笔者上一博文《白话 Synchronized》,所以通过以上代码我们发现获取hashcode 时也需要判断不同的锁级别。

hashcode 方法的官方文档

 /**
     * Returns a hash code value for the object. This method is
     * supported for the benefit of hash tables such as those provided by
     * 返回object 的hash code,这个方法为了支持hash table 的益处,比如象hashmap 所提供的。
     * {@link java.util.HashMap}.
     * <p>
     * The general contract of {@code hashCode} is:
     *  hash code 的常规合约是
     * <ul>
     * <li>Whenever it is invoked on the same object more than once during
     *     an execution of a Java application,
     *
     *     在一个应用程序执行期间(进程),同一个对象,任意时间,不管被执行多少次,
     *
     *     the {@code hashCode} method
     *     must consistently return the same integer,
     *
     *     hashCode 方法必须一致地返回相同的integer.
     *
     *     (provided 引导条件状语从句)
     *     provided no information
     *     used in {@code equals} comparisons on the object is modified.
     *
     *     前提是参与对象比较的信息没有被修改。
     *
     *     This integer need not remain consistent from one execution of an
     *     application to another execution of the same application.
     *
     *     同一个应用程序的不同进程,不需要保持hashCode的一致.
     *
     * <li>If two objects are equal according to the {@code equals(Object)}
     *     method, then calling the {@code hashCode} method on each of
     *     the two objects must produce the same integer result.
     *
     *     如果两个对象通过equals 方法比较后是"true"(是相等的),那么这两个对象必须生成相同的integer 结果。
     *
     * <li>It is <em>not</em> required that if two objects are unequal
     *     according to the {@link java.lang.Object#equals(java.lang.Object)}
     *     method, then calling the {@code hashCode} method on each of the
     *     two objects must produce distinct integer results.  However, the
     *     programmer should be aware that producing distinct integer results
     *     for unequal objects may improve the performance of hash tables.
     *     如果两个对象通过"equals"方法的结果是不相等,那么两个不同的对象不必生成相同的integer结果。
     *     然后作为程序员应该知道为不相等的对象生成不同的integer 结果会提升hash tables 的性能。
     * </ul>
     * <p>
     * As much as is reasonably practical, the hashCode method defined by
     * class {@code Object} does return distinct integers for distinct
     * objects. (This is typically implemented by converting the internal
     * address of the object into an integer, but this implementation
     * technique is not required by the
     * Java&trade; programming language.)
     *
     * 尽可能的合理实践,"Object" 对象的hashcode 方法的实现为不同的对象返回了不同的integer。
     * (典型的实现是将对象的内部地址转换为integer,但是这种实现技术不需要java 语言)
     *
     * @return  a hash code value for this object.
     * @see     java.lang.Object#equals(java.lang.Object)
     * @see     java.lang.System#identityHashCode
     */
    public native int hashCode();

从官方文档读到以下信息

  • hashCode 用途: 对hash table 有好处
  • jdk 对hashCode的三个规约(见上文翻译)
  • hashCode vs 地址
    1. hashCode 并非对象地址
    2. hashCode 和地址之间并无函数关系
    3. hashCode 和地址之间只是映射关系(由c实现)
      4.实际上不同对象的hashCode 有可能相同
package com.sparrow.jdk.os;

import java.util.HashSet;
import java.util.Set;

/**
 * @author by harry
 */
public class RandomTest {
    public static void main(String[] args) {
        //return distinct integers for distinct objects
        Set<Integer> s = new HashSet<>();
        int i = 0;
        while (true) {
            Object o = new Object();
            if (!s.contains(o.hashCode())) {
                s.add(o.hashCode());
            } else {
                System.out.println(o.hashCode());
            }
        }
    }
}

equals 方法

 * The {@code equals} method for class {@code Object} implements
     * the most discriminating possible equivalence relation on objects;
     * Object 的equals 的实现方法最具识别能力;
     * that is, for any non-null reference values {@code x} and
     * {@code y}, this method returns {@code true} if and only
     * if {@code x} and {@code y} refer to the same object
     * ({@code x == y} has the value {@code true}).
     * 对于引用的值x 和y,当且仅当x和y引用同一个对象(x==y=true)时,x.eques(y)的结果才为true
     * <p>
     * Note that it is generally necessary to override the {@code hashCode}
     * method whenever this method is overridden, so as to maintain the
     * general contract for the {@code hashCode} method, which states
     * that equal objects must have equal hash codes.
     *
     * 注意:当这个方法被重写时通常需要重写 hashCode 方法,为保持 hashCode 方法的常规合约 即如果两个对象相等,则hash code 一定相同。
     *
     * @param   obj   the reference object with which to compare.
     * @return  {@code true} if this object is the same as the obj
     *          argument; {@code false} otherwise.
     * @see     #hashCode()
     * @see     java.util.HashMap
     */
    public boolean equals(Object obj) {
        return (this == obj);
    }

equals的几个性质本文没有列出,重点看一下Object 方法的具体实现和注意:

 Note that it is generally necessary to override the {@code hashCode}
     * method whenever this method is overridden, so as to maintain the
     * general contract for the {@code hashCode} method, which states
     * that equal objects must have equal hash codes.

笔者理解:这里的提示并非强制,而是建议我们在重写equals方法时重写hashcode。

equals 和hashcode 的关系

通过以上分析,如果只是为了重写equals 方法而重写hashcode,那么两者之间并无直接关系。
那么我们思考,为什么jdk这样提示?
为什么需要hashcode 要满足三大规约?
从hascode的官方文档注释可知,hashcode 为hash table 服务。jdk 中的hash 实现包括hash set ,hash map ,hash table 及concurrent hash map 等。
而hashcode 有利于提高hash table 的性能。为了更进一步分析hash code 与hash table 的关系,我们简单分析下hash map的代码 jdk 1.8

hash map

hash-map.jpg
 /**
     * The default initial capacity - MUST be a power of two.
     * 默认的初始化容量 一定是2的n次幂 下文会解释为什么?
     */
    static final int DEFAULT_INITIAL_CAPACITY = 1 << 4; // aka 16

    /**
     * The maximum capacity, used if a higher value is implicitly specified
     * by either of the constructors with arguments.
     * MUST be a power of two <= 1<<30.
    * 最大容量
     */
    static final int MAXIMUM_CAPACITY = 1 << 30;

    /**
     * The load factor used when none specified in constructor.
     *负载因子
     */
    static final float DEFAULT_LOAD_FACTOR = 0.75f;

    /**
     * The bin count threshold for using a tree rather than list for a
     * bin.  Bins are converted to trees when adding an element to a
     * bin with at least this many nodes. The value must be greater
     * than 2 and should be at least 8 to mesh with assumptions in
     * tree removal about conversion back to plain bins upon
     * shrinkage.
     * 升级红黑树的阀值
     */
    static final int TREEIFY_THRESHOLD = 8;

    /**
     * The bin count threshold for untreeifying a (split) bin during a
     * resize operation. Should be less than TREEIFY_THRESHOLD, and at
     * most 6 to mesh with shrinkage detection under removal.
    * 降为链表的阀值
     */
    static final int UNTREEIFY_THRESHOLD = 6;

    /**
     * The smallest table capacity for which bins may be treeified.
     * (Otherwise the table is resized if too many nodes in a bin.)
     * Should be at least 4 * TREEIFY_THRESHOLD to avoid conflicts
     * between resizing and treeification thresholds.
    *  升级为红黑树的最小表容量阀值
     */
    static final int MIN_TREEIFY_CAPACITY = 64;
final V putVal(int hash, K key, V value, boolean onlyIfAbsent,
                   boolean evict) {
        Node<K,V>[] tab; Node<K,V> p; int n, i;//p-->pointer
       //表为空
        if ((tab = table) == null || (n = tab.length) == 0)
            n = (tab = resize()).length;
       //表的糟为空 为什么?
        if ((p = tab[i = (n - 1) & hash]) == null)
            tab[i] = newNode(hash, key, value, null);
        else {
            Node<K,V> e; K k; e-->exist node
            //这里是本文重点,如何判断是否存在
            if (p.hash == hash &&
                ((k = p.key) == key || (key != null && key.equals(k))))
                e = p;
            else if (p instanceof TreeNode)
                e = ((TreeNode<K,V>)p).putTreeVal(this, tab, hash, key, value);
            else {
                //
                for (int binCount = 0; ; ++binCount) {
                    if ((e = p.next) == null) {
                        p.next = newNode(hash, key, value, null);
                        //default is 8 
                        if (binCount >= TREEIFY_THRESHOLD - 1) // -1 for 1st
//这里可能会升级红黑树,但不一定,需要进一步判断
                            treeifyBin(tab, hash);

                        break;
                    }
                    if (e.hash == hash &&
                        ((k = e.key) == key || (key != null && key.equals(k))))
                        break;
                    p = e;
                }
            }
            if (e != null) { // existing mapping for key
                V oldValue = e.value;
                if (!onlyIfAbsent || oldValue == null)
                    e.value = value;
                afterNodeAccess(e);
                return oldValue;
            }
        }
        ++modCount;

        //超过初始容量
        if (++size > threshold)
            resize();
        afterNodeInsertion(evict);
        return null;
    }


 /**
     * Replaces all linked nodes in bin at index for given hash unless
     * table is too small, in which case resizes instead.
     */
    final void treeifyBin(Node<K,V>[] tab, int hash) {
        int n, index; Node<K,V> e;
        //default 64  没过超最小表容量,则不升级红黑树,只是resize
        if (tab == null || (n = tab.length) < MIN_TREEIFY_CAPACITY)
            resize();
        else if ((e = tab[index = (n - 1) & hash]) != null) {
            TreeNode<K,V> hd = null, tl = null;
   //升级红黑树
            do {
                TreeNode<K,V> p = replacementTreeNode(e, null);
                if (tl == null)
                    hd = p;
                else {
                    p.prev = tl;
                    tl.next = p;
                }
                tl = p;
            } while ((e = e.next) != null);
            if ((tab[index] = hd) != null)
                hd.treeify(tab);
        }
    }

hash code equals 在hash map的作用

  • hash map 是通过key 的hashcode 映射到hash map中,所以要保证(这里参见hash code 的三大规约), 如果对象内容相同,其hashcode一定相同,而object 的hashcode 是随机生成(大部分情况下不同),所以这里必须重写,否则hash map 中会存在内容相同的对象。
  • get put 方法排重时要先判断hash code 是否相同再判断equals方法是否相同,因为这样提高查询的性能。因为hashcode 不同,对象一定不同。

hash map 的容量为什么要求是2 的n次幂

if ((p = tab[i = (n - 1) & hash]) == null)

这里n为hash map 的容量(2的n次幂) hash是当前key的hash code.
通过以上公式可以通过按位与的方式直接获取当前key 所在的糟,性能会更好。
因为2 的n次幂的数都有一个特点,就是二进制数只有一位为1,而减1后的结果是除为1的那一位,其余低于该位全为1,所以取余运算就可以转换成与该数进行按位与即可。
举个例子
如果n=16 2的4次幂 (16-1)二进制为1111 全为1
而hash code 假如为19 二进制为10011 按位与的结果为011即为3 19%16=3
同样的代码在netty中也同样的实现

DefaultEventExecutorChooserFactory

private static final class PowerOfTwoEventExecutorChooser implements EventExecutorChooser {
        private final AtomicInteger idx = new AtomicInteger();
        private final EventExecutor[] executors;

        PowerOfTwoEventExecutorChooser(EventExecutor[] executors) {
            this.executors = executors;
        }

        @Override
        public EventExecutor next() {
            return executors[idx.getAndIncrement() & executors.length - 1];
        }
    }
最后编辑于
©著作权归作者所有,转载或内容合作请联系作者
  • 序言:七十年代末,一起剥皮案震惊了整个滨河市,随后出现的几起案子,更是在滨河造成了极大的恐慌,老刑警刘岩,带你破解...
    沈念sama阅读 213,864评论 6 494
  • 序言:滨河连续发生了三起死亡事件,死亡现场离奇诡异,居然都是意外死亡,警方通过查阅死者的电脑和手机,发现死者居然都...
    沈念sama阅读 91,175评论 3 387
  • 文/潘晓璐 我一进店门,熙熙楼的掌柜王于贵愁眉苦脸地迎上来,“玉大人,你说我怎么就摊上这事。” “怎么了?”我有些...
    开封第一讲书人阅读 159,401评论 0 349
  • 文/不坏的土叔 我叫张陵,是天一观的道长。 经常有香客问我,道长,这世上最难降的妖魔是什么? 我笑而不...
    开封第一讲书人阅读 57,170评论 1 286
  • 正文 为了忘掉前任,我火速办了婚礼,结果婚礼上,老公的妹妹穿的比我还像新娘。我一直安慰自己,他们只是感情好,可当我...
    茶点故事阅读 66,276评论 6 385
  • 文/花漫 我一把揭开白布。 她就那样静静地躺着,像睡着了一般。 火红的嫁衣衬着肌肤如雪。 梳的纹丝不乱的头发上,一...
    开封第一讲书人阅读 50,364评论 1 292
  • 那天,我揣着相机与录音,去河边找鬼。 笑死,一个胖子当着我的面吹牛,可吹牛的内容都是我干的。 我是一名探鬼主播,决...
    沈念sama阅读 39,401评论 3 412
  • 文/苍兰香墨 我猛地睁开眼,长吁一口气:“原来是场噩梦啊……” “哼!你这毒妇竟也来了?” 一声冷哼从身侧响起,我...
    开封第一讲书人阅读 38,179评论 0 269
  • 序言:老挝万荣一对情侣失踪,失踪者是张志新(化名)和其女友刘颖,没想到半个月后,有当地人在树林里发现了一具尸体,经...
    沈念sama阅读 44,604评论 1 306
  • 正文 独居荒郊野岭守林人离奇死亡,尸身上长有42处带血的脓包…… 初始之章·张勋 以下内容为张勋视角 年9月15日...
    茶点故事阅读 36,902评论 2 328
  • 正文 我和宋清朗相恋三年,在试婚纱的时候发现自己被绿了。 大学时的朋友给我发了我未婚夫和他白月光在一起吃饭的照片。...
    茶点故事阅读 39,070评论 1 341
  • 序言:一个原本活蹦乱跳的男人离奇死亡,死状恐怖,灵堂内的尸体忽然破棺而出,到底是诈尸还是另有隐情,我是刑警宁泽,带...
    沈念sama阅读 34,751评论 4 337
  • 正文 年R本政府宣布,位于F岛的核电站,受9级特大地震影响,放射性物质发生泄漏。R本人自食恶果不足惜,却给世界环境...
    茶点故事阅读 40,380评论 3 319
  • 文/蒙蒙 一、第九天 我趴在偏房一处隐蔽的房顶上张望。 院中可真热闹,春花似锦、人声如沸。这庄子的主人今日做“春日...
    开封第一讲书人阅读 31,077评论 0 21
  • 文/苍兰香墨 我抬头看了看天上的太阳。三九已至,却和暖如春,着一层夹袄步出监牢的瞬间,已是汗流浃背。 一阵脚步声响...
    开封第一讲书人阅读 32,312评论 1 267
  • 我被黑心中介骗来泰国打工, 没想到刚下飞机就差点儿被人妖公主榨干…… 1. 我叫王不留,地道东北人。 一个月前我还...
    沈念sama阅读 46,924评论 2 365
  • 正文 我出身青楼,却偏偏与公主长得像,于是被迫代替她去往敌国和亲。 传闻我的和亲对象是个残疾皇子,可洞房花烛夜当晚...
    茶点故事阅读 43,957评论 2 351