以下这段代码是jdk8里默认的String.hashCode方法的实现。这里可以看出实现里采用了一个神奇的魔法数字31.为什么是31而不是32,或者37?
public int hashCode() {
int h = hash;
if (h == 0 && value.length > 0) {
char val[] = value;
for (int i = 0; i < value.length; i++) {
h = 31 * h + val[i];
}
hash = h;
}
return h;
}
以下是一个简单的代码,展示了idea intellij 自带功能重写了hashCode方式。原理与jdk中string.equals相同.
public class Course implements Cloneable,Serializable {
private String name;
private String score;
private String term;
private String id;
//setters、getters略
@Override
public int hashCode() {
int result = name != null ? name.hashCode() : 0;
result = 31 * result + (score != null ? score.hashCode() : 0);
result = 31 * result + (term != null ? term.hashCode() : 0);
result = 31 * result + (id != null ? id.hashCode() : 0);
return result;
}
}
比如 Guava 库中的 Objects.hashCode(), JDK 中的 Arrays.hashCode(),IntelliJ IDEA 生成的 hashCode() 方法使用的算法也与 String.hashCode() 相同。
hashcode=s[0]*31^(n-1) + s[1]*31^(n-2) + ... + s[n-1]
对于为什么采用37,网上主要有两种流行的答案
1、二进制计算中,31 * i = (i << 5) - i
,这样的计算可以转换移位和减法,节约了计算资源
2、基于31可以产生较少的哈希碰撞。
在参考文章[3]里JohnZaj说:As Goodrich and Tamassia point out, If you take over 50,000 English words (formed as the union of the word lists provided in two variants of Unix), using the constants 31, 33, 37, 39, and 41 will produce less than 7 collisions in each case. Knowing this, it should come as no surprise that many Java implementations choose one of these constants.
Coincidentally, I was in the middle of reading the section "polynomial hash codes" when I saw this question.
EDIT: here is link to the ~10mb PDF book i'm referring to above. See section 10.2 Hash Tables (page 413) of Data Structures and Algorithms in Java
参考文章:
[1]. String.hashCode() 实现及 Magic Number 31
[2]. http://bugs.java.com/bugdatabase/view_bug.do?bug_id=4045622
[3]. https://stackoverflow.com/questions/299304/why-does-javas-hashcode-in-string-use-31-as-a-multiplier