背景
题目有点复杂......语文太差....贴上代码吧
public static void containTest() {
List<String> list = new ArrayList<>();
list.add("a");
Set<List<String>> set = new HashSet<>();
set.add(list);
System.out.println(set.contains(list));
list.add("b");
System.out.println(set.contains(list));
}
上面的代码返回如下:
true
false
而对于如下代码:
public static void sbContainTest() {
StringBuilder sBuilder = new StringBuilder("a");
Set set = new HashSet<>();
set.add(sBuilder);
System.out.println(set.contains(sBuilder));
sBuilder.append("b");
System.out.println(set.contains(sBuilder));
}
返回的结果却是:
true
true
我们来通过查看源码来了解这背后的原理.
分析
首先查看HashSet的源码, 大致贴一下其内部域以及几个和本文相关的方法如下:
public class HashSet<E>
extends AbstractSet<E>
implements Set<E>, Cloneable, java.io.Serializable{
static final long serialVersionUID = -5024744406713321676L;
private transient HashMap<E, Object> map;
// Dummy value to associate with an Object in the backing Map
private static final Object PRESENT = new Object();
public HashSet(){
map = new HashMap<>();
}
public boolean add(E e) {
return map.put(e, PRESENT) == null;
}
public boolean contains(Object o) {
return map.containsKey(o);
}
//other methods
}
可以看到HashSet内部是使用一个HashMap存储数据的, 而contains方法调用的是HashMap
的contains
方法.
我们再查看HashMap
的源码, 看看contiansKey
的实现. 同上, 贴一些关键代码
public class HashMap<K,V> extends AbstractMap<K,V>
implements Map<K,V>, Cloneable, Serializable {
private static final long serialVersionUID = 362498820763181265L;
transient Node<K, V>[] table;
transient int size;
public boolean containsKey(Object key) {
return getNode(hash(key), key) != null;
}
public V put(K key, V value) {
return putVal(hash(key), key, value, false, true);
}
final V putVal(int hash, K key, V value, boolean onlyIfAbsent,
boolean evict) {
Node<K,V>[] tab; Node<K,V> p; int n, i;
if ((tab = table) == null || (n = tab.length) == 0)
n = (tab = resize()).length;
if ((p = tab[i = (n - 1) & hash]) == null)
tab[i] = newNode(hash, key, value, null); //look at this line!!!!!
else {
Node<K,V> e; K k;
if (p.hash == hash &&
((k = p.key) == key || (key != null && key.equals(k))))
e = p;
else if (p instanceof TreeNode)
e = ((TreeNode<K,V>)p).putTreeVal(this, tab, hash, key, value);
else {
for (int binCount = 0; ; ++binCount) {
if ((e = p.next) == null) {
p.next = newNode(hash, key, value, null);
if (binCount >= TREEIFY_THRESHOLD - 1) // -1 for 1st
treeifyBin(tab, hash);
break;
}
if (e.hash == hash &&
((k = e.key) == key || (key != null && key.equals(k))))
break;
p = e;
}
}
if (e != null) { // existing mapping for key
V oldValue = e.value;
if (!onlyIfAbsent || oldValue == null)
e.value = value;
afterNodeAccess(e);
return oldValue;
}
}
++modCount;
if (++size > threshold)
resize();
afterNodeInsertion(evict);
return null;
}
/**
*@param: hash hash for key, key the key
*@return: the node, or null if none
**/
final Node<K, V> getNode(int hash, Object key) {
Node<K, V>[] tab;
Node<K, V> first, e;
int n;
K k;
if (( tab = table) != null && (n = tab.length) > 0 && (first = tab[(n-1) & hash]) != null) {
if (first.hash == hash && ((k = first.key) == key || (key != null && key.equals(k)))) {
return first;
}
if ((e = first.next) != null) {
if (first isinstanceof TreeNode){
return ((TreeNode<K, V> first).getTreeNode(hash, key);
do {
if (e.hash == hash && ((k = e.key) == key || (key != null && key.equals(k)))) {
return e;
} while ((e = e.next) != null);
}
}
return null
}
static final int hash(Object key) {
int h;
return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>>16);
}
}
通过查看上面的代码我们就知道原因了, 简单地用语言描述一下
原因
我们运行如下代码:
public static void hashCodeTest() {
List<String> list = new ArrayList<>();
list.add("a");
System.out.println(list.hashCode());
list.add("b");
System.out.println(list.hashCode());
StringBuilder sBuilder = new StringBuilder("a");
System.out.println(sBuilder.hashCode());
sBuilder.append("b");
System.out.println(sBuilder.hashCode());
}
返回结果如下
128
4066
1094834071
1094834071
首先我们设list里面只有"a"时的hashcode的值为a
, 添加了"b"之后的hashcode的值为b
.
当我们第一运行set.add(list)
时候
当运行set.add(list)
时, 注意观察上面我贴的HashMap的源码第21行(我用look at this line!!!!!
注释了), 可知HashMap是根据此时的hashcode(也就是a
)来确定list存储在set中的索引.
而我们第二次运行set.contains(list)
时HashMap会用新的索引(也就是b
)来寻找目标, 自然找不到了. 于是就返回了false
.
而sBuilder的hashcode的值没有改变, 自然也就能被找到, 返回true
了.
收获
以后使用和hash相关的数据结构的时候一定要注意 hash值得变化, 变了可能就找不到了. hash固然快, 但是也是有缺点呐.