Set中的元素为什么不允许重复

为了弄清楚这个问题 , 我又看了一遍 Collection 部分 , 并且看了些其中的源码 , 觉得对其中的实现又明白了一点 , 现在说出来和大家共享 .

我们先看一下 Set 类的关系图：

现在我们就从 Set 说起。

Set 接口为我们提供了一个 add() 方法，以让我们添加元素。所以我们看一下在其实现类 HashSet 中是如何实现的呢？

我们看 HashSet 中的 add() 方法实现；

public boolean add( E o ) {

returnmap.put(o, PRESENT)==null;

}

你可能回问怎么回出来 map 了呢？

Map 又是一个什么变量呢？

我们来看 map 是在在哪定义的。原来，在 HashSet 中定义了这样的两个变量：

private transient HashMap<E,Object> map;

private static final Object PRESENT = new Object();

我们再看一下上面的 add() 方法。

map.put(o, PRESENT)==null

实际执行的是 map 的方法，并且我们添加的对象是 map 中的 key,value 是执行的同一个对象PRESENT.

因为map中的key是不允许重复的，所以set中的元素不能重复。

下面我们再看一下， map 又是如何保证其 key 不重复的呢？

现在我们看一下 map 中的 put() 方法的实现：

HashMap 实现了 map ，在 HashMap 中是这样实现的：

public V put(K key, V value) {

K k = maskNull(key);

int hash = hash(k);

int i = indexFor(hash, table.length);

for (Entry<K,V> e = table[i]; e != null; e = e.next) {

if (e.hash == hash && eq(k, e.key)) {

V oldValue = e.value;

e.value = value;

e.recordAccess(this);

return oldValue;

}

modCount++;

addEntry(hash, k, value, i);

return null;

}

我们我们按照方法的执行一步一步的来看一下，其实现的过程。

K k = maskNull(key);

这一步是要判断当前的要添加的对象的 key 是否为空，如果为空的话，那么就生成一个新的对象作为其 key 。实现如下：

static final Object NULL_KEY = new Object();

* Returns internal representation for key. Use NULL_KEY if key is null.

static <T> T maskNull(T key) {

return key == null ? (T)NULL_KEY : key;

}

之后

int hash = hash(k);

我们看一下 hash() 方法的实现：

static int hash(Object x) {

int h = x.hashCode();

h += ~(h << 9);

h ^= (h >>> 14);

h += (h << 4);

h ^= (h >>> 10);

return h;

}

这一步其实是要得到当前要添加对象的 hashcode, 方法中，首先通过int h = x.hashCode() 取得了当前的要添加对象的 hashcode, 然后

h += ~(h << 9);

h ^= (h >>> 14);

h += (h << 4);

h ^= (h >>> 10);

生成一个新的 hashcode.

接着执行的是

（从此处开始，我理解的比较肤浅，请看到此出的朋友多多指点）

int i = indexFor(hash, table.length);

这个方法是要 Returns index for hash code h.

static int indexFor(int h, int length) {

return h & (length-1);

}

下面就要根据 hashmap 中的元素逐个的比较，看是否相同，如果不同就回添加一个新的元素。是通过循环判断来实现的。

for (Entry<K,V> e = table[i]; e != null; e = e.next) {

if (e.hash == hash && eq(k, e.key)) {

V oldValue = e.value;

e.value = value;

e.recordAccess(this);

这句我的理解是：在内存中的可以访问元素又多了一个。也就是说，添加之后，可以通过hashmap来访问此元素了。

return oldValue;

}

通过循环判断是否有完全相同的对象，包括 hashcode 和 value 值。如果已经存在就返回其值，如果不存在就执行一下操作。

modCount++;

addEntry(hash, k, value, i);

return null;

对象不存在，首先修改 hashmap 的修改次数，即 modCount 加 1. 然后将对象添加到数组中。

void addEntry(int hash, K key, V value, int bucketIndex) {

Entry<K,V> e = table[bucketIndex];

table[bucketIndex] = new Entry<K,V>(hash, key, value, e);

if (size++ >= threshold)

resize(2 * table.length);

}

仍然是数组，我们来看一下 , 在 hashmap 中用来存放对象的数组的定义：

transient Entry[] table;

至此，我想大家也许应该明白了，为什么在 set 中是不允许存放重复值的。

通过才的分析，我们可以看到， map 的一些特征：

1. 在 map 中也是不能存放完全相同的元素的

2. 如果你存入的对象的 key 值已经存在的话，那么新的 value 将会取代老的 value 值，但是并不会添加新的元素进去。

我们可以通过一个测试程序来证明这一点：

public void mapTest() {

Map m = new HashMap();

/**

* we can put the int 1 and the value 1 into the map

* but we cannot get the 1 as int from the map

* why ?

* we only get the 1 as integer from the map,

* so we only get the object from the map,if we put the value into

* the map,we will get one instance of the wrapper of that

m.put(1, 1);

//int i = m.get(1);

Integer i = (Integer) m.get(1);

System.out.println(i);

m.put(1, 1);

System.out.println("map : " + m);

System.out.println("map size : " + m.size());

m.put(1, 2);

System.out.println("map : " + m);

System.out.println("map size : " + m.size());

}

运行的结果是：

map : {1=1}

map size : 1

map : {1=2}

map size : 1

希望此文能大家有所帮助。

Set中的元素为什么不允许重复

推荐阅读更多精彩内容