11. Hash Tables 1

A hash table is a generalization of the simpler notion of an ordinary array.

Direct addressing is a simple technique that works well when the universe U of the keys (all possible values of k) is reasonably small.

Operations on direct-address tables

We use a direct-address table, denoted by T[0..m−1], in which each position, or slot, corresponds to a key in the universe U.

  1. DIRECT-ADDRESS-SEARCH(T, k): return T[k]
  2. DIRECT-ADDRESS-INSERT(T, x): T[key[x]] ← x
  3. DIRECT-ADDRESS-DELETE(T, x): T[key[x]] ← NIL
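
The three operations above can be sketched in Python as follows. This is only an illustration: the class name `DirectAddressTable` and the choice to store each element as a `(key, value)` pair are assumptions, not part of the original pseudocode, and `None` stands in for NIL.

```python
class DirectAddressTable:
    """Direct-address table T[0..m-1]; slot k holds the element with key k."""

    def __init__(self, m):
        # None plays the role of NIL (assumption for this sketch).
        self.slots = [None] * m

    def search(self, k):
        # DIRECT-ADDRESS-SEARCH: return T[k]
        return self.slots[k]

    def insert(self, key, value):
        # DIRECT-ADDRESS-INSERT: T[key[x]] <- x, with x modeled as (key, value)
        self.slots[key] = (key, value)

    def delete(self, key):
        # DIRECT-ADDRESS-DELETE: T[key[x]] <- NIL
        self.slots[key] = None


table = DirectAddressTable(10)
table.insert(3, "three")
found = table.search(3)
table.delete(3)
missing = table.search(3)
```

Each operation is a single array access, so it takes O(1) time, at the cost of one slot for every possible key in U.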

Hash tables

With hashing, an element with key k is stored in slot h(k); that is, we use a hash function h to compute the slot from the key k. Here h maps the universe U of keys into the slots of a hash table T[0..m−1].

  1. We say that an element with key k hashes to slot h(k).
  2. We also say that h(k) is the hash value of key k.

Notice that with direct addressing, an element with key k is stored in slot k, so a direct-address table is the special case of a hash table in which h(k) = k.
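
As a small illustration of a hash function mapping keys to slots, here is a sketch that uses the division method h(k) = k mod m. The division method, the table size 7, and the sample keys are assumptions chosen for the example; the notes above do not fix a particular hash function.

```python
def h(k, m):
    """Division-method hash (an assumed choice): map integer key k to a slot in 0..m-1."""
    return k % m


m = 7
keys = [10, 22, 31, 4, 15]

# Hash value of each key, i.e. the slot it hashes to.
slots = {k: h(k, m) for k in keys}

# Direct addressing corresponds to the identity mapping h(k) = k.
direct_slots = {k: k for k in keys}
```

With these sample keys, 10 and 31 both hash to slot 3, and 22 and 15 both hash to slot 1, so distinct keys can share a slot once the table is smaller than the universe of keys.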

The drawback of hash tables

The drawback of any hash table is the possibility of a collision: two different keys may be mapped to the same slot.

One effective way to resolve collisions is called chaining, which works as follows: Put all elements that hash to the same slot in a linked list.

Operations on hash tables

The dictionary operations on a hash table T are easy to implement when collisions are resolved by chaining.

  1. CHAINED-HASH-SEARCH(T, k)
    search for an element with key k in the linked list T[h(k)]
  2. CHAINED-HASH-INSERT(T, x)
    insert x at the head of the linked list T[h(key[x])]
  3. CHAINED-HASH-DELETE(T, x)
    delete x from the linked list T[h(key[x])]
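
A minimal Python sketch of chaining might look like the following. The names `Node` and `ChainedHashTable` are hypothetical, the hash function (Python's built-in `hash` reduced modulo m) is an assumption, and `delete` here takes a key rather than a pointer to the element x, so it has to walk the chain first.

```python
class Node:
    """One element of a singly linked chain."""

    def __init__(self, key, value, next=None):
        self.key = key
        self.value = value
        self.next = next


class ChainedHashTable:
    def __init__(self, m):
        self.m = m
        self.slots = [None] * m  # each slot is the head of a chain, or None

    def _h(self, key):
        # Assumed hash function: built-in hash reduced to a slot index.
        return hash(key) % self.m

    def insert(self, key, value):
        # Insert at the head of the chain for slot h(key): O(1).
        j = self._h(key)
        self.slots[j] = Node(key, value, self.slots[j])

    def search(self, key):
        # Walk the chain for slot h(key); return the node or None.
        node = self.slots[self._h(key)]
        while node is not None and node.key != key:
            node = node.next
        return node

    def delete(self, key):
        # Unlink the first node with this key; return True if one was found.
        j = self._h(key)
        prev, node = None, self.slots[j]
        while node is not None and node.key != key:
            prev, node = node, node.next
        if node is None:
            return False
        if prev is None:
            self.slots[j] = node.next
        else:
            prev.next = node.next
        return True


t = ChainedHashTable(4)
for key, value in [(1, "one"), (5, "five"), (9, "nine")]:
    t.insert(key, value)  # 1, 5 and 9 all land in slot 1 when m = 4
```

Because insertion happens at the head, the most recently inserted key in a slot is the first one a search examines.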

Analysis of simple uniform hashing with chaining

Simple uniform hashing assumes that any given element is equally likely to hash into any of the m slots, independently of where any other element has hashed to.

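Under the simple uniform hashing assumption, the average behavior of hashing is much better than the worst case, in which all n keys hash to the same slot and a search takes Θ(n) time: on average a search takes Θ(1 + α) time.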

Let the hash table contain m slots and store n elements. For j = 0, ..., m−1, denote by n_j the length of the linked list T[j], so that n = n_0 + n_1 + ... + n_{m−1}, and the average value of n_j is E[n_j] = α = n/m, the load factor.
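
To make the chain lengths concrete, the sketch below throws n elements into m slots uniformly at random and records the length of each chain. The function name `chain_lengths`, the sizes, and the fixed seed are assumptions for illustration; a random slot choice stands in for a hash function that satisfies simple uniform hashing.

```python
import random


def chain_lengths(n, m, seed=0):
    """Return the list of chain lengths after n uniform random placements into m slots."""
    rng = random.Random(seed)
    lengths = [0] * m
    for _ in range(n):
        # Each element independently picks one of the m slots with equal probability.
        lengths[rng.randrange(m)] += 1
    return lengths


n, m = 1000, 100
lengths = chain_lengths(n, m)
alpha = n / m                  # load factor
average = sum(lengths) / m     # average chain length over all slots
```

The individual chain lengths vary from slot to slot, but they always sum to n, so their average over the m slots equals the load factor n/m.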


The load factor α determines the expected time to search for an element (or to delete one that must first be found): Θ(1 + α).

Case 1: Unsuccessful search for key k:
The linked list T[j] for the hash value j = h(k) has to be traversed to its end. The expected length of T[j] is E[n_j] = α = n/m.
Case 2: Successful search for key k: Let k_i = key[x_i]. For keys k_i and k_j with i ≠ j, define the indicator random variable X_ij = I{h(k_i) = h(k_j)}. Under simple uniform hashing, Pr{h(k_i) = h(k_j)} = 1/m, and thus E[X_ij] = 1/m.

The time spent on this linked list to find k_i is proportional to the number of keys that precede it in the list. Because new elements are inserted at the head, the keys in front of k_i are exactly those keys k_j inserted later (j > i) that hash to the same slot, and each such key does so with probability 1/m.
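
The counting argument above can be checked empirically. The sketch below is a simulation under assumed parameters (n = 500, m = 100, 200 trials, a fixed seed): it counts, for every stored key, itself plus the later-inserted keys in the same slot, and compares the average with 1 + (n − 1)/(2m), the value obtained by summing 1/m over all pairs j > i and dividing by n. The function name is hypothetical.

```python
import random


def average_successful_search_cost(n, m, trials=200, seed=0):
    """Average number of elements examined when searching for a stored key."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        # slots[i] is the slot of the i-th inserted key (uniform random placement).
        slots = [rng.randrange(m) for _ in range(n)]
        later_in_slot = [0] * m
        # Walk in reverse insertion order: later insertions sit nearer the head.
        for i in range(n - 1, -1, -1):
            j = slots[i]
            # Examine the key itself plus every later-inserted key in slot j.
            total += 1 + later_in_slot[j]
            later_in_slot[j] += 1
    return total / (trials * n)


n, m = 500, 100
measured = average_successful_search_cost(n, m)
predicted = 1 + (n - 1) / (2 * m)
```

With these parameters the two numbers should agree closely; the small remaining gap is sampling noise, which shrinks as the number of trials grows.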

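The average time complexity of a successful search for key k_i is thus Θ(1 + α): averaged over the n stored keys, the expected number of elements examined is 1 + (1/n) Σ_{i=1..n} Σ_{j=i+1..n} E[X_ij] = 1 + (n − 1)/(2m) = 1 + α/2 − α/(2n).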
