第六周:Hash Tables

1. separate chaining

  1. 思路

    • 键一个长为M的数组,每一个entry是一个linked-list
    • Hash:给每个key赋予一个整数结余0到M-1的i作为这个key的hash code
    • Insert:插入第i条链(linked-list)的最前面
    • Search:只需要查找第i条链(linked-list)
  2. 分析

    • M太大 => 太多空链
    • M太小 => 链太长
    • 一般:M ~ N/5 => constant-time ops
  3. Java implementation

    public class SeparateChainingHashST<Key, Value>{
     private int M = 97; // number of chains
     private Node[] st = new Node[M];    // array of chains
    
     private static class Node{
         private Object key;
         private Object val;
         private Node next;
    
         public Node(Object key, Object val, Node next){
             this.key = key;
             this.val = val;
             this.next = next;
         }
    
     }
    
     private int hash(Key key){
         return (key.hashCode() & 0x7fffffff) % M;
     }
    
     public Value get(Key key){
         int i = hash(key);
         for(Node x = st[i]; x != null; x = x.next){
             if(key.equals(x.key)){
                 return (Value)x.val;
             }
         }
         return null;
     }
    
     public void put(Key key, Value val){
         int i = hash(key);
         for(Node x = st[i]; x != null; x = x.next){
             if(key.equals(x.key)){
                 x.val = val;
                 return;
             }
         }
         st[i] = new Node(key, val, st[i]);
    
     }
    }
    

2. Linear probing

  1. 思路

    • 用一个长为M的数组st

    • Hash:给每个key赋予一个整数结余0到M-1的 i 作为这个key的hash code

    • Insert:如果st[i]是空的,插入值;如果已经被占了,继续试st[i+1]st[i+2],以此类推

    • Search:找st[i];如果被占但不符合我们要找的key,继续试st[i+1]st[i+2],以此类推

    • 注意:st数组长度M必须大于键值对数量N

  2. 分析

    • M太大 => 太多空的array entries

    • M太小 => search time blows up

    • 一般:\alpha = N/M ~ \frac{1}{2}

      • # probes for search hit is about 3/2

      • # probes for search miss is about 5/2

  3. Java Implementation

    public class LinearProbingHashST<Key, Value>{
     private int M = 30001;
     private Value[] vals = (Value[]) new Object[M];
     private Key[] keys = (Key[]) new Object[M];
    
     private int hash(Key key){
         return (key.hashCode() & 0x7fffffff) % M;
     }
    
     public void put(Key key, Value val){
         int i = 0;
         for(i = hash(key); keys[i] !=null; i = (i+1)%M){
             if(keys[i].equals(key)){
                 break;
             }
         }
         keys[i] = key;
         vals[i] = val;
     }
    
     public Value get(Key key){
         for(int i = hash(key); keys[i] != null; i = (i+1)%M){
             if(key.equals(keys[i])){
                 return vals[i];
             }
         }
         return null;
     }
    }
    
  1. ST implementations: summary

    implemen guarantee average ordered ops? operations on keys
    sequential search (unordered list) Search: N
    insert: N
    delete: N
    Search: N/2
    insert: N
    delete: N/2
    No equals()
    binary search (ordered array) Search: lgN
    insert: N
    delete: N
    Search: lgN
    insert: N/2
    delete: N/2
    Yes compareTo()
    BST Search: N
    insert: N
    delete: N
    Search: 1.39lgN
    insert: 1.39lgN
    delete: \sqrt N
    Yes compareTo()
    Red-black BST Search: 2lgN
    insert: 2lgN
    delete: 2lgN
    Search: lgN
    insert: lgN
    delete: lgN
    Yes compareTo()
    separate chaining Search: lgN
    insert: lgN
    delete: lgN
    Search: 3-5
    insert: 3-5
    delete: 3-5
    No equals()
    hashCode()
    Linear probing Search: lgN
    insert: lgN
    delete: lgN
    Search: 3-5
    insert: 3-5
    delete: 3-5
    No equals()
    hashCode()
  • Separate chaining
    • easier to implement delete
    • performance degrades gracefully
    • clustering less sensitive to poorly-designed hash function
  • Linear probing
    • less wasted space
    • better cache performance
  • Hash tables
    • Simple to code
    • No effective alternative for unordered keys
    • Faster for simple keys
    • Better system support in Java for Strings
    • Hash tables: java.util.HashMapjava.util.IdentityHashMap
  • Baleced search trees
    • Stronger performance guarantee
    • Support for ordered ST operations
    • Easier to implement compareTo() correctly than equals() and hashCode()
    • Red-black BSTs: java.util.TreeMapjava.util.TreeSet
©著作权归作者所有,转载或内容合作请联系作者
平台声明:文章内容(如有图片或视频亦包括在内)由作者上传并发布,文章内容仅代表作者本人观点,简书系信息发布平台,仅提供信息存储服务。

推荐阅读更多精彩内容