Hash
Theorem
HashMap,HashSet,HashTable,
HashFunction 31
Open & close
D: LinkedList
D: null vs available
Rehashing
D: negative number process
Application
LRU
D: HashMap and Node(LinkedList)
Heap
PriorityQueue
TreeMap
0 Outline
这一章的脉络是什么?
介绍Queue和Stack的用处,引申到Hash,然后是介绍hash的基本性质,相关题目如LRU Cache,然后介绍Heap和相关性质,了解TreeMap和PriorityQueue和TreeSet
1 Data structure
Queue, stack, Hash是什么,有哪些操作,复杂度是?
Queue: BFS Push Pop Top都是O(1)
Stack: DFS with no recursion Push Pop Top都是O(1)
Hash: Insert Find Delete都是O(1)
2 Hash
Theorem
什么是Hash?Hash table/hash map/hash set/cocurrent hash map
java这些数据结构的区别?
https://blog.csdn.net/spicehava1/article/details/80278915
https://blog.csdn.net/weixin_34383618/article/details/85993919
Hashset no key
Hash table: thread safe but slow versus. Hash map
cocurrent hashmap支持线程安全,且比hashtable快-
Hash function: magic number 31 and MD5 SHA-1 SHA-2
Sum = 0
Open Hashing vs Closed Hashing
如果键值冲突了怎么处理?
用链表进行处理,
OH: linkedlist
CH: when remove an element, should label available not null
https://blog.csdn.net/yue_hu/article/details/80661438Rehashing
什么时候Rehash比较好,怎么处理比较好?
饱和度When the size of keys is larger than 1/10 hash table capacity, need to rehash
Rehashing
http://www.lintcode.com/problem/rehashing/ http://www.jiuzhang.com/solutions/rehashing/
Iterate the table, add a while loop inside, use new or dummy to new
D: newIndex = (hashTable[i] % newCap + newCap) % newCap
https://stackoverflow.com/questions/90238/whats-the-syntax-for-mod-in-java
Performing the same operation with the "%" or rem operator maintains the sign of the X value. If X is negative you get a result in the range (-Y, 0]. If X is positive you get a result in the range [0, Y).
长度乘以2,遍历原来的表,添加到新的表上面去一致性hash是什么?
https://juejin.im/post/5ae1476ef265da0b8d419ef2
分布在0-2^32的圆上,节点通过hash分布在圆上?节点的hash值,所有键hash完了顺时针找对应存储的机器,如果添加节点,只要改新增节点和上个节点的数据,如果减少节点,只要改删除节点和上个节点的数据;如果节点过少,就增加多个虚拟节点,尽可能保证数据分布均匀
Application
LRU cache是什么?
可以实现两个操作,获取键值和设置键值,当获取不到的时候,返回-1,设置键值如果cache满了,要把最不常用的那个键值删了
什么是python defaultdict和orderedDict?
https://docs.python.org/zh-cn/3/library/collections.html
defaultdict提供一个默认值得dict,OrderedDict创建记住键值 最后 插入顺序的有序字典变体很简单。 如果新条目覆盖现有条目,则原始插入位置将更改并移至末尾
Python OrderDict()怎么简单实现?
使用move_to_end和popitem,根据last参数来决定先进先出
https://docs.python.org/zh-cn/3/library/functools.html#functools.lru_cache
LRU Cache
https://leetcode.com/problems/lru-cache/
https://leetcode.com/problemset/all/?search=LRU%20Cache
https://www.jiuzhang.com/solutions/?search=LRU%20Cache
Python使用OrderedDict
https://leetcode.com/problems/lru-cache/discuss/45952/Python-concise-solution-with-comments-(Using-OrderedDict).
https://leetcode.com/problems/lru-cache/discuss/45964/Very-short-solution-using-Python's-OrderedDict
pop(key)会弹出key,popitem(last=False)会FIFO,读的时候有的话就先pop再设置值;写的时候,如果不是增加,就跟读类似的操作,如果是增加要看有没有超过容量,超过了再加Python使用双向链表
https://leetcode.com/problems/lru-cache/discuss/45926/Python-Dict-%2B-Double-LinkedList
就是用一个双向链表存顺序,dict来存内容-
Use LinkedList + Hashmap
- Doubly LinkedList
Store current node in value in hash map
Current.prev.next = current.next
Current.next.prev = current. prev - //Singly LinkedList
Store prev node in value in hashmap
Then do prev.next = prev.next.next then can remove the current
- Doubly LinkedList
Related questitons
Subarray sum
https://leetcode.com/problemset/all/?search=Subarray%20Sum
https://www.jiuzhang.com/solutions/?search=Subarray%20Sum
Copy list with random pointer
https://leetcode.com/problemset/all/?search=Copy%20List%20with%20Random%20Pointer
https://www.jiuzhang.com/solutions/?search=Copy%20List%20with%20Random%20Pointer
Anagrams
https://leetcode.com/problemset/all/?search=Anagrams
https://www.jiuzhang.com/solutions/?search=Anagrams
Longest consecutive sequence
https://leetcode.com/problemset/all/?search=Longest%20Consecutive%20Sequence
https://www.jiuzhang.com/solutions/?search=Longest%20Consecutive%20Sequence
3 Heap
Heap是什么?
支持操作O(log N) Add / O(log N) Remove / O(1) 查询Min or Max,本质是二叉树,
具体在各个语言的实现?
Java: PriorityQueue
C++: priority_queue
Python: heapq 父节点的只小于或等于子节点,具体使用其实是用列表实现的https://blog.csdn.net/jamfiy/article/details/88185512
Priority Queue是什么?
Priority Queue和Heap有什么区别吗?如果是JAVA自带的Priority Queue的话,答案是有的(C++的Priority Queue可能也类似,具体要确认一下)。
具体的区别是在remove操作中,Priority Queue的时间复杂度是O(n),而Heap是O(logn)。因为Priority Queue需要找到这个数据,需要O(n)的时间,而Heap借助了HashMap,所以只需要O(1)的时间就可以找到。那为什么Priority Queue不用HashMap呢?这个就是JAVA提供的函数里面没有加上这一块而已,如果自己编程就也可以是O(logn)。
https://blog.csdn.net/roufoo/article/details/80638476
构建一个heap的复杂度,和遍历一个heap的复杂度?
构建堆,自下而上,复杂度是O(n)
遍历是O(nlogn)
Python heapq有哪些操作?
heappush(heap, item) heappop(heap) heappushpop(heap, item) heapify(x)
heapreplace(heap, item)
PriorityQueue vs Heap
UglyNumber II
https://leetcode.com/problemset/all/?search=Ugly%20Number
https://www.jiuzhang.com/solutions/?search=Ugly%20Number
简单来说就是来一个导入一个,来一个导入一个,每次取个最小的,通过这个最小的再加入新的
S: p2, p3, p5, uglys
D: Use p2, p3, p5 on same queue O(n)
怎样得到最后一个数,应该是把所有之前的数乘以2,或3或5,找出最小
HashMap + heap(poll min and add 23*5)
Top K largest Number II
https://www.jiuzhang.com/solutions/?search=Top%20k%20Largest%20Number
简单来说就是来一个存一个,如果过量了就heappop出来
Use priorityQueue: add();peek();poll(); clear();
Queue is interface but stack is class
Queue is implemented by PriorityQueue andLinkedList
Merge K sorted Lists
https://leetcode.com/problemset/all/?search=Merge%20K%20Sorted%20Lists
https://www.jiuzhang.com/solutions/?search=Merge%20K%20Sorted%20Lists
Heap & PriorityQueue
Divide and conquer
Merge two by two
TreeMap
https://docs.oracle.com/javase/7/docs/api/java/util/TreeMap.html
https://leetcode.com/problems/merge-k-sorted-lists/discuss/10513/108ms-python-solution-with-heapq-and-avoid-changing-heap-size
TreeMap是什么?
与HashMap相比,TreeMap是一个能比较元素大小的Map集合,会对传入的key进行了大小排序。其中,可以使用元素的自然顺序,也可以使用集合中自定义的比较器来进行排序;基于红黑树实现
https://www.jianshu.com/p/2dcff3634326
Java document: Key value pairs(ascending order by key)
Get();set(key,value); ceilingEntry,ceilingKey
Related questions
The Skyline Problem
https://leetcode.com/problemset/all/?search=The%20Skyline%20Problem
https://www.jiuzhang.com/solutions/?search=building%20outline
Top K Frequent Words
https://leetcode.com/problemset/all/?search=Top%20K%20Frequent%20Words
https://www.jiuzhang.com/solutions/?search=Top%20K%20Frequent%20Words
用heapq实现,先把每一个的第一个载入,每载入一个取出下一个,思路还是比较多路的最小数
Python Queue模块里面的PriorityQueue是什么
A typical pattern for entries is a tuple in the form: (priority_number, data).
from Queue import PriorityQueue
class Solution(object):
def mergeKLists(self, lists):
"""
:type lists: List[ListNode]
:rtype: ListNode
"""
head = point = ListNode(0)
q = PriorityQueue()
for l in lists:
if l:
q.put((l.val, l))
while not q.empty():
val, node = q.get()
point.next = node
point = point.next
node = node.next
if node:
q.put((node.val, node))
return head.next