Cache

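What is a cache

A cache is a buffer for data exchange: a temporary place to store frequently used data. There are CPU caches, file system caches, application-level caches, and more. This article is about application-level caching: using code logic and a caching strategy to cache resources such as data, pages, and images. Depending on the situation, the data can live in the file system or in memory, which reduces database queries and read/write bottlenecks and improves response times.
At its core, caching trades space for time and gives up some data freshness: data held in server memory temporarily stands in for the latest data in the database. That reduces database I/O, eases server load, cuts network latency, and speeds up page loads.

The sections below cover some common cache design patterns (structural patterns) and data models (data structures).

Cache design patterns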

Cache-Aside

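This is the most commonly used pattern. The application checks the cache first and returns the value directly on a hit; on a miss it queries the database, stores the result in the cache, and returns it.
Drawback: the cache and the database can become inconsistent, so cache entries usually need an expiration time.
It suits data that changes infrequently, or scenarios without strict freshness requirements.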
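
The flow above can be sketched in a few lines of Ruby. This is only a minimal illustration: the class name `CacheAsideReader`, the `ttl:` parameter, and the use of a plain Hash as the "database" are assumptions made for the example, not part of any particular library.

```ruby
# Cache-aside sketch. A Hash stands in for the database (an assumption
# made so the example runs on its own).
class CacheAsideReader
  Entry = Struct.new(:value, :expires_at)

  def initialize(db, ttl: 60)
    @db = db      # backing store; anything that responds to #[] works
    @ttl = ttl    # seconds a cached entry stays valid
    @cache = {}
  end

  # `now` is injectable so expiration can be demonstrated without sleeping.
  def fetch(key, now: Time.now)
    entry = @cache[key]
    return entry.value if entry && entry.expires_at > now # cache hit

    value = @db[key]                                      # miss: read the database
    @cache[key] = Entry.new(value, now + @ttl)            # populate the cache
    value
  end
end

db = { "user:1" => "Alice" }
reader = CacheAsideReader.new(db, ttl: 60)
t0 = Time.at(0)
first = reader.fetch("user:1", now: t0)       # miss, loaded from the database
db["user:1"] = "Bob"                          # the database changes behind the cache
stale = reader.fetch("user:1", now: t0 + 30)  # still served from the cache
fresh = reader.fetch("user:1", now: t0 + 61)  # entry expired, reloaded
```

Here `stale` still holds the old value while `fresh` picks up the change, which is the inconsistency window that the expiration time bounds.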

Read-Through Cache

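Similar to cache-aside, except that the cache is a standalone layer and the application never talks to the database directly; on a miss, the cache itself loads the data from the database. It is usually combined with write-through, described below.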
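
As a rough sketch of that idea, the cache below owns a loader block, and the caller only ever calls `get`. The names `ReadThroughCache` and `loads` are hypothetical, and a Hash again plays the database.

```ruby
# Read-through sketch: the application asks the cache, and only the cache
# knows how to reach the database (through the loader block).
class ReadThroughCache
  attr_reader :loads # number of times the loader was invoked

  def initialize(&loader)
    @loader = loader
    @store = {}
    @loads = 0
  end

  def get(key)
    # Hash#fetch runs the block only when the key is missing.
    @store.fetch(key) do
      @loads += 1
      @store[key] = @loader.call(key)
    end
  end
end

db = { 1 => "one" }
cache = ReadThroughCache.new { |key| db[key] }
a = cache.get(1) # loaded through the cache
b = cache.get(1) # served from the cache, loader not called again
```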

Write-Through Cache

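When data is modified, it is written to the cache first, and the cache then updates the database. Combined with read-through, this addresses the inconsistency problem. The catch is that every data change has to go through the cache, which is awkward when data can be modified through several paths, and when data changes often the database load is still considerable. It fits scenarios where data changes infrequently but consistency requirements are high.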
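
A minimal sketch of write-through paired with read-through might look like this. `WriteThroughCache` is a hypothetical name, and the Hash passed in stands in for the database.

```ruby
# Write-through sketch: every write goes to the cache, and the cache
# updates the database before the call returns.
class WriteThroughCache
  def initialize(db)
    @db = db
    @store = {}
  end

  # Read-through on a miss.
  def get(key)
    @store.fetch(key) { @store[key] = @db[key] }
  end

  def set(key, value)
    @store[key] = value # 1. update the cache
    @db[key] = value    # 2. the cache pushes the change to the database
    value
  end
end

db = {}
cache = WriteThroughCache.new(db)
cache.set("price", 42)
```

Because each `set` reaches the database, the database sees exactly as many writes as the application issues, which is the load concern mentioned above.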

Write-Around

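Unlike write-through, here the application itself writes the data to the database. It is used together with cache-aside: updates are also initiated by the application, but the database is written first and the cache is updated afterwards. The order cannot be cache first, database second as in write-through: once an update arrives, another process or thread that misses the cache before the write completes would read the old value from the database and might overwrite the new data with it.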
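
The ordering described above can be sketched as follows. `WriteAroundStore` is a hypothetical name and the Hash is a stand-in database; the sketch is single-threaded and only shows the order of the two steps.

```ruby
# Write-around sketch as described in the text: the application writes
# the database first, then refreshes the cache. Reads are cache-aside.
class WriteAroundStore
  def initialize(db)
    @db = db
    @cache = {}
  end

  def read(key)
    return @cache[key] if @cache.key?(key) # hit
    @cache[key] = @db[key]                 # miss: load and remember
  end

  def write(key, value)
    @db[key] = value    # 1. database first
    @cache[key] = value # 2. then the cache
    value
  end
end

db = { "a" => 1 }
store = WriteAroundStore.new(db)
store.read("a")     # warms the cache
store.write("a", 2) # database and cache now agree
```

A common variant deletes the cache entry in step 2 instead of updating it, so that the next read repopulates it from the database.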

Write-Back

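An improvement on write-through: repeated updates go only to the cache and are flushed to the database after a set interval or a set number of writes. This lowers database load in write-heavy scenarios. Drawback: if the cache crashes, data that has not yet been persisted is lost.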
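
A sketch of the "flush after a set number of writes" variant is shown below. The class name `WriteBackCache` and the `flush_every:` parameter are assumptions for illustration; a time-based flush would follow the same shape.

```ruby
# Write-back sketch: writes land in the cache only and are flushed to the
# database after every `flush_every` writes.
class WriteBackCache
  def initialize(db, flush_every: 3)
    @db = db
    @flush_every = flush_every
    @store = {}
    @dirty = {} # keys changed since the last flush
    @writes = 0
  end

  def get(key)
    @store.fetch(key) { @store[key] = @db[key] }
  end

  def set(key, value)
    @store[key] = value
    @dirty[key] = true
    @writes += 1
    flush if @writes >= @flush_every
    value
  end

  # Persist only the latest value of each dirty key.
  def flush
    @dirty.each_key { |key| @db[key] = @store[key] }
    @dirty.clear
    @writes = 0
  end
end

db = {}
cache = WriteBackCache.new(db, flush_every: 3)
cache.set("hits", 1)
cache.set("hits", 2)
pending = db["hits"] # nothing persisted yet
cache.set("hits", 3) # third write triggers the flush
```

Three writes to the same key cost a single database write here; anything still in `@dirty` when the process dies is the data that would be lost.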

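Choosing a cache

Cache data structures

Hash table

Get or set a value by key; average time complexity is O(1).
Redis commands that start with h (HGET, HSET, and so on) are essentially hash table operations.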

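Collections (arrays/queues)

Get or set a value by index. Lookup by index is O(1); lookup by value is O(n).
A Ruby Array combines an array and a queue: it supports index access as well as push, pop, and similar operations.
A Redis list is a linked list, so lookup by index or by value is O(n), while pushing and popping at either end is O(1).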

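Queues are used to cache consumable resources, such as message queues. Their defining trait is that each read takes one value, and not a specific one.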
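
To make the array/queue duality concrete, here is a short Ruby illustration using only core Array methods; the job names are made up.

```ruby
queue = []
queue.push("job-1")              # enqueue at the tail
queue.push("job-2")
queue.push("job-3")

by_index = queue[1]              # index access, O(1)
position = queue.index("job-3")  # search by value, O(n)

next_job = queue.shift           # consume from the head: whatever is next
last_job = queue.pop             # or take from the tail, stack-style
```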

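Sorted collections (binary search tree/skip list)

The most common binary search tree is the red-black tree, which strikes a balance between the degeneration of a plain binary search tree and the maintenance cost of a strictly balanced binary tree.
A skip list adds multiple levels of index on top of a linked list, which brings lookup time complexity down to O(log n) on average.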
Skip list compared with a red-black tree (the comparison figure is not reproduced here):
https://github.com/factoidforrest/dynamic-skiplist
Redis sorted sets (zset) use a skip list: ZADD attaches a score to each member, and lookups can then be made by score (ZRANGEBYSCORE).

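It is certainly slower than a hash, but it provides ordering and range queries. Typical use cases include scheduled tasks and leaderboards.
In our own business, I think allocation-type features could use a separate skip list per dimension. For example, when assigning companies, employees are sorted by their current number of companies, number of key companies, and so on before the assignment is made. That data can be cached, with scores updated whenever the data changes. At assignment time, we take the intersection of the entries that match each condition, which saves recomputing everything on every assignment.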
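
To show what "a linked list with multiple index levels" looks like in practice, here is a small skip list sketch with score-based range queries, loosely in the spirit of a zset. It is not how Redis implements sorted sets; the class, `MAX_LEVEL`, the coin-flip probability, and the employee names are all assumptions for illustration, and removal and score updates are left out.

```ruby
class SkipList
  MAX_LEVEL = 8
  Node = Struct.new(:score, :member, :forward)

  def initialize
    @head = Node.new(nil, nil, Array.new(MAX_LEVEL)) # sentinel, holds no data
    @level = 1                                       # levels currently in use
  end

  def insert(score, member)
    update = Array.new(MAX_LEVEL, @head) # last node before the slot, per level
    node = @head
    (@level - 1).downto(0) do |i|
      node = node.forward[i] while node.forward[i] && node.forward[i].score < score
      update[i] = node
    end
    level = random_level
    @level = level if level > @level
    fresh = Node.new(score, member, Array.new(level))
    level.times do |i| # splice the new node into each of its levels
      fresh.forward[i] = update[i].forward[i]
      update[i].forward[i] = fresh
    end
    self
  end

  # Members with min <= score <= max, in ascending score order.
  def range_by_score(min, max)
    node = @head
    (@level - 1).downto(0) do |i| # descend the index levels toward `min`
      node = node.forward[i] while node.forward[i] && node.forward[i].score < min
    end
    node = node.forward[0]
    result = []
    while node && node.score <= max # walk the bottom level
      result << node.member
      node = node.forward[0]
    end
    result
  end

  private

  # Each extra level is kept with probability 1/2.
  def random_level
    level = 1
    level += 1 while level < MAX_LEVEL && rand < 0.5
    level
  end
end

# Employees scored by how many companies they currently hold.
by_company_count = SkipList.new
by_company_count.insert(5, "alice").insert(2, "bob").insert(9, "carol").insert(3, "dave")
light_load = by_company_count.range_by_score(0, 3) # => ["bob", "dave"]
```

One list per dimension (company count, key-company count, and so on) would each answer a range query like this, and the results could then be intersected.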

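Cache eviction policies

First In, First Out / FIFO

In Ruby a plain Hash is enough: a Hash preserves insertion order, so shift removes the entry that was set first.
Drawback: it is crude. An entry that was cached early is still evicted even if it has since been used more often and more recently than the others.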
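
The Hash-based approach can be sketched like this. `FifoCache` is a hypothetical name; the only Ruby behavior relied on is that Hash keeps insertion order and `Hash#shift` removes the oldest pair.

```ruby
class FifoCache
  def initialize(capacity)
    @capacity = capacity
    @store = {}
  end

  def get(key)
    @store[key] # reads do not affect eviction order
  end

  def put(key, value)
    # Overwriting an existing key keeps its original position in the Hash.
    @store.shift if !@store.key?(key) && @store.size >= @capacity
    @store[key] = value
  end

  def keys
    @store.keys
  end
end

cache = FifoCache.new(2)
cache.put(:a, 1)
cache.put(:b, 2)
cache.get(:a)     # using :a does not protect it
cache.put(:c, 3)  # evicts :a, the oldest entry
```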

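Least Recently Used / LRU

LRU evicts data based on its access history. The core idea is: "if data has been accessed recently, it is more likely to be accessed again in the future."
On every get, the accessed entry moves to the tail of the queue; when the cache is full, the entry at the head of the queue is removed.
Drawback: it ignores how often an entry is hit.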
https://github.com/SamSaffron/lru_redux
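
For comparison with FIFO, here is a minimal LRU sketch built on the same insertion-ordered Hash. It is an illustration only and is not taken from the lru_redux gem linked above; `LruCache` is a hypothetical name.

```ruby
class LruCache
  def initialize(capacity)
    @capacity = capacity
    @store = {} # first key = least recently used, last key = most recent
  end

  def get(key)
    return nil unless @store.key?(key)
    value = @store.delete(key)
    @store[key] = value # re-insert so the key moves to the tail
  end

  def put(key, value)
    @store.delete(key)                        # forget the old position
    @store.shift if @store.size >= @capacity  # evict the head if full
    @store[key] = value
  end

  def keys
    @store.keys
  end
end

cache = LruCache.new(2)
cache.put(:a, 1)
cache.put(:b, 2)
cache.get(:a)     # :a becomes the most recently used
cache.put(:c, 3)  # evicts :b instead of :a
```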

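Least Frequently Used / LFU

Entries with the lowest access count are evicted first. When several entries share the lowest count, they are evicted in LRU order.
Drawback: hit counts have to be recorded and computed, which costs extra memory and CPU.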

# Observation
#
# LFU: least frequently used, so each key's access frequency must be tracked.
#
# get(key)    -> key exists:  freq += 1, return the value
#             -> key missing: return -1
# put(key, v) -> key exists:  update the node's value, freq += 1
#             -> key missing: store it, evicting first if the cache is full
#
# Two maps are maintained:
# 1. frequency map: frequency => doubly linked list of nodes in LRU order
# 2. node map:      key => node
#
# get -> find the node -> remove it from its current frequency list
#     -> add it to the head of the list for the new frequency
# put -> (found) update the node, then same as get
#     -> (new)   if full, call remove_from_tail on freq_map[min_freq];
#                then create the node with frequency 1 and add it to freq_map[1]
# {
#   1 => h -> 3 -> 2 -> t
#   2 => h -> t
# }

# Example sequence of operations and their arguments:
# ["LFUCache","put","put","get","put","get","get","put","get","get","get"]
# [[2],       [1,1],[2,2],[1],[3,3],[2],[3],[4,4],[1],[3],[4]]

Node = Struct.new(:key, :val, :usage, :next, :prev)
  
class DLinkedList
  def initialize
    @head = Node.new
    @tail = Node.new
    @head.next = @tail
    @tail.prev = @head
  end
  
  def empty?
    @head.next == @tail
  end
  
  def remove(node)
    node.prev.next = node.next
    node.next.prev = node.prev
    node.next = nil
    node.prev = nil
    node
  end
  
  def add_to_top(node)
    tmp = @head.next
    @head.next = node
    node.next = tmp
    tmp.prev = node
    node.prev = @head
  end
  
  def remove_from_tail
    raise "list is empty" if empty?
    remove(@tail.prev)
  end
end

class LFUCache
  def initialize(size)
    @size = size
    @memo = {} # key value map
    @freq_map = Hash.new { |h,k| h[k] = DLinkedList.new }
    @min_freq = nil
  end
  
  def get(key)
    return -1 if @size == 0 || @memo[key].nil?
    
    node = @memo[key]
    @freq_map[node.usage].remove(node)
    
    # if this has been recorded as min, needs to increment the min_freq lookup counter
    @min_freq += 1 if @freq_map[node.usage].empty? && @min_freq == node.usage
    
    node.usage += 1
    @freq_map[node.usage].add_to_top(node)
    node.val
  end
  
  def put(key, value)
    return if @size == 0
    
    if (node = @memo[key])
      node.val = value
      get(key) # reuse get to bump the node's frequency
    else
      evict if @memo.size == @size
      
      new_node = Node.new(key, value, 1)
      @freq_map[1].add_to_top(new_node)
      @min_freq = 1
      @memo[key] = new_node
    end
  end
  
  private
  
  def evict
    deleted = @freq_map[@min_freq].remove_from_tail
    @memo.delete(deleted.key)
  end
end

Reference: https://cloud.tencent.com/developer/article/2077083
