Part 2: Understanding OkHttp in Depth: Cache Handling with CacheInterceptor

1. Preface

【1.1】Other articles in the OkHttp series:

【1.2】Overview

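OkHttp provides a caching mechanism for network requests. When we traced the request flow in the previous article, we saw that every Request passes through CacheInterceptor.intercept(). The complete cache handling, however, involves more than this one method of the cache interceptor. It also involves: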

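The HTTP caching mechanism, which corresponds to the CacheStrategy class.
LruCache/DiskLruCache: efficient create, read, update and delete operations on cached entries.
okio: performs the I/O.
Before analyzing the CacheInterceptor source, we need to understand the HTTP caching mechanism in order to see why the CacheStrategy class exists.
This article covers only the HTTP caching mechanism and how OkHttp handles it. LruCache and okio will get their own articles later.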

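2. The HTTP Caching Mechanism

【2.1】Why caching is needed

Consider a scenario: a user opens the same page many times in one day, and the content of that page is fairly stable and does not change on every visit. Do we need to download the resources from the server each time? No. This is where caching comes in. That scenario shows only one benefit of caching. Used sensibly, caching also:

Improves the user experience: it avoids showing a blank page by providing default data to display.
Avoids unnecessary requests to the server and reduces bandwidth usage.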

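【2.2】Cache types: forced caching

【2.2.1】Introduction: When a client requests a resource, caching can be divided into two types according to whether a new request is sent to the server: forced caching and negotiated caching. Each has its own strengths and weaknesses. With forced caching, if a cached copy exists and has not expired, it is returned directly with HTTP status code 200; otherwise the client requests the server. A simplified request flow is shown below: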

[Figure: forced-caching request flow]

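【2.2.2】Pros and cons:

Pros: fast loading and good performance.
Cons: the server is not contacted until the cache expires. If the server updates the resource in the meantime, the client does not get the latest response.

【2.2.3】Related headers

Pragma: no-cache. Disables caching. It is an HTTP/1.0 header, deprecated in HTTP/1.1 in favor of Cache-Control.
Expires: a GMT time. It gives the time until which the cached copy is valid and works with both HTTP/1.0 and HTTP/1.1. Because this time is set by the server, it is unreliable when the server and client clocks disagree.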
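To make the clock problem with Expires concrete, here is a minimal sketch. It is not OkHttp code: the class ExpiresCheck and the method isFresh are hypothetical names used only for illustration, and the dates are made-up sample values. It treats a cached copy as fresh while the current time is still before the Expires time.

```java
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;

public class ExpiresCheck {
    // Hypothetical helper (not part of OkHttp): a cached copy counts as fresh
    // while the given "now" is still before the Expires time.
    static boolean isFresh(String expiresHeader, String nowHttpDate) {
        DateTimeFormatter httpDate = DateTimeFormatter.RFC_1123_DATE_TIME;
        ZonedDateTime expires = ZonedDateTime.parse(expiresHeader, httpDate);
        ZonedDateTime now = ZonedDateTime.parse(nowHttpDate, httpDate);
        return now.isBefore(expires);
    }

    public static void main(String[] args) {
        String expires = "Wed, 21 Oct 2015 07:28:00 GMT";
        // Client clock in sync with the server: 07:00 is before 07:28.
        System.out.println(isFresh(expires, "Wed, 21 Oct 2015 07:00:00 GMT")); // true
        // Client clock running one hour fast: the same copy already looks expired.
        System.out.println(isFresh(expires, "Wed, 21 Oct 2015 08:00:00 GMT")); // false
    }
}
```

The same cached copy is judged differently depending only on the client's clock, which is the weakness of an absolute expiry time.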

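【2.3】Cache types: negotiated caching

【2.3.1】Introduction: With negotiated caching, when a cached copy exists, the client first sends a request to the server carrying the cache validators. The server compares them with the current resource. If no new resource needs to be sent, it returns status code 304, telling the client that the cache is usable; otherwise it returns the new resource together with its new validators, with status code 200. A simplified request flow is shown below: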

[Figure: negotiated-caching request flow]

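【2.3.2】Pros and cons:

Pros: less data is transferred from the server, and data is updated promptly.
Cons: every use requires one request to the server to check whether the resource is up to date.

【2.3.3】Related headers

Last-Modified/If-Modified-Since: The response headers of the first response from the server carry Last-Modified: xxx. On the next request, the client sends this time back as the value of If-Modified-Since. The server compares it with the resource's current modification time. If they are the same, it returns 304, telling the client to keep using the cache. If they differ, the server returns the new resource with new Expires and Last-Modified values. Drawback: Last-Modified is only precise to the second. If a file changes more than once within one second, its Last-Modified value stays the same, so the client can miss the newer resource.
ETag/If-None-Match: This pair was introduced to address the drawback of Last-Modified described above. The server computes a value from the resource with some algorithm, MD5 for example, and returns it to the client in ETag. On the next request the client sends that value in If-None-Match, and the server compares it. If the values match, the server returns 304 to tell the client the cache can be used. If the resource needs updating, the status code is 200 and the whole new resource is returned. This pair takes priority over Last-Modified/If-Modified-Since.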
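The server-side comparison described above can be sketched as follows. This is an illustration under simplifying assumptions, not the behavior of any particular server: ConditionalCheck and statusFor are hypothetical names, modification times are plain second counts, and the entity tag is compared as an exact string.

```java
public class ConditionalCheck {
    // Hypothetical server-side helper: returns 304 if the client's validators
    // still match the current resource, otherwise 200.
    // The entity-tag comparison takes priority when If-None-Match is present.
    static int statusFor(String currentEtag, long lastModifiedSeconds,
                         String ifNoneMatch, Long ifModifiedSinceSeconds) {
        if (ifNoneMatch != null) {
            return ifNoneMatch.equals(currentEtag) ? 304 : 200;
        }
        if (ifModifiedSinceSeconds != null) {
            return lastModifiedSeconds <= ifModifiedSinceSeconds ? 304 : 200;
        }
        return 200; // unconditional request: always send the full resource
    }

    public static void main(String[] args) {
        // Entity tag unchanged: the cached copy can be reused.
        System.out.println(statusFor("\"v2\"", 1000L, "\"v2\"", null)); // 304
        // Resource changed twice within the same second: the tag differs,
        // so the change is detected even though the timestamp is identical.
        System.out.println(statusFor("\"v3\"", 1000L, "\"v2\"", 1000L)); // 200
        // Same situation with only the timestamp: the change is missed.
        System.out.println(statusFor("\"v3\"", 1000L, null, 1000L)); // 304
    }
}
```

The last two calls show why the entity tag is the more reliable validator when a resource can change within one second.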

With these HTTP caching basics and the Cache-Control directives below in hand, we can follow the OkHttp code to see how it handles caching.

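【2.4】Cache control: Cache-Control

【2.4.1】When it appears in a request header

Directive	Meaning
no-cache	Do not use the cached copy directly; send the request to the server.
no-store	Do not store anything in the cache.
max-age=xxx	The client accepts a response whose age does not exceed xxx seconds.
max-stale=xxx	The client accepts a response that has been expired for up to xxx seconds. If xxx is omitted, any amount of staleness is accepted.
min-fresh=xxx	The client wants a response that will remain fresh for at least another xxx seconds.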

【2.4.2】When it appears in a response header

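Directive	Meaning
no-cache	Do not use the cached copy directly; it must first be validated with the server.
no-store	The server tells the client not to cache the response.
no-transform	The data must not be modified when the response is cached or forwarded.
only-if-cached	Strictly a request directive: make no network request and use only the cache. On a cache miss the result is a 504 status code.
max-age=xxx	The response is valid for xxx seconds, during which no request to the server is needed.
public	The response may be cached by any cache.
private="xxx"	The response is intended for a single user and must not be stored by shared caches. If xxx is given, the restriction applies only to the named header fields; otherwise it applies to the whole response.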
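As a rough illustration of how these directives are read from a header value, here is a minimal parsing sketch. It is not OkHttp's CacheControl class. CacheControlSketch, parse and maxAgeSeconds are hypothetical names, and the sketch assumes a simple comma split, so it does not handle quoted values that themselves contain commas.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

public class CacheControlSketch {
    // Splits a Cache-Control value such as "no-cache, max-age=60" into
    // lower-cased directive names mapped to their values ("" when absent).
    static Map<String, String> parse(String headerValue) {
        Map<String, String> directives = new LinkedHashMap<>();
        for (String part : headerValue.split(",")) {
            String token = part.trim();
            if (token.isEmpty()) continue;
            int eq = token.indexOf('=');
            String name = (eq == -1 ? token : token.substring(0, eq))
                    .trim().toLowerCase(Locale.ROOT);
            String value = eq == -1 ? "" : token.substring(eq + 1).trim();
            // Drop surrounding double quotes, as in private="set-cookie".
            if (value.length() >= 2 && value.startsWith("\"") && value.endsWith("\"")) {
                value = value.substring(1, value.length() - 1);
            }
            directives.put(name, value);
        }
        return directives;
    }

    // Returns max-age in seconds, or -1 when it is missing or not a number.
    static long maxAgeSeconds(String headerValue) {
        String value = parse(headerValue).get("max-age");
        if (value == null) return -1;
        try {
            return Long.parseLong(value);
        } catch (NumberFormatException e) {
            return -1;
        }
    }

    public static void main(String[] args) {
        System.out.println(parse("no-cache, Max-Age=60"));
        System.out.println(maxAgeSeconds("public, max-age=3600"));
    }
}
```

Returning -1 for a missing value mirrors the convention visible in the OkHttp code later in this article, where maxAgeSeconds() is compared against -1.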

3. Cache Handling in OkHttp

【3.1】CacheInterceptor: where caching begins

CacheInterceptor.java
@Override public Response intercept(Chain chain) throws IOException {
    //1. Look up the cached Response for this Request in the LRU cache
    Response cacheCandidate = cache != null
        ? cache.get(chain.request())
        : null;

    long now = System.currentTimeMillis();

    //2. The cache strategy class decides from the request and the cached response whether the cache should be used.
    //new CacheStrategy.Factory(): see 【3.2】
    //CacheStrategy.Factory.get(): see 【3.3】
    CacheStrategy strategy = new CacheStrategy.Factory(now, chain.request(), cacheCandidate).get();
    //The products of the cache strategy; these two objects mainly determine whether the cache is used.
    Request networkRequest = strategy.networkRequest;
    Response cacheResponse = strategy.cacheResponse;

    //3. Let the cache track the result chosen by the cache strategy.
    if (cache != null) {
      cache.trackResponse(strategy);
    }

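    //4. The cache holds a response, but the strategy returned no cache response:
    //the cached response is not applicable, so close it to avoid leaking resources. Later steps also close unused Responses; this is not repeated each time.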
    if (cacheCandidate != null && cacheResponse == null) {
      closeQuietly(cacheCandidate.body()); 
    }

    // 5. If the network is forbidden (networkRequest is null) and there is no cached response either, return a 504 response directly.
    if (networkRequest == null && cacheResponse == null) {
      return new Response.Builder()
          .request(chain.request())
          .protocol(Protocol.HTTP_1_1)
          .code(504)
          .message("Unsatisfiable Request (only-if-cached)")
          .body(Util.EMPTY_RESPONSE)
          .sentRequestAtMillis(-1L)
          .receivedResponseAtMillis(System.currentTimeMillis())
          .build();
    }

    // 6. If the network is not needed and the cached response is valid, return the cached response.
    if (networkRequest == null) {
      return cacheResponse.newBuilder()
          .cacheResponse(stripBody(cacheResponse))
          .build();
    }

    //7. Reaching here means a real network request is needed, so proceed to the next interceptor.
    Response networkResponse = null;
    try {
      networkResponse = chain.proceed(networkRequest);
    } finally {
      ...
    }

    // 8. If the cached Response is not null, choose between it and the Response returned by the network.
    if (cacheResponse != null) {
      //When the server says the data has not changed, the client uses the cached Response,
      //but updates it with the latest headers and timestamps.
      if (networkResponse.code() == HTTP_NOT_MODIFIED) {
        Response response = cacheResponse.newBuilder()
            .headers(combine(cacheResponse.headers(), networkResponse.headers()))
            .sentRequestAtMillis(networkResponse.sentRequestAtMillis())
            .receivedResponseAtMillis(networkResponse.receivedResponseAtMillis())
            .cacheResponse(stripBody(cacheResponse))
            .networkResponse(stripBody(networkResponse))
            .build();
            ...
        return response;
      } else {
        closeQuietly(cacheResponse.body());
      }
    }

    //9. At this point the network Response is the one to use.
    Response response = networkResponse.newBuilder()
        .cacheResponse(stripBody(cacheResponse))
        .networkResponse(stripBody(networkResponse))
        .build();

    //10. Cache the new Response
    if (cache != null) {
      if (HttpHeaders.hasBody(response) && CacheStrategy.isCacheable(response, networkRequest)) {
        // Offer this request to the cache.
        CacheRequest cacheRequest = cache.put(response);
        return cacheWritingResponse(cacheRequest, response);
      }

      if (HttpMethod.invalidatesCache(networkRequest.method())) {
        try {
          cache.remove(networkRequest);
        } catch (IOException ignored) {
          // The cache cannot be written.
        }
      }
    }
    //Return the latest Response.
    return response;
  }

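Summary: In short, CacheInterceptor uses the Request and Response chosen by the cache strategy to decide whether to use the cache and how to update it. In detail, it does the following:

1. Tries to fetch the cached Response.
2. Passes the Request and the cached Response to CacheStrategy and gets back the networkRequest to send and the cacheResponse to use.
3. Has the cache track these two results.
4. If the network is forbidden and there is no cache, returns a 504 response directly.
5. If the network is forbidden and there is a cache, returns the cached response.
6. Otherwise the cache alone is not enough, so it makes the network request and obtains the latest networkResponse.
7. If a local cached response exists, checks whether networkResponse says no update is needed (304). If so, it merges some response headers and other data from networkResponse into cacheResponse and returns the cached response, recording the hit in the cache accordingly.
8. If step 7 did not return, networkResponse is the result. It is written to the cache if it is cacheable and then returned.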
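The header update on a 304 response can be pictured with the following sketch. It is a deliberate simplification and not OkHttp's combine() method, whose exact rules are not shown in this article: HeaderMergeSketch and merge are hypothetical names, and the sketch assumes that a header from the network response simply overrides a cached header with the same name.

```java
import java.util.Map;
import java.util.TreeMap;

public class HeaderMergeSketch {
    // Simplified assumption: start from the cached headers, then let every
    // header of the 304 response replace the cached one of the same name.
    // Header names are compared without regard to case.
    static Map<String, String> merge(Map<String, String> cached,
                                     Map<String, String> network) {
        Map<String, String> result = new TreeMap<>(String.CASE_INSENSITIVE_ORDER);
        result.putAll(cached);
        result.putAll(network);
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> cached = Map.of(
                "ETag", "\"abc\"",
                "Date", "Wed, 21 Oct 2015 07:00:00 GMT");
        Map<String, String> network = Map.of(
                "date", "Wed, 21 Oct 2015 09:00:00 GMT");
        // The cached body and ETag are kept; only the Date is refreshed.
        System.out.println(merge(cached, network));
    }
}
```

The point is only that the cached body is kept while its metadata is refreshed, so later freshness calculations start from the revalidation time.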

【3.2】new CacheStrategy.Factory()

CacheStrategy.Factory.java
 public Factory(long nowMillis, Request request, Response cacheResponse) {
      this.nowMillis = nowMillis;
      this.request = request;
      this.cacheResponse = cacheResponse;

      if (cacheResponse != null) {
        this.sentRequestMillis = cacheResponse.sentRequestAtMillis();
        this.receivedResponseMillis = cacheResponse.receivedResponseAtMillis();
        Headers headers = cacheResponse.headers();
        for (int i = 0, size = headers.size(); i < size; i++) {
          String fieldName = headers.name(i);
          String value = headers.value(i);
          if ("Date".equalsIgnoreCase(fieldName)) {
            servedDate = HttpDate.parse(value);
            servedDateString = value;
          } else if ("Expires".equalsIgnoreCase(fieldName)) {
            expires = HttpDate.parse(value);
          } else if ("Last-Modified".equalsIgnoreCase(fieldName)) {
            lastModified = HttpDate.parse(value);
            lastModifiedString = value;
          } else if ("ETag".equalsIgnoreCase(fieldName)) {
            etag = value;
          } else if ("Age".equalsIgnoreCase(fieldName)) {
            ageSeconds = HttpHeaders.parseSeconds(value, -1);
          }
        }
      }
    }

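Summary: The constructor stores the request and the cached Response, then parses the cache-related headers of cacheResponse and saves their values. The headers introduced in Section 2, such as ETag and Last-Modified, show up here.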

【3.3】Factory.get()


CacheStrategy.Factory.java
public CacheStrategy get() {
  // Get the candidate request and cached response; see 【3.4】.
  CacheStrategy candidate = getCandidate();

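  /** If networkRequest != null here, the cache cannot be used as is and a network request is needed.
  But we must check whether cacheControl requires using only the cache and no network. If it does, the two conditions together mean the request fails.
  The handling of networkRequest == null && cacheResponse == null can be seen in step 5 of 【3.1】: it returns 504.
  */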
  if (candidate.networkRequest != null && request.cacheControl().onlyIfCached()) {
    return new CacheStrategy(null, null);
  }

  return candidate;
}

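Summary: This method obtains the candidate request and cacheResponse. If a network request would be needed but the request allows only the cache, the cache cannot satisfy it, so a strategy with both candidates null is returned. CacheControl here models the Cache-Control HTTP header.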

【3.4】Getting the candidate response: CacheStrategy.Factory.getCandidate()


CacheStrategy.Factory.java
 private CacheStrategy getCandidate() {
  //1. There is no cached response for this request: return a strategy without a cache response
  if (cacheResponse == null) {
    return new CacheStrategy(request, null);
  }

  //2. If the request is HTTPS and the cached response has lost its handshake, also return a strategy without a cache response.
  if (request.isHttps() && cacheResponse.handshake() == null) {
    return new CacheStrategy(request, null);
  }

  //3. Check whether the response is cacheable.
  if (!isCacheable(cacheResponse, request)) {
    return new CacheStrategy(request, null);
  }

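  //4. If the request asks not to use the cache,
  //or the request already carries "If-Modified-Since" || "If-None-Match" with values the server returned earlier,
  //return a strategy without a cache response.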
  CacheControl requestCaching = request.cacheControl();
  if (requestCaching.noCache() || hasConditions(request)) {
    return new CacheStrategy(request, null);
  }
    

  
  CacheControl responseCaching = cacheResponse.cacheControl();
  //Age of the cached response
  long ageMillis = cacheResponseAge();
  //Freshness lifetime of the cached response
  long freshMillis = computeFreshnessLifetime();

  //Combine the freshness lifetime with the request's max-age and take the smaller of the two.
  if (requestCaching.maxAgeSeconds() != -1) {
    freshMillis = Math.min(freshMillis, SECONDS.toMillis(requestCaching.maxAgeSeconds()));
  }

  //The request's min-fresh value.
  long minFreshMillis = 0;
  if (requestCaching.minFreshSeconds() != -1) {
    minFreshMillis = SECONDS.toMillis(requestCaching.minFreshSeconds());
  }

  //The request's max-stale value, honored only if the response does not set must-revalidate
  long maxStaleMillis = 0;
  if (!responseCaching.mustRevalidate() && requestCaching.maxStaleSeconds() != -1) {
    maxStaleMillis = SECONDS.toMillis(requestCaching.maxStaleSeconds());
  }


  //5. If the cached response does not set no-cache
  //and (age of cacheResponse + min-fresh) < (freshness lifetime + max-stale), cacheResponse can be used directly without a network request.
  if (!responseCaching.noCache() && ageMillis + minFreshMillis < freshMillis + maxStaleMillis) {
    Response.Builder builder = cacheResponse.newBuilder();
    if (ageMillis + minFreshMillis >= freshMillis) {
      builder.addHeader("Warning", "110 HttpURLConnection \"Response is stale\"");
    }
    long oneDayMillis = 24 * 60 * 60 * 1000L;
    if (ageMillis > oneDayMillis && isFreshnessLifetimeHeuristic()) {
      builder.addHeader("Warning", "113 HttpURLConnection \"Heuristic expiration\"");
    }
    //Return a strategy that uses cacheResponse directly with no network request: forced caching.
    return new CacheStrategy(null, builder.build());
  }

  /**
  * 6. From here on, decide on negotiated caching. The priority is: etag > lastModified > servedDate.
  */
  String conditionName;
  String conditionValue;
  if (etag != null) {
    conditionName = "If-None-Match";
    conditionValue = etag;
  } else if (lastModified != null) {
    conditionName = "If-Modified-Since";
    conditionValue = lastModifiedString;
  } else if (servedDate != null) {
    conditionName = "If-Modified-Since";
    conditionValue = servedDateString;
  } else {
    //If none of these values is available, make a regular network request.
    return new CacheStrategy(request, null); 
  }
  
  //If one of these values is available, add the corresponding header to the request.
  Headers.Builder conditionalRequestHeaders = request.headers().newBuilder();
  Internal.instance.addLenient(conditionalRequestHeaders, conditionName, conditionValue);

  Request conditionalRequest = request.newBuilder()
      .headers(conditionalRequestHeaders.build())
      .build();
  // 7. Return a strategy with both a network request and a cached response: negotiated caching
  return new CacheStrategy(conditionalRequest, cacheResponse);
}

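Summary: The cache strategy chooses a strategy from values in the request headers and the cached response headers and returns it. In short, it makes the following decisions:

No cached response exists for the request: return the no-cache-response strategy.
The cacheResponse of the request has lost its handshake: return the no-cache-response strategy.
cacheResponse is of a type that cannot be cached, for example its response code does not meet the rules or its headers contain no-store; see the isCacheable() method, which is not covered here: return the no-cache-response strategy.
The request sets no-cache or already carries If-Modified-Since or If-None-Match: return the no-cache-response strategy.
The condition (age of cacheResponse + min-fresh) < (freshness lifetime + max-stale) holds: return the use-cache-directly strategy.
The cached response has no validator for negotiated caching: return the regular network request strategy.
A validator for negotiated caching exists: return the request + cacheResponse strategy.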
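The freshness condition in the list above is plain arithmetic, so a worked example may help. The sketch below mirrors the two comparisons from getCandidate() in 【3.4】 but is not OkHttp code: FreshnessSketch and its method names are hypothetical, the numbers are made-up sample values, and the additional no-cache check on the response is left out.

```java
public class FreshnessSketch {
    // Mirrors: ageMillis + minFreshMillis < freshMillis + maxStaleMillis
    static boolean canServeFromCache(long ageMillis, long freshMillis,
                                     long minFreshMillis, long maxStaleMillis) {
        return ageMillis + minFreshMillis < freshMillis + maxStaleMillis;
    }

    // Mirrors: ageMillis + minFreshMillis >= freshMillis, the case in which
    // the "Response is stale" warning header is added.
    static boolean isStale(long ageMillis, long freshMillis, long minFreshMillis) {
        return ageMillis + minFreshMillis >= freshMillis;
    }

    public static void main(String[] args) {
        long fresh = 60_000; // e.g. the response said max-age=60

        // Age 45 s, no request directives: 45000 < 60000, serve from cache.
        System.out.println(canServeFromCache(45_000, fresh, 0, 0)); // true

        // Same response, request adds min-fresh=30: 75000 < 60000 fails,
        // so the strategy falls through to a network request.
        System.out.println(canServeFromCache(45_000, fresh, 30_000, 0)); // false

        // Age 70 s, request adds max-stale=20: 70000 < 80000, still served
        // from cache, but it counts as stale.
        System.out.println(canServeFromCache(70_000, fresh, 0, 20_000)); // true
        System.out.println(isStale(70_000, fresh, 0)); // true
    }
}
```

min-fresh effectively ages the response and max-stale effectively extends its lifetime, which is why they sit on opposite sides of the inequality.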

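The outcomes for the different combinations of networkRequest and cacheResponse are listed in the table below:

networkRequest	cacheResponse	Outcome
null	null	The network must not be used and the cache is unusable: return 504 directly.
null	not null	Skip the network and return the cache directly.
not null	null	The cache is unusable: regular request.
not null	not null	Negotiated caching: make the network request for further confirmation.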
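The four combinations can also be written as a small decision function. This is only a sketch of the mapping and not OkHttp code: StrategyOutcome, Outcome and resolve are hypothetical names, and the two booleans stand for "networkRequest is not null" and "cacheResponse is not null". The failure case is named after the 504 status that the interceptor code in 【3.1】 returns.

```java
public class StrategyOutcome {
    enum Outcome { FAIL_504, USE_CACHE, NETWORK_ONLY, CONDITIONAL_REQUEST }

    // Maps the (networkRequest, cacheResponse) pair produced by the cache
    // strategy to what the interceptor does with it.
    static Outcome resolve(boolean hasNetworkRequest, boolean hasCacheResponse) {
        if (!hasNetworkRequest && !hasCacheResponse) return Outcome.FAIL_504;
        if (!hasNetworkRequest) return Outcome.USE_CACHE;
        if (!hasCacheResponse) return Outcome.NETWORK_ONLY;
        return Outcome.CONDITIONAL_REQUEST;
    }

    public static void main(String[] args) {
        System.out.println(resolve(false, false)); // FAIL_504
        System.out.println(resolve(false, true));  // USE_CACHE
        System.out.println(resolve(true, false));  // NETWORK_ONLY
        System.out.println(resolve(true, true));   // CONDITIONAL_REQUEST
    }
}
```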

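Summary of this article: Following the question of whether the server must be contacted, we introduced the two HTTP caching modes: forced caching and negotiated caching. Then, starting from CacheInterceptor.java, we looked in depth at what OkHttp does in its caching logic: CacheStrategy selects the caching strategy, and CacheInterceptor acts on the candidate Request and Response it produces, either returning the cached response, returning an error, or performing negotiated caching.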