Redis master-slave failover during a Jedis client network outage

The client console keeps reporting the following exception:

redis.clients.jedis.exceptions.JedisDataException: READONLY You can't write against a read only slave.

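Taken literally, the message means that a write was sent to a read-only slave node. This does not happen under normal conditions. It only occurs when the master has already been switched but the client is still using a stale master address that was never refreshed.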

Take a look at the Jedis source code (version 2.9):

protected class MasterListener extends Thread {
    // ... some code omitted ...
    @Override
    public void run() {

      running.set(true);

      while (running.get()) {

        j = new Jedis(host, port);

        try {
          // double check that it is not being shutdown
          if (!running.get()) {
            break;
          }

          j.subscribe(new JedisPubSub() {
            @Override
            public void onMessage(String channel, String message) {
              log.fine("Sentinel " + host + ":" + port + " published: " + message + ".");

              String[] switchMasterMsg = message.split(" ");

              if (switchMasterMsg.length > 3) {

                if (masterName.equals(switchMasterMsg[0])) {
                  // Normally, when a master switch happens, the client receives the
                  // +switch-master message and re-initializes the connection pool.
                  initPool(toHostAndPort(Arrays.asList(switchMasterMsg[3], switchMasterMsg[4])));
                } else {
                  log.fine("Ignoring message on +switch-master for master name "
                      + switchMasterMsg[0] + ", our master name is " + masterName);
                }

              } else {
                log.severe("Invalid message received on Sentinel " + host + ":" + port
                    + " on channel +switch-master: " + message);
              }
            }
          }, "+switch-master");

        } catch (JedisConnectionException e) {

          if (running.get()) {
            log.log(Level.SEVERE, "Lost connection to Sentinel at " + host + ":" + port
                + ". Sleeping 5000ms and retrying.", e);
            try {
              Thread.sleep(subscribeRetryWaitTimeMillis);
            } catch (InterruptedException e1) {
              log.log(Level.SEVERE, "Sleep interrupted: ", e1);
            }
          } else {
            log.fine("Unsubscribing from Sentinel at " + host + ":" + port);
          }
        } finally {
          j.close();
        }
      }
    }
}
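
As an aside on the message handling above: Sentinel publishes the +switch-master payload as `<master-name> <old-ip> <old-port> <new-ip> <new-port>`, that is, five space-separated fields. The listener checks `length > 3` but then reads index 4, so a four-field message would throw an ArrayIndexOutOfBoundsException. The following is a minimal sketch of a stricter parser. `SwitchMasterMessage` is a hypothetical helper written for illustration, not a Jedis class.

```java
import java.util.Optional;

/** Hypothetical helper (not part of Jedis): parses a Sentinel +switch-master payload. */
final class SwitchMasterMessage {
    final String masterName;
    final String newHost;
    final int newPort;

    private SwitchMasterMessage(String masterName, String newHost, int newPort) {
        this.masterName = masterName;
        this.newHost = newHost;
        this.newPort = newPort;
    }

    /**
     * Expected payload: "<master-name> <old-ip> <old-port> <new-ip> <new-port>".
     * Returns empty unless there are exactly five fields and the new port is numeric.
     */
    static Optional<SwitchMasterMessage> parse(String message) {
        String[] parts = message.trim().split("\\s+");
        if (parts.length != 5) {
            return Optional.empty();
        }
        try {
            int port = Integer.parseInt(parts[4]);
            return Optional.of(new SwitchMasterMessage(parts[0], parts[3], port));
        } catch (NumberFormatException e) {
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        // Sample addresses are made up for the demonstration.
        parse("mymaster 10.0.0.1 6379 10.0.0.2 6380")
            .ifPresent(m -> System.out.println(m.masterName + " -> " + m.newHost + ":" + m.newPort));
    }
}
```

Returning an empty Optional for malformed input leaves the decision to log or ignore with the caller, which roughly mirrors the `log.severe` branch in the listener.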

Alongside the write-to-slave error, the output of the following log statement also shows up:

 log.log(Level.SEVERE, "Lost connection to Sentinel at " + host + ":" + port
                + ". Sleeping 5000ms and retrying.", e);

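This error log shows that the client had lost its connection to Sentinel at some point. Redis Pub/Sub does not replay messages to a subscriber that was disconnected when they were published, so a +switch-master message sent during that window is never seen. After reconnecting, the 2.9 listener simply subscribes again, and the pool keeps pointing at the old master, which by then is a read-only slave.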
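
The sequence of events can be reproduced with a toy in-memory model. This is only a sketch under simplifying assumptions: `ToySentinel` and `ToyListener` are hypothetical stand-ins, not Jedis or Redis code, and the addresses are made up. The only property modelled is that a notification reaches just the listeners connected at the moment it is published.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model (not Jedis or Redis code) of a failover notification missed while disconnected. */
final class MissedSwitchDemo {

    /** Stands in for a Sentinel: tracks the master and notifies connected listeners only. */
    static final class ToySentinel {
        private String master;
        private final List<ToyListener> connected = new ArrayList<>();

        ToySentinel(String initialMaster) {
            this.master = initialMaster;
        }

        void connect(ToyListener listener) {
            connected.add(listener);
        }

        void disconnect(ToyListener listener) {
            connected.remove(listener);
        }

        /** Failover: nothing is queued for listeners that are offline right now. */
        void switchMaster(String newMaster) {
            master = newMaster;
            for (ToyListener listener : connected) {
                listener.onSwitchMaster(newMaster);
            }
        }

        String currentMaster() {
            return master;
        }
    }

    /** Stands in for the client-side listener that decides where the pool connects. */
    static final class ToyListener {
        String poolTarget;

        ToyListener(String initialTarget) {
            this.poolTarget = initialTarget;
        }

        void onSwitchMaster(String newMaster) {
            poolTarget = newMaster;
        }
    }

    /** Runs the failure scenario and returns the address the client still writes to. */
    static String runScenario() {
        ToySentinel sentinel = new ToySentinel("10.0.0.1:6379");
        ToyListener listener = new ToyListener(sentinel.currentMaster());
        sentinel.connect(listener);

        sentinel.disconnect(listener);          // network fault: the subscription is lost
        sentinel.switchMaster("10.0.0.2:6379"); // failover happens during the outage
        sentinel.connect(listener);             // reconnect: subscribes again, nothing else

        return listener.poolTarget;
    }

    public static void main(String[] args) {
        System.out.println("client still targets " + runScenario());
    }
}
```

In this model the client ends up targeting 10.0.0.1:6379 even though the sentinel already reports 10.0.0.2:6379, which corresponds to the state that produces the READONLY error.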

A search of the Jedis GitHub issues does indeed turn up related reports. See:

https://github.com/xetorthio/jedis/pull/1566

https://github.com/xetorthio/jedis/issues/1953

This problem has been properly fixed in version 2.9.3:

 @Override
    public void run() {

      running.set(true);

      while (running.get()) {

        j = new Jedis(host, port);

        try {
          // double check that it is not being shutdown
          if (!running.get()) {
            break;
          }
          
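          /* This is the key part: the pool is re-initialized once per iteration of the while loop.
           * Normally the thread blocks in j.subscribe, so this does not run continuously.
           * Added code for active refresh
           */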
          List<String> masterAddr = j.sentinelGetMasterAddrByName(masterName);  
          if (masterAddr == null || masterAddr.size() != 2) {

            log.warning("Can not get master addr, master name: "+ masterName+". Sentinel: "+host+":"+port+".");
          }else{
              initPool(toHostAndPort(masterAddr)); 
          }
          // Normally the thread blocks here
          j.subscribe(new JedisPubSub() {
            @Override
            public void onMessage(String channel, String message) {
              log.fine("Sentinel " + host + ":" + port + " published: " + message + ".");

              String[] switchMasterMsg = message.split(" ");

              if (switchMasterMsg.length > 3) {

                if (masterName.equals(switchMasterMsg[0])) {
                  initPool(toHostAndPort(Arrays.asList(switchMasterMsg[3], switchMasterMsg[4])));
                } else {
                  log.fine("Ignoring message on +switch-master for master name "
                      + switchMasterMsg[0] + ", our master name is " + masterName);
                }

              } else {
                log.severe("Invalid message received on Sentinel " + host + ":" + port
                    + " on channel +switch-master: " + message);
              }
            }
          }, "+switch-master");

        } catch (JedisException e) {

          if (running.get()) {
            log.log(Level.SEVERE, "Lost connection to Sentinel at " + host + ":" + port
                + ". Sleeping 5000ms and retrying.", e);
            try {
              Thread.sleep(subscribeRetryWaitTimeMillis);
            } catch (InterruptedException e1) {
              log.log(Level.SEVERE, "Sleep interrupted: ", e1);
            }
          } else {
            log.fine("Unsubscribing from Sentinel at " + host + ":" + port);
          }
        } finally {
          j.close();
        }
      }
    }
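
The pattern behind the fix, querying the current master before every subscribe, can be sketched independently of Jedis. The names below (`RefreshOnReconnectDemo`, `connectOnce`) are hypothetical, and the subscribe step is a no-op here, whereas in the real listener it blocks until the connection drops.

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Consumer;
import java.util.function.Supplier;

/** Toy sketch (not Jedis code) of the refresh-then-subscribe pattern. */
final class RefreshOnReconnectDemo {

    /**
     * One iteration of the listener loop: ask for the current master first,
     * apply it to the pool, and only then start listening for changes.
     */
    static void connectOnce(Supplier<String> queryCurrentMaster,
                            Consumer<String> initPool,
                            Runnable subscribe) {
        String master = queryCurrentMaster.get();
        if (master == null) {
            System.err.println("Cannot get master address; keeping the existing pool target");
        } else {
            initPool.accept(master);
        }
        subscribe.run();
    }

    /** Runs the same outage scenario and returns the address the pool ends up with. */
    static String runScenario() {
        // What the sentinel currently believes the master to be (made-up addresses).
        AtomicReference<String> sentinelView = new AtomicReference<>("10.0.0.1:6379");
        // Where the client-side pool currently points.
        AtomicReference<String> poolTarget = new AtomicReference<>();

        connectOnce(sentinelView::get, poolTarget::set, () -> { }); // first connection
        sentinelView.set("10.0.0.2:6379");                          // failover during an outage
        connectOnce(sentinelView::get, poolTarget::set, () -> { }); // reconnect refreshes first

        return poolTarget.get();
    }

    public static void main(String[] args) {
        System.out.println("pool now targets " + runScenario());
    }
}
```

Because the refresh runs on every reconnect, a notification missed during the outage no longer matters: the pool is resynchronized from the sentinel's current view before the listener starts waiting for the next message.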