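A few days ago our company's backend system failed, leaving several features of the app unusable. The logs contained many occurrences of redis.clients.jedis.exceptions.JedisConnectionException: Could not get a resource from the pool, so the problem clearly involved jedis/redis. Because the exception is connection-related, I first looked at the jedis settings that control connection counts: maxIdle and maxTotal were both 200, and our jedis wrapper releases the connection in a finally block. My initial guess was therefore that the problem was on the redis server side, so I checked the following: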
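1. Ping from the jedis machine to the redis machine: millisecond-level response times, so the network is fine
2. netstat -apn |grep redis-server shows a little over 20 connections, so the number of network connections is normal
3. free -m shows 60% memory usage, so memory is (on the surface) sufficient
4. df -h shows 15% disk usage, so disk space is plentiful
5. Run the info command in redis-cli and look at the clients section: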
#Clients
connected_clients:18
client_longest_output_list:0
client_biggest_input_buf:0
blocked_clients:0
The number of clients is also normal
6. Run the ping command in redis-cli, and the error finally shows up:
(error)MISCONF Redis is configured to save RDB snapshots, but is currently
not able to persist on disk. Commands that may modify the data set
are disabled. Please check Redis logs for details about the error.
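The info checks in the steps above can be scripted so they are quick to repeat. The following is a minimal sketch, not part of the original investigation: the helper name info_field is hypothetical, and it only parses text in the key:value format that redis-cli info prints.

```shell
# Print the value of one field from `redis-cli info` output read on stdin.
# Usage: redis-cli info clients | info_field connected_clients
info_field() {
  # INFO lines look like "key:value" and end with a carriage return,
  # so strip the carriage returns before matching on the key.
  tr -d '\r' | awk -F: -v key="$1" '
    $1 == key && !seen { print substr($0, length(key) + 2); seen = 1 }
  '
}
```

For example, redis-cli info persistence | info_field rdb_last_bgsave_status prints the status of the last background save, which is the condition the MISCONF error is reporting.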
Then I checked the redis log, which contained:
WARNING overcommit_memory is set to 0! Background save may fail under low memory condition. To fix this issue add 'vm.overcommit_memory = 1' to /etc/sysctl.conf and then reboot or run the command 'sysctl vm.overcommit_memory=1' for this to take effect.
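The cause is now clear. bgsave forks a child process, and because vm.overcommit_memory = 0, the kernel treats the fork as a request for as much memory as the parent process uses. Redis was already using 60% of memory, so there was not enough left to cover that request and the fork failed. Once the background save fails, redis disables commands that modify the data set, which is the MISCONF error shown above.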
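To see which policy a machine is currently using, the value in /proc/sys/vm/overcommit_memory can be mapped to its meaning. The sketch below is illustrative only: the function name overcommit_policy is hypothetical, and the descriptions are short paraphrases of the Linux kernel documentation rather than exact kernel behavior.

```shell
# Map a vm.overcommit_memory value to a short description of the kernel policy.
# Usage: overcommit_policy "$(cat /proc/sys/vm/overcommit_memory)"
overcommit_policy() {
  case "$1" in
    0) echo "heuristic: obviously excessive allocations are refused" ;;
    1) echo "always: allocations are never refused for lack of memory" ;;
    2) echo "strict: total commitments are capped by swap plus a ratio of RAM" ;;
    *) echo "unknown value: $1"; return 1 ;;
  esac
}
```

With the value 0 seen in the log above, a fork of a large redis process can be refused; with 1 the kernel lets the fork proceed and relies on copy-on-write to keep actual memory use low.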
Solution:
Add vm.overcommit_memory=1 to /etc/sysctl.conf so the setting survives a reboot
Run sysctl vm.overcommit_memory=1 to apply it immediately
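The two steps above could be wrapped so that the config file is not edited twice. This is a rough sketch under stated assumptions: the function name ensure_overcommit is hypothetical, it takes the config file path as an argument, and it deliberately does not rewrite an existing vm.overcommit_memory line, it only reports it.

```shell
# Append vm.overcommit_memory=1 to the given sysctl config file unless an
# uncommented vm.overcommit_memory line already exists there.
ensure_overcommit() {
  conf="$1"
  pattern='^[[:space:]]*vm\.overcommit_memory[[:space:]]*='
  if grep -Eq "$pattern" "$conf" 2>/dev/null; then
    # Leave the existing line alone; show it so it can be reviewed by hand.
    echo "already present in $conf:"
    grep -E "$pattern" "$conf"
  else
    echo 'vm.overcommit_memory=1' >> "$conf"
    echo "added vm.overcommit_memory=1 to $conf"
  fi
}
```

Run as root, ensure_overcommit /etc/sysctl.conf followed by sysctl vm.overcommit_memory=1 covers both steps, and cat /proc/sys/vm/overcommit_memory should then print 1.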