Ignite frequently goes down every few hours, and manual restarts often fail as well.
Error log from a manual restart:
Caused by: class org.apache.ignite.spi.IgniteSpiException: Node with the same ID was found in node IDs history or existing node in topology has the same ID
(fix configuration and restart local node)
Failure log after running for several hours:
[06:15:36,458][WARNING][tcp-disco-ip-finder-cleaner-#33-#92][TcpDiscoverySpi] Timed out waiting for message to be read (most probably, the reason is long GC pauses on remote node) [curTimeout=9998, rmtAddr=/10.42.2.181:47500, rmtPort=47500]
[06:15:36,458][WARNING][tcp-disco-ip-finder-cleaner-#33-#92][TcpDiscoverySpi] Failed to ping node [nodeId=null]. Reached the timeout 10000ms. Cause: Failed to deserialize object with given class loader: sun.misc.Launcher$AppClassLoader@75b84c92
[06:16:28,666][WARNING][tcp-disco-msg-worker-[1483ea3b 10.42.5.149:47500 crd]-#2-#60][TcpDiscoverySpi] Node is out of topology (probably, due to short-time network problems).
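Preliminary diagnosis: after the data volume in Ignite surged, the nodes ran into network problems (the warning above also names long GC pauses on the remote node as the most probable cause), and discovery hit the 10000 ms timeout. The fix tried here is to increase the timeouts.
In the Ignite configuration file, add the following properties to the IgniteConfiguration bean: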
<property name="failureDetectionTimeout" value="30000"/>
<property name="clientFailureDetectionTimeout" value="50000"/>
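
For context, here is a minimal sketch of where these two properties sit in a Spring XML configuration file. The bean id `ignite.cfg` and the surrounding file layout are assumptions for illustration, not taken from the original configuration. Any existing properties in your own bean (discovery SPI, caches, and so on) should stay in place. Both values are in milliseconds.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<beans xmlns="http://www.springframework.org/schema/beans"
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xsi:schemaLocation="http://www.springframework.org/schema/beans
                           http://www.springframework.org/schema/beans/spring-beans.xsd">

    <!-- The bean id is an assumed name; keep whatever id your file already uses. -->
    <bean id="ignite.cfg" class="org.apache.ignite.configuration.IgniteConfiguration">

        <!-- Failure detection timeout between server nodes, in ms.
             Raised from the 10000 ms default seen in the log ("Reached the timeout 10000ms"). -->
        <property name="failureDetectionTimeout" value="30000"/>

        <!-- Failure detection timeout for client nodes, in ms (the default is 30000). -->
        <property name="clientFailureDetectionTimeout" value="50000"/>

        <!-- ... your existing properties (discovery SPI, cache configuration, etc.) ... -->
    </bean>
</beans>
```

Longer timeouts make the cluster more tolerant of GC pauses and brief network stalls, but a node that has really failed will also take longer to be detected. If the warnings continue, the GC behavior of the remote node is probably worth investigating as well.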