Submitting the StartUp (shaded) jar to a Spark cluster in cluster mode:
The first error was:
java.lang.NoClassDefFoundError: Could not initialize class XXX
Something went wrong while the class was being initialized, and the class declares several static variables:
/* other static final variable */
private static final Set<Object> ASYNC_OBJECT = Sets.newConcurrentHashSet();
So I moved the initialization of these static constants into a static block and tried to catch whatever exception it throws:
public class LockDataManager {
    /* other static final variable */
    private static final Set<Object> ASYNC_OBJECT;
    static {
        try {
            /* other initial */
            ASYNC_OBJECT = Sets.newConcurrentHashSet();
        } catch (Throwable throwable) {
            LoggerUtils.info("init error at LockDataManager.class");
            LoggerUtils.error(throwable);
            throw throwable;
        }
    }
    /* ... */
}
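For background on why the first message hid the real cause: once a class's static initialization has failed, the JVM marks the class as erroneous, and every later access fails with NoClassDefFoundError: Could not initialize class, without repeating the original error. The sketch below simulates this with a hypothetical BrokenHolder class whose static initializer throws a NoSuchMethodError; all names in it are made up for illustration and are not from the project.

```java
// Hypothetical demo classes; none of these names come from the original project.
final class BrokenHolder {
    // Stand-in for a static field whose initializer fails at class-init time.
    static final Object VALUE = fail();

    private static Object fail() {
        // Simulates the linkage failure; a real one would be raised by the JVM.
        throw new NoSuchMethodError("simulated: newConcurrentHashSet");
    }
}

final class InitFailureDemo {
    /** Touches BrokenHolder and returns the simple name of whatever was thrown. */
    static String touch() {
        try {
            return String.valueOf(BrokenHolder.VALUE);
        } catch (Throwable t) {
            return t.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) {
        // The first access runs the static initializer and surfaces the real error.
        System.out.println("first access:  " + touch());
        // Later accesses only report that the class could not be initialized.
        System.out.println("second access: " + touch());
    }
}
```

If only the second kind of message shows up in a log, the original error was most likely thrown earlier, possibly in another thread or log file, which is why logging inside the static block helps.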
After repackaging, resubmitting, and running again, the log showed:
java.lang.NoSuchMethodError: com.google.common.collect.Sets.newConcurrentHashSet()Ljava/util/Set;
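NoSuchMethodError is usually an environment problem: the class found at runtime is not the version the code was compiled against. For example, the cluster may ship an older guava jar that has no newConcurrentHashSet method. Checking on the cluster: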
[hadoop@bigdata0 spark-3.0.0-bin-hadoop3.2]$ ll jars/ | grep guava
-rw-r--r-- 1 hadoop hadoop 2189117 6月 6 2020 guava-14.0.1.jar
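Besides listing the jars directory, the same suspicion can be checked from inside the JVM by asking, via reflection, whether the class that is actually visible has the method. This is a minimal sketch; MethodProbe and hasNoArgMethod are hypothetical names, and the demo probes a JDK class so that it runs without guava on the classpath.

```java
import java.lang.reflect.Method;

// Hypothetical helper, not part of the original project.
final class MethodProbe {
    /**
     * Returns true if the class visible to this class's loader exposes a
     * public no-argument method with the given name.
     */
    static boolean hasNoArgMethod(String className, String methodName) {
        try {
            // initialize=false: look the class up without running its static initializers.
            Class<?> cls = Class.forName(className, false, MethodProbe.class.getClassLoader());
            Method m = cls.getMethod(methodName);
            return m != null;
        } catch (ClassNotFoundException | NoSuchMethodException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // In the scenario above one would probe:
        //   hasNoArgMethod("com.google.common.collect.Sets", "newConcurrentHashSet")
        System.out.println(hasNoArgMethod("java.util.Collections", "emptyList"));
        System.out.println(hasNoArgMethod("java.util.Collections", "noSuchMethodHere"));
    }
}
```

A false result for the guava method on the cluster would point to an older guava being picked up first, but it would not say which jar it came from.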
To verify, I added the following code at startup:
String LOCATION = "";
String URLLOCATION = "";
try {
    LOCATION = Sets.class.getProtectionDomain().getCodeSource().getLocation().getFile();
    URLLOCATION = URLDecoder.decode(LOCATION, "UTF-8");
} catch (UnsupportedEncodingException e) {
    LoggerUtils.error(Main.class, e);
}
LoggerUtils.info("***loc=" + LOCATION + "\nURLLoc=" + URLLOCATION);
After repackaging, resubmitting, and running once more, the log showed:
***loc=/opt/software/spark-3.0.0-bin-hadoop3.2/jars/guava-14.0.1.jar
URLLoc=/opt/software/spark-3.0.0-bin-hadoop3.2/jars/guava-14.0.1.jar
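The lookup used for this check can be wrapped in a small reusable helper. The sketch below is one possible shape, with the hypothetical names ClassOrigin and locate; unlike the inline snippet, it also handles classes that have no code source (for example JDK bootstrap classes), where getCodeSource() returns null.

```java
import java.net.URISyntaxException;
import java.net.URL;
import java.nio.file.Paths;
import java.security.CodeSource;

// Hypothetical helper, not part of the original project.
final class ClassOrigin {
    static final String UNKNOWN = "<no code source>";

    /** Returns the jar or directory a class was loaded from, or UNKNOWN. */
    static String locate(Class<?> cls) {
        CodeSource src = cls.getProtectionDomain().getCodeSource();
        if (src == null || src.getLocation() == null) {
            return UNKNOWN; // e.g. classes loaded by the bootstrap loader
        }
        URL url = src.getLocation();
        try {
            // Going through a URI decodes percent-escapes such as %20.
            return Paths.get(url.toURI()).toString();
        } catch (URISyntaxException | RuntimeException e) {
            // Fall back to the raw URL for non-file locations.
            return url.toString();
        }
    }

    public static void main(String[] args) {
        System.out.println("String      -> " + locate(String.class));
        System.out.println("ClassOrigin -> " + locate(ClassOrigin.class));
    }
}
```

In the scenario above the call would be ClassOrigin.locate(Sets.class), logged once at startup on the driver and, if needed, inside a task on the executors.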
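This locates and confirms the root cause: at runtime, Sets is loaded from the guava-14.0.1.jar that ships with Spark, not from the guava version packaged in the shaded jar, and Sets.newConcurrentHashSet() was only added in Guava 15.0.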