一、版本问题
目前官方虽说支持了spark2.2.1,下载git代码后编译完全是可以通过的,但是在使用过程会出现问题。按照目前所验证的结果是,spark2.1.0版本和carbondata1.3.1版本是可以正常使用的。
二、java.lang.NoClassDefFoundError:com/sun/jersey/api/client/config/ClientConfig
java.lang.NoClassDefFoundError: com/sun/jersey/api/client/config/ClientConfig at org.apache.hadoop.yarn.client.api.TimelineClient.createTimelineClient(TimelineClient.java:55) at org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.createTimelineClient(YarnClientImpl.java:181) at org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.serviceInit(YarnClientImpl.java:168) at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163) at org.apache.spark.deploy.yarn.Client.submitApplication(Client.scala:151) at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.start(YarnClientSchedulerBackend.scala:56) at org.apache.spark.scheduler.TaskSchedulerImpl.start(TaskSchedulerImpl.scala:156) at org.apache.spark.SparkContext.(SparkContext.scala:509) at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2313) at org.apache.spark.sql.SparkSession$Builder$$anonfun$6.apply(SparkSession.scala:868) at org.apache.spark.sql.SparkSession$Builder$$anonfun$6.apply(SparkSession.scala:860) at scala.Option.getOrElse(Option.scala:121) at org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:860) at org.apache.spark.repl.Main$.createSparkSession(Main.scala:95) ... 47 elided Caused by: java.lang.ClassNotFoundException: com.sun.jersey.api.client.config.ClientConfig at java.net.URLClassLoader.findClass(URLClassLoader.java:381) at java.lang.ClassLoader.loadClass(ClassLoader.java:424) at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:335) at java.lang.ClassLoader.loadClass(ClassLoader.java:357) ... 61 more
解决方案:
将jersey-client-1.9.jar以及jersey-core-1.9.jar 拷贝到spark2/jars目录
三、Stack trace: ExitCodeException exitCode=1:
Stack trace: ExitCodeException exitCode=1: at org.apache.hadoop.util.Shell.runCommand(Shell.java:944) at org.apache.hadoop.util.Shell.run(Shell.java:848) at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:1142) at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:237) at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:317) at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:83) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:748) https://www.bbsmax.com/A/xl56xPNmzr/ Missing +/- setting for VM option 'OmitStackTraceInFastThrow' Error: Could not create the Java Virtual Machine. Error: A fatal exception has occurred. Program will exit. Missing +/- setting for VM option 'UseGCOverheadLimit' Error: Could not create the Java Virtual Machine. Error: A fatal exception has occurred. Program will exit.
解决方案:
在spark-defaults.cfg的spark.executor.extraJavaOptions 项添加OmitStackTraceInFastThrow以及UseGCOverheadLimit参数
spark.executor.extraJavaOptions -Dcarbon.properties.filepath=carbon.properties -XX:+OmitStackTraceInFastThrow -XX:+UseGCOverheadLimit