Reading Hive Data with Spark and Resolving Related Issues

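  1. Sample code
    1. SparkHiveAPP main class

      Note:
      core-site.xml, hdfs-site.xml, yarn-site.xml, mapred-site.xml, and hive-site.xml must be placed under the resources directory (src/main/resources) so that they end up on the classpath; the program reads its cluster settings from them at run time.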
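
      One way to confirm the note above before starting Spark is to ask the class loader for each file. This is only a minimal sketch: the object name ClasspathConfigCheck and the helper missingFrom are made up for illustration and are not part of the original project.

      ```scala
      object ClasspathConfigCheck {
        // File names taken from the note above.
        val requiredFiles: Seq[String] = Seq(
          "core-site.xml", "hdfs-site.xml", "yarn-site.xml", "mapred-site.xml", "hive-site.xml")

        /** Returns the entries of `names` that `loader` cannot find as classpath resources. */
        def missingFrom(names: Seq[String], loader: ClassLoader): Seq[String] =
          names.filter(name => loader.getResource(name) == null)

        def main(args: Array[String]): Unit = {
          val missing = missingFrom(requiredFiles, getClass.getClassLoader)
          if (missing.isEmpty) println("All Hadoop/Hive config files are on the classpath.")
          else println(s"Missing from classpath: ${missing.mkString(", ")}")
        }
      }
      ```

      Running it from the same module as SparkHiveAPP should list any file that did not make it into the build output.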

      import org.apache.log4j.{Level, Logger}
      import org.apache.spark.SparkConf
      import org.apache.spark.sql.SparkSession
      
      object SparkHiveAPP {
      
        def main(args: Array[String]): Unit = {
      
          Logger.getLogger("org").setLevel(Level.WARN)
          
          /**
            * Without System.setProperty("HADOOP_USER_NAME", "root"), the job fails with
            * org.apache.hadoop.security.AccessControlException: Permission denied
            */
          System.setProperty("HADOOP_USER_NAME", "root")
          val conf = new SparkConf()
            .setIfMissing("spark.master", "local[2]")
            .set("spark.sql.warehouse.dir", "/user/hive/warehouse")
            .setAppName("Spark_Hive_APP")
      
          val spark: SparkSession = SparkSession.builder().config(conf)
            .enableHiveSupport()
            .getOrCreate()
      
          spark.sparkContext.setLogLevel("WARN")
      
          spark.sql("SELECT * FROM test.test1").show()

          // Stop the SparkSession and release its resources once the query is done.
          spark.stop()
      
        }
      }
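
      The hard-coded "root" in the class above could be turned into a fallback. The sketch below assumes simple (non-Kerberos) authentication, where Hadoop takes the user from the HADOOP_USER_NAME environment variable first and from the system property of the same name second. HadoopUser, resolve, and configure are hypothetical names introduced here.

      ```scala
      object HadoopUser {
        private val Key = "HADOOP_USER_NAME"

        /** Picks the first non-empty value: environment variable, then system property, then `default`. */
        def resolve(envVar: Option[String], sysProp: Option[String], default: String): String =
          envVar.filter(_.nonEmpty).orElse(sysProp.filter(_.nonEmpty)).getOrElse(default)

        /** Sets the HADOOP_USER_NAME system property to the resolved user and returns it. */
        def configure(default: String): String = {
          val user = resolve(sys.env.get(Key), sys.props.get(Key), default)
          System.setProperty(Key, user)
          user
        }
      }
      ```

      Calling HadoopUser.configure("root") in place of the System.setProperty line, before the SparkSession is built, keeps the original behaviour while letting the user be overridden from outside the code.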
      
    2. pom.xml file
      <?xml version="1.0" encoding="UTF-8"?>
      <project xmlns="http://maven.apache.org/POM/4.0.0"
               xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
               xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
          <modelVersion>4.0.0</modelVersion>
          <groupId>com.cloudera</groupId>
          <artifactId>RemoteSubmitSparkToYarn</artifactId>
          <version>1.0-SNAPSHOT</version>
      
          <packaging>jar</packaging>
          <name>RemoteSubmitSparkToYarn</name>
      
          <repositories>
              <!-- Cloudera repository -->
              <repository>
                  <id>cloudera</id>
                  <url>https://repository.cloudera.com/artifactory/cloudera-repos/</url>
                  <name>Cloudera Repositories</name>
                  <releases>
                      <enabled>true</enabled>
                  </releases>
                  <snapshots>
                      <enabled>false</enabled>
                  </snapshots>
              </repository>
          </repositories>
      
          <properties>
              <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
              <project.reporting.outputEncoding>UTF-8</project.reporting.outputEncoding>
              <java.version>1.8</java.version>
              <scala.version>2.11.12</scala.version>
              <hbase.version>1.3.0</hbase.version>
              <!--<spark.version>2.4.0-cdh6.1.1</spark.version>-->
              <hive.version>1.2.0</hive.version>
              <kafka.version>0.10.0.1</kafka.version>
              <spark.version>2.2.0</spark.version>
              <kafka.scope>compile</kafka.scope>
              <provided.scope>compile</provided.scope>
          </properties>
      
          <dependencies>
      
              <!-- HBase -->
              <!--<dependency>-->
              <!--<groupId>org.apache.hbase</groupId>-->
              <!--<artifactId>hbase-client</artifactId>-->
              <!--<version>${hbase.version}</version>-->
              <!--</dependency>-->
              <!--<dependency>-->
              <!--<groupId>org.apache.hbase</groupId>-->
              <!--<artifactId>hbase-server</artifactId>-->
              <!--<version>${hbase.version}</version>-->
              <!--<scope>${provided.scope}</scope>-->
              <!--</dependency>-->
      
              <!-- scala -->
              <dependency>
                  <groupId>org.scala-lang</groupId>
                  <artifactId>scala-library</artifactId>
                  <version>${scala.version}</version>
                  <scope>${provided.scope}</scope>
              </dependency>
              <dependency>
                  <groupId>org.scala-lang</groupId>
                  <artifactId>scala-compiler</artifactId>
                  <version>${scala.version}</version>
                  <scope>${provided.scope}</scope>
              </dependency>
              <dependency>
                  <groupId>org.scala-lang</groupId>
                  <artifactId>scala-reflect</artifactId>
                  <version>${scala.version}</version>
                  <scope>${provided.scope}</scope>
              </dependency>
              <dependency>
                  <groupId>org.apache.spark</groupId>
                  <artifactId>spark-core_2.11</artifactId>
                  <version>${spark.version}</version>
                  <exclusions>
                      <exclusion>
                          <groupId>org.glassfish.jersey.bundles.repackaged</groupId>
                          <artifactId>jersey-guava</artifactId>
                      </exclusion>
                  </exclusions>
                  <scope>${provided.scope}</scope>
              </dependency>
              <dependency>
                  <groupId>org.apache.spark</groupId>
                  <artifactId>spark-streaming_2.11</artifactId>
                  <version>${spark.version}</version>
                  <scope>${provided.scope}</scope>
              </dependency>
              <dependency>
                  <groupId>org.apache.spark</groupId>
                  <artifactId>spark-sql_2.11</artifactId>
                  <version>${spark.version}</version>
                  <scope>${provided.scope}</scope>
              </dependency>
              <dependency>
                  <groupId>org.apache.spark</groupId>
                  <artifactId>spark-hive_2.11</artifactId>
                  <version>${spark.version}</version>
                  <scope>${provided.scope}</scope>
              </dependency>
              <dependency>
                  <groupId>org.apache.hive</groupId>
                  <artifactId>hive-exec</artifactId>
                  <version>${hive.version}</version>
              </dependency>
              <dependency>
                  <groupId>org.apache.spark</groupId>
                  <artifactId>spark-yarn_2.11</artifactId>
                  <version>${spark.version}</version>
                  <scope>${provided.scope}</scope>
              </dependency>
              <dependency>
                  <groupId>org.apache.spark</groupId>
                  <artifactId>spark-sql-kafka-0-10_2.11</artifactId>
                  <version>${spark.version}</version>
                  <scope>${provided.scope}</scope>
              </dependency>
              <dependency>
                  <groupId>org.apache.spark</groupId>
                  <artifactId>spark-streaming-kafka-0-10_2.11</artifactId>
                  <version>${spark.version}</version>
                  <scope>${provided.scope}</scope>
              </dependency>
              <dependency>
                  <groupId>org.apache.kafka</groupId>
                  <artifactId>kafka_2.11</artifactId>
                  <version>${kafka.version}</version>
                  <scope>${kafka.scope}</scope>
              </dependency>
              <dependency>
                  <groupId>org.apache.kafka</groupId>
                  <artifactId>kafka-clients</artifactId>
                  <version>${kafka.version}</version>
                  <scope>${kafka.scope}</scope>
              </dependency>
          </dependencies>
      
          <build>
              <pluginManagement>
                  <plugins>
                      <plugin>
                          <groupId>org.apache.maven.plugins</groupId>
                          <artifactId>maven-compiler-plugin</artifactId>
                          <version>3.8.0</version>
                          <configuration>
                              <source>1.8</source>
                              <target>1.8</target>
                          </configuration>
                      </plugin>
                      <plugin>
                          <groupId>org.apache.maven.plugins</groupId>
                          <artifactId>maven-resources-plugin</artifactId>
                          <version>3.0.2</version>
                          <configuration>
                              <encoding>UTF-8</encoding>
                          </configuration>
                      </plugin>
                      <plugin>
                          <groupId>net.alchim31.maven</groupId>
                          <artifactId>scala-maven-plugin</artifactId>
                          <version>3.2.2</version>
                          <executions>
                              <execution>
                                  <goals>
                                      <goal>compile</goal>
                                      <goal>testCompile</goal>
                                  </goals>
                              </execution>
                          </executions>
                      </plugin>
                  </plugins>
              </pluginManagement>
              <plugins>
                  <plugin>
                      <groupId>net.alchim31.maven</groupId>
                      <artifactId>scala-maven-plugin</artifactId>
                      <executions>
                          <execution>
                              <id>scala-compile-first</id>
                              <phase>process-resources</phase>
                              <goals>
                                  <goal>add-source</goal>
                                  <goal>compile</goal>
                              </goals>
                          </execution>
                          <execution>
                              <id>scala-test-compile</id>
                              <phase>process-test-resources</phase>
                              <goals>
                                  <goal>testCompile</goal>
                              </goals>
                          </execution>
                      </executions>
                  </plugin>
      
                  <plugin>
                      <groupId>org.apache.maven.plugins</groupId>
                      <artifactId>maven-compiler-plugin</artifactId>
                      <executions>
                          <execution>
                              <phase>compile</phase>
                              <goals>
                                  <goal>compile</goal>
                              </goals>
                          </execution>
                      </executions>
                  </plugin>
      
                  <plugin>
                      <groupId>org.apache.maven.plugins</groupId>
                      <artifactId>maven-shade-plugin</artifactId>
                      <version>2.4.3</version>
                      <executions>
                          <execution>
                              <phase>package</phase>
                              <goals>
                                  <goal>shade</goal>
                              </goals>
                              <configuration>
                                  <filters>
                                      <filter>
                                          <artifact>*:*</artifact>
                                          <excludes>
                                              <exclude>META-INF/*.SF</exclude>
                                              <exclude>META-INF/*.DSA</exclude>
                                              <exclude>META-INF/*.RSA</exclude>
                                          </excludes>
                                      </filter>
                                  </filters>
                              </configuration>
                          </execution>
                      </executions>
                  </plugin>
              </plugins>
              <resources>
                  <resource>
                      <directory>${basedir}/src/main/resources</directory>
                      <excludes>
                          <exclude>env/*/*</exclude>
                      </excludes>
                      <includes>
                          <include>**/*</include>
                      </includes>
                  </resource>
                  <resource>
                      <directory>${basedir}/src/main/resources/env/${profile.active}</directory>
                      <includes>
                          <include>**/*.properties</include>
                          <include>**/*.xml</include>
                      </includes>
                  </resource>
              </resources>
          </build>
          <profiles>
              <profile>
                  <id>dev</id>
                  <properties>
                      <profile.active>dev</profile.active>
                  </properties>
                  <activation>
                      <activeByDefault>true</activeByDefault>
                  </activation>
              </profile>
              <profile>
                  <id>test</id>
                  <properties>
                      <profile.active>test</profile.active>
                  </properties>
              </profile>
              <profile>
                  <id>prod</id>
                  <properties>
                      <profile.active>prod</profile.active>
                  </properties>
              </profile>
          </profiles>
      </project>
      
    3. Run output
      Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
      18/06/27 10:30:40 INFO metastore: Trying to connect to metastore with URI thrift://cdh01:9083
      18/06/27 10:30:41 WARN ShellBasedUnixGroupsMapping: got exception trying to get groups for user root: GetLocalGroupsForUser error (1332): ?????????????????
      18/06/27 10:30:41 WARN UserGroupInformation: No groups available for user root
      18/06/27 10:30:41 INFO metastore: Connected to metastore.
      18/06/27 10:30:42 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
      18/06/27 10:30:42 WARN UserGroupInformation: No groups available for user root
      18/06/27 10:30:42 WARN UserGroupInformation: No groups available for user root
      18/06/27 10:30:42 WARN UserGroupInformation: No groups available for user root
      18/06/27 10:30:42 WARN UserGroupInformation: No groups available for user root
      +---+--------+------------+
      | id|    name|       hobby|
      +---+--------+------------+
      |  1|zhangsan|[唱歌, 跳舞, 游泳]|
      |  2|    lisi|   [打游戏, 篮球]|
      |  3|  wangwu|    [唱歌, 游泳]|
      +---+--------+------------+
      
      
      Process finished with exit code 0
      
  2. Problems encountered
    1. The winutils binary cannot be found locally

      Error log:

      18/06/27 10:35:18 ERROR Shell: Failed to locate the winutils binary in the hadoop binary path
      java.io.IOException: Could not locate executable null\bin\winutils.exe in the Hadoop binaries.
          at org.apache.hadoop.util.Shell.getQualifiedBinPath(Shell.java:378)
          at org.apache.hadoop.util.Shell.getWinUtilsPath(Shell.java:393)
          at org.apache.hadoop.util.Shell.getGroupsForUserCommand(Shell.java:163)
          at org.apache.hadoop.security.ShellBasedUnixGroupsMapping.getUnixGroups(ShellBasedUnixGroupsMapping.java:84)
          at org.apache.hadoop.security.ShellBasedUnixGroupsMapping.getGroups(ShellBasedUnixGroupsMapping.java:52)
          at org.apache.hadoop.security.Groups$GroupCacheLoader.fetchGroupList(Groups.java:231)
          at org.apache.hadoop.security.Groups$GroupCacheLoader.load(Groups.java:211)
          at org.apache.hadoop.security.Groups$GroupCacheLoader.load(Groups.java:199)
          at com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3524)
          at com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2317)
          at com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2280)
          at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2195)
          at com.google.common.cache.LocalCache.get(LocalCache.java:3934)
          at com.google.common.cache.LocalCache.getOrLoad(LocalCache.java:3938)
          at com.google.common.cache.LocalCache$LocalLoadingCache.get(LocalCache.java:4821)
          at org.apache.hadoop.security.Groups.getGroups(Groups.java:173)
          at org.apache.hadoop.security.UserGroupInformation.getGroupNames(UserGroupInformation.java:1552)
          at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.open(HiveMetaStoreClient.java:436)
          at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:236)
          at org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient.<init>(SessionHiveMetaStoreClient.java:74)
          at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
          at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
          at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
          at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
          at org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1521)
          at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.<init>(RetryingMetaStoreClient.java:86)
          at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:132)
          at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:104)
          at org.apache.hadoop.hive.ql.metadata.Hive.createMetaStoreClient(Hive.java:3005)
          at org.apache.hadoop.hive.ql.metadata.Hive.getMSC(Hive.java:3024)
          at org.apache.hadoop.hive.ql.metadata.Hive.getAllDatabases(Hive.java:1234)
          at org.apache.hadoop.hive.ql.metadata.Hive.reloadFunctions(Hive.java:174)
          at org.apache.hadoop.hive.ql.metadata.Hive.<clinit>(Hive.java:166)
          at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:503)
          at org.apache.spark.sql.hive.client.HiveClientImpl.<init>(HiveClientImpl.scala:191)
          at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
          at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
          at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
          at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
          at org.apache.spark.sql.hive.client.IsolatedClientLoader.createClient(IsolatedClientLoader.scala:264)
          at org.apache.spark.sql.hive.HiveUtils$.newClientForMetadata(HiveUtils.scala:362)
          at org.apache.spark.sql.hive.HiveUtils$.newClientForMetadata(HiveUtils.scala:266)
          at org.apache.spark.sql.hive.HiveExternalCatalog.client$lzycompute(HiveExternalCatalog.scala:66)
          at org.apache.spark.sql.hive.HiveExternalCatalog.client(HiveExternalCatalog.scala:65)
          at org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$databaseExists$1.apply$mcZ$sp(HiveExternalCatalog.scala:194)
          at org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$databaseExists$1.apply(HiveExternalCatalog.scala:194)
          at org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$databaseExists$1.apply(HiveExternalCatalog.scala:194)
          at org.apache.spark.sql.hive.HiveExternalCatalog.withClient(HiveExternalCatalog.scala:97)
          at org.apache.spark.sql.hive.HiveExternalCatalog.databaseExists(HiveExternalCatalog.scala:193)
          at org.apache.spark.sql.internal.SharedState.externalCatalog$lzycompute(SharedState.scala:105)
          at org.apache.spark.sql.internal.SharedState.externalCatalog(SharedState.scala:93)
          at org.apache.spark.sql.hive.HiveSessionStateBuilder.externalCatalog(HiveSessionStateBuilder.scala:39)
          at org.apache.spark.sql.hive.HiveSessionStateBuilder.catalog$lzycompute(HiveSessionStateBuilder.scala:54)
          at org.apache.spark.sql.hive.HiveSessionStateBuilder.catalog(HiveSessionStateBuilder.scala:52)
          at org.apache.spark.sql.hive.HiveSessionStateBuilder.catalog(HiveSessionStateBuilder.scala:35)
          at org.apache.spark.sql.internal.BaseSessionStateBuilder.build(BaseSessionStateBuilder.scala:289)
          at org.apache.spark.sql.SparkSession$.org$apache$spark$sql$SparkSession$$instantiateSessionState(SparkSession.scala:1050)
          at org.apache.spark.sql.SparkSession$$anonfun$sessionState$2.apply(SparkSession.scala:130)
          at org.apache.spark.sql.SparkSession$$anonfun$sessionState$2.apply(SparkSession.scala:130)
          at scala.Option.getOrElse(Option.scala:121)
          at org.apache.spark.sql.SparkSession.sessionState$lzycompute(SparkSession.scala:129)
          at org.apache.spark.sql.SparkSession.sessionState(SparkSession.scala:126)
          at org.apache.spark.sql.SparkSession$Builder$$anonfun$getOrCreate$5.apply(SparkSession.scala:938)
          at org.apache.spark.sql.SparkSession$Builder$$anonfun$getOrCreate$5.apply(SparkSession.scala:938)
          at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:130)
          at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:130)
          at scala.collection.mutable.HashTable$class.foreachEntry(HashTable.scala:236)
          at scala.collection.mutable.HashMap.foreachEntry(HashMap.scala:40)
          at scala.collection.mutable.HashMap.foreach(HashMap.scala:130)
          at org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:938)
          at com.cloudera.SparkHiveAPP$.main(SparkHiveAPP.scala:24)
          at com.cloudera.SparkHiveAPP.main(SparkHiveAPP.scala)
      

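      Solution:

      1. Download the winutils files from https://github.com/steveloughran/winutils

      2. Set the HADOOP_HOME environment variable so that bin\winutils.exe can be found under it. The "null" in null\bin\winutils.exe in the log above is the unset Hadoop home directory.
        On the local machine, configure: HADOOP_HOME=D:\winutils-master\hadoop-2.6.0

        Alternatively, set HADOOP_HOME in the IDEA run configuration.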
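
      Besides the environment variable, Hadoop also reads the hadoop.home.dir system property, so the same setting can be sketched in code. The object WinutilsSetup and its methods are illustrative names, and the path in the usage note is simply the one from step 2; adjust both to the local setup.

      ```scala
      import java.io.File

      object WinutilsSetup {
        /** True when bin\winutils.exe exists under `hadoopHome`. */
        def hasWinutils(hadoopHome: String): Boolean =
          new File(new File(hadoopHome, "bin"), "winutils.exe").isFile

        /** Keeps an already configured Hadoop home; otherwise sets hadoop.home.dir to
          * `hadoopHome` if it contains winutils.exe. Returns the effective home, if any. */
        def ensureHadoopHome(hadoopHome: String): Option[String] = {
          val existing = sys.props.get("hadoop.home.dir").orElse(sys.env.get("HADOOP_HOME"))
          existing.orElse {
            if (hasWinutils(hadoopHome)) {
              System.setProperty("hadoop.home.dir", hadoopHome)
              Some(hadoopHome)
            } else None
          }
        }
      }
      ```

      A call such as WinutilsSetup.ensureHadoopHome("D:\\winutils-master\\hadoop-2.6.0") would have to run before SparkSession.builder(), because Hadoop resolves its home directory when its Shell class is first loaded.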


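    2. Cannot access the metastore: SessionHiveMetaStoreClient cannot be instantiated

    Cause: once the HBase integration JARs (commented out in the pom.xml above) are added, accessing Hive throws the exception below, which looks different from the error seen without HBase. The innermost NullPointerException, however, is raised in the same ShellBasedUnixGroupsMapping group lookup as in the previous problem, so the fix is the same: make winutils available through HADOOP_HOME.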
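
    The Hadoop frames in this trace carry different line numbers from those in the first one (for example UserGroupInformation.java:1474 here versus 1552 there), which suggests, as an inference rather than something the original states, that the HBase dependencies put a different Hadoop version on the classpath. A small sketch for checking which JAR a class is loaded from follows; ClassOrigin and locationOf are hypothetical names.

    ```scala
    import scala.util.Try

    object ClassOrigin {
      /** Returns the JAR or directory a class comes from, or None when the class is
        * not on the classpath or has no code source (as for JDK classes). */
      def locationOf(className: String): Option[String] =
        Try(Class.forName(className, false, getClass.getClassLoader)).toOption
          .flatMap(c => Option(c.getProtectionDomain.getCodeSource))
          .flatMap(cs => Option(cs.getLocation))
          .map(_.toString)

      def main(args: Array[String]): Unit =
        Seq("org.apache.hadoop.util.Shell", "org.apache.hadoop.security.UserGroupInformation")
          .foreach(name => println(s"$name -> ${locationOf(name).getOrElse("not found")}"))
    }
    ```

    Comparing the printed locations with and without the HBase dependencies enabled should show whether the Hadoop JARs change.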

    Error log:

    log4j:WARN No appenders could be found for logger (org.apache.hadoop.util.Shell).
    log4j:WARN Please initialize the log4j system properly.
    log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
    Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
    18/06/27 10:47:01 INFO metastore: Trying to connect to metastore with URI thrift://cdh01:9083
    18/06/27 10:47:01 WARN Hive: Failed to access metastore. This class should not accessed in runtime.
    org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient
        at org.apache.hadoop.hive.ql.metadata.Hive.getAllDatabases(Hive.java:1236)
        at org.apache.hadoop.hive.ql.metadata.Hive.reloadFunctions(Hive.java:174)
        at org.apache.hadoop.hive.ql.metadata.Hive.<clinit>(Hive.java:166)
        at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:503)
        at org.apache.spark.sql.hive.client.HiveClientImpl.<init>(HiveClientImpl.scala:191)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
        at org.apache.spark.sql.hive.client.IsolatedClientLoader.createClient(IsolatedClientLoader.scala:264)
        at org.apache.spark.sql.hive.HiveUtils$.newClientForMetadata(HiveUtils.scala:362)
        at org.apache.spark.sql.hive.HiveUtils$.newClientForMetadata(HiveUtils.scala:266)
        at org.apache.spark.sql.hive.HiveExternalCatalog.client$lzycompute(HiveExternalCatalog.scala:66)
        at org.apache.spark.sql.hive.HiveExternalCatalog.client(HiveExternalCatalog.scala:65)
        at org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$databaseExists$1.apply$mcZ$sp(HiveExternalCatalog.scala:194)
        at org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$databaseExists$1.apply(HiveExternalCatalog.scala:194)
        at org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$databaseExists$1.apply(HiveExternalCatalog.scala:194)
        at org.apache.spark.sql.hive.HiveExternalCatalog.withClient(HiveExternalCatalog.scala:97)
        at org.apache.spark.sql.hive.HiveExternalCatalog.databaseExists(HiveExternalCatalog.scala:193)
        at org.apache.spark.sql.internal.SharedState.externalCatalog$lzycompute(SharedState.scala:105)
        at org.apache.spark.sql.internal.SharedState.externalCatalog(SharedState.scala:93)
        at org.apache.spark.sql.hive.HiveSessionStateBuilder.externalCatalog(HiveSessionStateBuilder.scala:39)
        at org.apache.spark.sql.hive.HiveSessionStateBuilder.catalog$lzycompute(HiveSessionStateBuilder.scala:54)
        at org.apache.spark.sql.hive.HiveSessionStateBuilder.catalog(HiveSessionStateBuilder.scala:52)
        at org.apache.spark.sql.hive.HiveSessionStateBuilder.catalog(HiveSessionStateBuilder.scala:35)
        at org.apache.spark.sql.internal.BaseSessionStateBuilder.build(BaseSessionStateBuilder.scala:289)
        at org.apache.spark.sql.SparkSession$.org$apache$spark$sql$SparkSession$$instantiateSessionState(SparkSession.scala:1050)
        at org.apache.spark.sql.SparkSession$$anonfun$sessionState$2.apply(SparkSession.scala:130)
        at org.apache.spark.sql.SparkSession$$anonfun$sessionState$2.apply(SparkSession.scala:130)
        at scala.Option.getOrElse(Option.scala:121)
        at org.apache.spark.sql.SparkSession.sessionState$lzycompute(SparkSession.scala:129)
        at org.apache.spark.sql.SparkSession.sessionState(SparkSession.scala:126)
        at org.apache.spark.sql.SparkSession$Builder$$anonfun$getOrCreate$5.apply(SparkSession.scala:938)
        at org.apache.spark.sql.SparkSession$Builder$$anonfun$getOrCreate$5.apply(SparkSession.scala:938)
        at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:130)
        at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:130)
        at scala.collection.mutable.HashTable$class.foreachEntry(HashTable.scala:236)
        at scala.collection.mutable.HashMap.foreachEntry(HashMap.scala:40)
        at scala.collection.mutable.HashMap.foreach(HashMap.scala:130)
        at org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:938)
        at com.cloudera.SparkHiveAPP$.main(SparkHiveAPP.scala:24)
        at com.cloudera.SparkHiveAPP.main(SparkHiveAPP.scala)
    Caused by: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient
        at org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1523)
        at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.<init>(RetryingMetaStoreClient.java:86)
        at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:132)
        at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:104)
        at org.apache.hadoop.hive.ql.metadata.Hive.createMetaStoreClient(Hive.java:3005)
        at org.apache.hadoop.hive.ql.metadata.Hive.getMSC(Hive.java:3024)
        at org.apache.hadoop.hive.ql.metadata.Hive.getAllDatabases(Hive.java:1234)
        ... 41 more
    Caused by: java.lang.reflect.InvocationTargetException
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
        at org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1521)
        ... 47 more
    Caused by: java.lang.NullPointerException
        at java.lang.ProcessBuilder.start(ProcessBuilder.java:1012)
        at org.apache.hadoop.util.Shell.runCommand(Shell.java:482)
        at org.apache.hadoop.util.Shell.run(Shell.java:455)
        at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:702)
        at org.apache.hadoop.util.Shell.execCommand(Shell.java:791)
        at org.apache.hadoop.util.Shell.execCommand(Shell.java:774)
        at org.apache.hadoop.security.ShellBasedUnixGroupsMapping.getUnixGroups(ShellBasedUnixGroupsMapping.java:84)
        at org.apache.hadoop.security.ShellBasedUnixGroupsMapping.getGroups(ShellBasedUnixGroupsMapping.java:52)
        at org.apache.hadoop.security.Groups.getGroups(Groups.java:139)
        at org.apache.hadoop.security.UserGroupInformation.getGroupNames(UserGroupInformation.java:1474)
        at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.open(HiveMetaStoreClient.java:436)
        at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:236)
        at org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient.<init>(SessionHiveMetaStoreClient.java:74)
        ... 52 more
    18/06/27 10:47:01 INFO metastore: Trying to connect to metastore with URI thrift://cdh01:9083
    Exception in thread "main" java.lang.IllegalArgumentException: Error while instantiating 'org.apache.spark.sql.hive.HiveSessionStateBuilder':
        at org.apache.spark.sql.SparkSession$.org$apache$spark$sql$SparkSession$$instantiateSessionState(SparkSession.scala:1053)
        at org.apache.spark.sql.SparkSession$$anonfun$sessionState$2.apply(SparkSession.scala:130)
        at org.apache.spark.sql.SparkSession$$anonfun$sessionState$2.apply(SparkSession.scala:130)
        at scala.Option.getOrElse(Option.scala:121)
        at org.apache.spark.sql.SparkSession.sessionState$lzycompute(SparkSession.scala:129)
        at org.apache.spark.sql.SparkSession.sessionState(SparkSession.scala:126)
        at org.apache.spark.sql.SparkSession$Builder$$anonfun$getOrCreate$5.apply(SparkSession.scala:938)
        at org.apache.spark.sql.SparkSession$Builder$$anonfun$getOrCreate$5.apply(SparkSession.scala:938)
        at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:130)
        at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:130)
        at scala.collection.mutable.HashTable$class.foreachEntry(HashTable.scala:236)
        at scala.collection.mutable.HashMap.foreachEntry(HashMap.scala:40)
        at scala.collection.mutable.HashMap.foreach(HashMap.scala:130)
        at org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:938)
        at com.cloudera.SparkHiveAPP$.main(SparkHiveAPP.scala:24)
        at com.cloudera.SparkHiveAPP.main(SparkHiveAPP.scala)
    Caused by: org.apache.spark.sql.AnalysisException: java.lang.RuntimeException: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient;
        at org.apache.spark.sql.hive.HiveExternalCatalog.withClient(HiveExternalCatalog.scala:106)
        at org.apache.spark.sql.hive.HiveExternalCatalog.databaseExists(HiveExternalCatalog.scala:193)
        at org.apache.spark.sql.internal.SharedState.externalCatalog$lzycompute(SharedState.scala:105)
        at org.apache.spark.sql.internal.SharedState.externalCatalog(SharedState.scala:93)
        at org.apache.spark.sql.hive.HiveSessionStateBuilder.externalCatalog(HiveSessionStateBuilder.scala:39)
        at org.apache.spark.sql.hive.HiveSessionStateBuilder.catalog$lzycompute(HiveSessionStateBuilder.scala:54)
        at org.apache.spark.sql.hive.HiveSessionStateBuilder.catalog(HiveSessionStateBuilder.scala:52)
        at org.apache.spark.sql.hive.HiveSessionStateBuilder.catalog(HiveSessionStateBuilder.scala:35)
        at org.apache.spark.sql.internal.BaseSessionStateBuilder.build(BaseSessionStateBuilder.scala:289)
        at org.apache.spark.sql.SparkSession$.org$apache$spark$sql$SparkSession$$instantiateSessionState(SparkSession.scala:1050)
        ... 15 more
    Caused by: java.lang.RuntimeException: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient
        at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:522)
        at org.apache.spark.sql.hive.client.HiveClientImpl.<init>(HiveClientImpl.scala:191)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
        at org.apache.spark.sql.hive.client.IsolatedClientLoader.createClient(IsolatedClientLoader.scala:264)
        at org.apache.spark.sql.hive.HiveUtils$.newClientForMetadata(HiveUtils.scala:362)
        at org.apache.spark.sql.hive.HiveUtils$.newClientForMetadata(HiveUtils.scala:266)
        at org.apache.spark.sql.hive.HiveExternalCatalog.client$lzycompute(HiveExternalCatalog.scala:66)
        at org.apache.spark.sql.hive.HiveExternalCatalog.client(HiveExternalCatalog.scala:65)
        at org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$databaseExists$1.apply$mcZ$sp(HiveExternalCatalog.scala:194)
        at org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$databaseExists$1.apply(HiveExternalCatalog.scala:194)
        at org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$databaseExists$1.apply(HiveExternalCatalog.scala:194)
        at org.apache.spark.sql.hive.HiveExternalCatalog.withClient(HiveExternalCatalog.scala:97)
        ... 24 more
    Caused by: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient
        at org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1523)
        at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.<init>(RetryingMetaStoreClient.java:86)
        at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:132)
        at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:104)
        at org.apache.hadoop.hive.ql.metadata.Hive.createMetaStoreClient(Hive.java:3005)
        at org.apache.hadoop.hive.ql.metadata.Hive.getMSC(Hive.java:3024)
        at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:503)
        ... 38 more
    Caused by: java.lang.reflect.InvocationTargetException
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
        at org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1521)
        ... 44 more
    Caused by: java.lang.NullPointerException
        at java.lang.ProcessBuilder.start(ProcessBuilder.java:1012)
        at org.apache.hadoop.util.Shell.runCommand(Shell.java:482)
        at org.apache.hadoop.util.Shell.run(Shell.java:455)
        at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:702)
        at org.apache.hadoop.util.Shell.execCommand(Shell.java:791)
        at org.apache.hadoop.util.Shell.execCommand(Shell.java:774)
        at org.apache.hadoop.security.ShellBasedUnixGroupsMapping.getUnixGroups(ShellBasedUnixGroupsMapping.java:84)
        at org.apache.hadoop.security.ShellBasedUnixGroupsMapping.getGroups(ShellBasedUnixGroupsMapping.java:52)
        at org.apache.hadoop.security.Groups.getGroups(Groups.java:139)
        at org.apache.hadoop.security.UserGroupInformation.getGroupNames(UserGroupInformation.java:1474)
        at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.open(HiveMetaStoreClient.java:436)
        at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:236)
        at org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient.<init>(SessionHiveMetaStoreClient.java:74)
        ... 49 more
    
    Process finished with exit code 1
    
    