Spark download: https://archive.apache.org/dist/spark/
1. Configure the Spark environment variables, and copy the following jars from Spark's jars directory into Hive's lib directory:
spark-core_2.12-3.0.0.jar
spark-kvstore_2.12-3.0.0.jar
spark-launcher_2.12-3.0.0.jar
spark-network-common_2.12-3.0.0.jar
spark-network-shuffle_2.12-3.0.0.jar
spark-tags_2.12-3.0.0.jar
spark-unsafe_2.12-3.0.0.jar
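The copy step above can be sketched as a small shell loop. The two directories below are created with mktemp only so the snippet is self-contained; on a real node they would be $SPARK_HOME/jars and Hive's lib directory (/opt/module/hive/lib in this setup).

```shell
# Sketch of the jar-copy step. SPARK_JARS / HIVE_LIB are temp-dir stand-ins
# for $SPARK_HOME/jars and $HIVE_HOME/lib so the demo runs anywhere.
SPARK_JARS=$(mktemp -d)
HIVE_LIB=$(mktemp -d)
for name in core kvstore launcher network-common network-shuffle tags unsafe; do
  touch "$SPARK_JARS/spark-${name}_2.12-3.0.0.jar"   # simulate the Spark jar
  cp "$SPARK_JARS/spark-${name}_2.12-3.0.0.jar" "$HIVE_LIB/"
done
ls "$HIVE_LIB"
```

On the real machine, replace the two mktemp lines with the actual paths and drop the touch line.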
2. Create the Spark configuration file for Hive
1) vim /opt/module/hive/conf/spark-defaults.conf
spark.master yarn
spark.eventLog.enabled true
spark.eventLog.dir hdfs://hadoop102:8020/spark-history
spark.executor.memory 1g
spark.driver.memory 1g
2) hadoop fs -mkdir /spark-history
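The two substeps above can be scripted as follows. CONF_DIR is a temp-dir stand-in so the sketch runs standalone; in practice it is Hive's conf directory (/opt/module/hive/conf here), and the hadoop fs -mkdir must still be run against the real cluster.

```shell
# Sketch: generate spark-defaults.conf. CONF_DIR stands in for $HIVE_HOME/conf.
CONF_DIR=$(mktemp -d)
cat > "$CONF_DIR/spark-defaults.conf" <<'EOF'
spark.master              yarn
spark.eventLog.enabled    true
spark.eventLog.dir        hdfs://hadoop102:8020/spark-history
spark.executor.memory     1g
spark.driver.memory       1g
EOF
# The event-log directory must also exist on HDFS (run on the cluster):
# hadoop fs -mkdir /spark-history
```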
3. Upload the jars from the pure (without-hadoop) Spark build to HDFS
1) hadoop fs -mkdir /spark-jars
2) hadoop fs -put /opt/module/spark-3.0.0-bin-without-hadoop/jars/* /spark-jars
4. Modify hive-site.xml in Hive
<!-- Spark dependency jars (note: the port must match the Hadoop NameNode port) -->
<property>
<name>spark.yarn.jars</name>
<value>hdfs://hadoop102:8020/spark-jars/*</value>
</property>
<!-- Hive execution engine -->
<property>
<name>hive.execution.engine</name>
<value>spark</value>
</property>
<!-- Timeout for the Hive-Spark client connection -->
<property>
<name>hive.spark.client.connect.timeout</name>
<value>10000ms</value>
</property>
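After restarting the Hive services, the setup can be sanity-checked from the Hive CLI. This is a sketch; the table name below is hypothetical.

```sql
-- Should print: hive.execution.engine=spark
set hive.execution.engine;

-- Hypothetical smoke-test table: the INSERT submits the first Spark job
-- to YARN, so it may take a while to start the Spark session.
create table spark_smoke_test (id int);
insert into spark_smoke_test values (1);
select * from spark_smoke_test;
```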