Hadoop 3 + Hive 3 Installation Notes (Building a Distributed Environment on Virtual Machines)

Note: This article is a personal study log that draws on material shared online; if anything here infringes on your rights, contact me and I will remove it. For technical sharing only, not commercial use.


Hadoop official download page

Hive official download page


Overview

Tooling: VMware 14

Goal: create three virtual machines in bridged networking mode, all on the same subnet, so that the three hosts can ping each other.

Steps:
① Download the CentOS 7 ISO image and create the first virtual machine in VMware;
② Clone it in VMware to create the remaining two virtual machines;
③ Set the hostname and network for all three systems, and set up passwordless SSH between them;
④ Install the JDK;
⑤ Install Hadoop 3.2;
⑥ Install Hive 3.1;


Configuring the Linux hosts

  • Change the hostname (CentOS 7)
 # This command takes effect immediately and survives reboots
[root@smallsuperman ~]# hostnamectl set-hostname outman00
[root@smallsuperman ~]# hostname
outman00

 # Edit the hosts file and add the hostname to the 127.0.0.1 entry
[root@smallsuperman ~]# vi /etc/hosts  
[root@smallsuperman ~]# cat /etc/hosts 
127.0.0.1   localhost smallsuperman.centos localhost4 localhost4.localdomain4 outman00
::1         localhost smallsuperman.centos localhost6 localhost6.localdomain6
  • Map the internal IPs
# so hostnames can be used in place of IPs
[root@smallsuperman ~]# sed -i '$a\192.168.233.132 outman00' /etc/hosts
[root@smallsuperman ~]# sed -i '$a\192.168.233.130 outman01' /etc/hosts
[root@smallsuperman ~]# sed -i '$a\192.168.233.131 outman02' /etc/hosts
[root@smallsuperman ~]# ping outman00 # check reachability
PING localhost (127.0.0.1) 56(84) bytes of data
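Running those `sed -i '$a...'` appends a second time duplicates the entries. A small guard makes the append idempotent; this is only a sketch, demonstrated on a scratch file (`hosts.demo`, a hypothetical name) rather than the real /etc/hosts:

```shell
# Append "ip name" to a hosts-style file only if the name is not mapped yet.
add_host() {
  file=$1; ip=$2; name=$3
  grep -qw "$name" "$file" 2>/dev/null || printf '%s %s\n' "$ip" "$name" >> "$file"
}

add_host hosts.demo 192.168.233.132 outman00
add_host hosts.demo 192.168.233.132 outman00   # no-op: already present
cat hosts.demo   # -> 192.168.233.132 outman00
```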
  • Turn off the firewall
# Check the firewall status
# "active (running)" in green means the firewall is up
# "inactive (dead)" means the firewall is already off
[root@smallsuperman ~]# systemctl status firewalld.service

# Stop the running firewall
[root@smallsuperman ~]# systemctl stop firewalld.service

# Disable the firewall service so it does not come back on reboot
[root@smallsuperman ~]# systemctl disable firewalld.service
  • Create a hadoop user and grant it sudo rights
# Add a user
[root@smallsuperman ~]# adduser hadoop

# Grant sudo rights
As root, open the sudoers file: vi /etc/sudoers
Find the line
root ALL=(ALL) ALL
and add below it
hadoop ALL=(ALL) ALL

  • Set up passwordless SSH
# Generate a key pair on every host (just press Enter at every prompt) 
[hadoop@outman02 ~]$ ssh-keygen

Generating public/private rsa key pair.
Enter file in which to save the key (/home/hadoop/.ssh/id_rsa): 
Created directory '/home/hadoop/.ssh'.
Enter passphrase (empty for no passphrase): 
Enter same passphrase again: 
Your identification has been saved in /home/hadoop/.ssh/id_rsa.
Your public key has been saved in /home/hadoop/.ssh/id_rsa.pub.
The key fingerprint is:
SHA256:rKWn0xjk+J3ZShFMO3pYx6sBxCpl1YkUojNsG4HQ1iE hadoop@outman02
The key's randomart image is:
+---[RSA 2048]----+
|ooE.o==+..       |
|..o++ooooo       |
| .Bo .. * o      |
| ..=. .* + .     |
|  .. +o S .      |
|    . o= +       |
|     .o=++       |
|      ++= .      |
|      ....       |
+----[SHA256]-----+

# Establish passwordless access to the other hosts by copying this host's public key into their authorized_keys files.
[hadoop@outman02 ~]$ ssh-copy-id outman01
[hadoop@outman02 ~]$ ssh-copy-id outman00
[hadoop@outman01 ~]$ ssh-copy-id outman00
[hadoop@outman01 ~]$ ssh-copy-id outman02
[hadoop@outman00 ~]$ ssh-copy-id outman01
[hadoop@outman00 ~]$ ssh-copy-id outman02

# Test the passwordless login
[hadoop@outman00 ~]$ ssh hadoop@outman01
Last failed login: Mon Jun  3 02:16:04 CST 2019 from outman00 on ssh:notty
There were 3 failed login attempts since the last successful login.
Last login: Mon Jun  3 02:13:15 2019
[hadoop@outman01 ~]$ pwd
/home/hadoop
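The six `ssh-copy-id` commands above form a full mesh: every host pushes its key to every other host, which for n hosts means n·(n−1) copies. A dry-run sketch that only prints the required pairs (nothing is actually copied):

```shell
# Enumerate the ssh-copy-id runs a full mesh of three hosts needs.
hosts="outman00 outman01 outman02"
for src in $hosts; do
  for dst in $hosts; do
    [ "$src" = "$dst" ] && continue          # no key copy to oneself
    echo "on $src run: ssh-copy-id hadoop@$dst"
  done
done
```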

Installing the JDK

1. Download the installation tarball from the official site.
2. Upload a copy to a temporary directory on one host (mine is /tmp/tar_gz), then scp it to the same location on the other two servers.

  • Extract it into the target directory
[root@outman02 tar_gz]# tar -zxvf jdk-8u211-linux-x64.tar.gz -C /usr/local/my_app
  • Configure environment variables
[root@outman00 jdk1.8.0_211]# sed  -i '$a\\nexport JAVA_HOME=/usr/local/my_app/jdk1.8.0_211\nexport PATH=$PATH:$JAVA_HOME/bin ' /etc/profile
# Reload the environment variables 
[root@outman00 jdk1.8.0_211]# source /etc/profile
# Verify the installation
[root@outman00 jdk1.8.0_211]# java -version
java version "1.8.0_211"
Java(TM) SE Runtime Environment (build 1.8.0_211-b12)
Java HotSpot(TM) 64-Bit Server VM (build 25.211-b12, mixed mode)

Installing Hadoop

Main directories
(1) bin: scripts for operating on the Hadoop services (HDFS, YARN)
(2) etc: Hadoop's configuration directory, holding the configuration files
(3) lib: Hadoop's native libraries (data compression and decompression)
(4) sbin: scripts that start and stop the Hadoop services
(5) share: Hadoop's dependency jars, documentation, and official examples

  • Extract it into the target directory
[root@outman00 tar_gz]# tar -zxvf hadoop-3.2.0.tar.gz  -C /usr/local/my_app/hadoop

Editing the configuration files

  • Go to the configuration path under the extracted directory
[root@outman00 hadoop-3.2.0]# cd /usr/local/my_app/hadoop/hadoop-3.2.0/etc/hadoop/
  • Set the JDK path in the configuration
[root@outman00 hadoop]# vi /usr/local/my_app/hadoop/hadoop-3.2.0/etc/hadoop/hadoop-env.sh

# Add the JDK path
    52 # The java implementation to use. By default, this environment
    53 # variable is REQUIRED on ALL platforms except OS X!
    54 export JAVA_HOME=/usr/local/my_app/jdk1.8.0_211
  • Edit core-site.xml

fs.defaultFS: address and port of the HDFS NameNode.
hadoop.tmp.dir: directory for files Hadoop generates at run time.

<configuration>
<property>
 <name>fs.defaultFS</name>
 <value>hdfs://outman00:9000</value>
</property>
<property>
 <name>hadoop.tmp.dir</name>
 <value>/usr/local/my_app/hadoop/hadoop_data</value>
</property>
</configuration>
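fs.defaultFS is just a URI, scheme://host:port; every client resolves the host part through /etc/hosts or DNS, which is why the host mappings set up earlier must be correct. A pure-shell parse of the value (no cluster needed):

```shell
# Split an fs.defaultFS value into its host and port parts.
uri="hdfs://outman00:9000"
hostport=${uri#hdfs://}            # strip the scheme
echo "host=${hostport%:*} port=${hostport#*:}"   # -> host=outman00 port=9000
```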
  • Edit hdfs-site.xml
<configuration>
<property>
  <name>dfs.namenode.name.dir</name>
  <value>/usr/local/my_app/hadoop/hadoop_data/namenode_data</value>
  <description>Metadata storage directory; for safety it can be configured to another location</description>
</property>

<property>
  <name>dfs.datanode.data.dir</name>
  <value>/usr/local/my_app/hadoop/hadoop_data/datanode_data</value>
  <description>Data storage directory of the DataNode</description>
</property>

<property>
   <name>dfs.replication</name>
   <value>2</value>
   <description>Number of replicas for each HDFS data block</description>
</property>

<property>
   <name>dfs.secondary.http.address</name>
   <value>outman01:50090</value>
   <description>SecondaryNameNode address; best placed on a node other than the NameNode (dfs.secondary.http.address is the deprecated Hadoop 2 spelling; Hadoop 3 prefers dfs.namenode.secondary.http-address)</description>
   </property>
</configuration>
  • Edit yarn-site.xml

yarn.nodemanager.aux-services: the shuffle service the YARN cluster provides to MapReduce programs
yarn.resourcemanager.hostname: where the ResourceManager runs

<configuration>

<!-- How reducers fetch data -->
<property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
</property>

<!-- The address of the YARN ResourceManager -->
<property>
    <name>yarn.resourcemanager.hostname</name>
    <value>outman00</value>
</property>

</configuration>
  • Edit mapred-site.xml

Use YARN as the resource-scheduling framework (i.e., run MapReduce on YARN)

<configuration>
<property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
</property>
</configuration>
  • Edit workers (before 3.0 this file was called slaves)
outman00
outman01
outman02
  • Distribute the installed Hadoop directory to the other two nodes (via scp)
[hadoop@outman00 hadoop]$ scp -r hadoop-3.2.0/ hadoop@outman02:/usr/local/my_app/hadoop
[hadoop@outman00 hadoop]$ scp -r hadoop-3.2.0/ hadoop@outman01:/usr/local/my_app/hadoop
  • Configure environment variables (/etc/profile on all three nodes)
[root@outman00 hadoop]# sed -i '$a\export HADOOP_HOME=/usr/local/my_app/hadoop/hadoop-3.2.0\nexport PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin' /etc/profile

# Reload the environment configuration
[root@outman00 hadoop]# source /etc/profile

# Verify
[root@outman02 hadoop]# hadoop --help

  • Format the NameNode

On the HDFS master node (the fs.defaultFS host configured in core-site.xml), run the format command. On success it creates the data directories from the configuration; to re-initialize, delete those directories and run it again.

[root@outman00 hadoop]# hadoop namenode -format

# Key line indicating success
2019-06-05 01:58:12,198 INFO common.Storage: Storage directory /usr/local/my_app/hadoop/hadoop_data/namenode_data has been successfully formatted.
  • Start HDFS (the start script can be run on any node)
[hadoop@outman00 ~]$ start-dfs.sh

[hadoop@outman00 ~]$ jps
8949 DataNode
8840 NameNode
9229 Jps
[hadoop@outman01 hadoop_data]$ jps
8071 SecondaryNameNode
8137 Jps
7997 DataNode
[hadoop@outman02 hadoop_data]$ jps
7973 Jps
7817 DataNode
  • A problem appears

    • The NameNode master can access HDFS, but the other two nodes cannot
  • Error messages
# Access from the master node works
[hadoop@outman00 ~]$ hadoop fs -ls /
Found 1 items
drwxr-xr-x   - hadoop supergroup          0 2019-06-06 00:34 /dyp

# The other nodes cannot access it
[hadoop@outman01 ~]$ hadoop fs -ls /
ls: Call From outman01/192.168.233.130 to outman00:9000 failed on connection exception: java.net.ConnectException: Connection refused; For more details see:  http://wiki.apache.org/hadoop/ConnectionRefused

# The other nodes cannot access it
[hadoop@outman02 ~]$ hadoop fs -ls /
ls: Call From localhost/127.0.0.1 to outman00:9000 failed on connection exception: java.net.ConnectException: Connection refused; For more details see:  http://wiki.apache.org/hadoop/ConnectionRefused
  • Following core-site.xml, check whether the master's port 9000 is reachable
# From the master node it is
[root@outman00 datanode_data]# telnet  outman00 9000
Trying 127.0.0.1...
Connected to outman00.
Escape character is '^]'.

# From the other nodes it is not
[root@outman01 ~]# telnet  outman00 9000
Trying 192.168.233.132...
telnet: connect to address 192.168.233.132: Connection refused
[root@outman02 xinetd.d]# telnet  outman00 9000
Trying 192.168.233.132...
telnet: connect to address 192.168.233.132: Connection refused
  • Check what is listening on port 9000 on the master

Port 9000 is bound to 127.0.0.1, i.e. only local processes can reach it (the NameNode binds port 9000 to whatever address its configured hostname resolves to, which here is the loopback address).

[root@outman00 datanode_data]# lsof -i:9000
COMMAND  PID   USER   FD   TYPE DEVICE SIZE/OFF NODE NAME
java    7339 hadoop  269u  IPv4  42236      0t0  TCP localhost:cslistener (LISTEN)
java    7339 hadoop  279u  IPv4  44037      0t0  TCP localhost:cslistener->localhost:51560 (ESTABLISHED)
java    7413 hadoop  328u  IPv4  44036      0t0  TCP localhost:51560->localhost:cslistener (ESTABLISHED)
[root@outman00 datanode_data]# netstat -tunlp |grep 9000
tcp        0      0 127.0.0.1:9000          0.0.0.0:*               LISTEN      7339/java   
  • Adjust the hosts configuration
#127.0.0.1   localhost smallsuperman.centos localhost4 localhost4.localdomain4 outman00
#::1         localhost smallsuperman.centos localhost6 localhost6.localdomain6
192.168.233.132 outman00
192.168.233.130 outman01
192.168.233.131 outman02
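The root cause is name resolution: the NameNode binds its RPC socket to whatever address outman00 resolves to, so with the old hosts file it listened on loopback only. A sketch of the before/after resolution using scratch files (the `resolve` helper is hypothetical, a first-match lookup like the resolver performs):

```shell
# Before the fix outman00 resolved to loopback; after, to the LAN address.
printf '127.0.0.1 localhost outman00\n' > hosts.before
printf '192.168.233.132 outman00\n'     > hosts.after

resolve() {  # print the IP of the first line whose name list contains $2
  awk -v h="$2" '{for (i = 2; i <= NF; i++) if ($i == h) { print $1; exit }}' "$1"
}

resolve hosts.before outman00   # -> 127.0.0.1
resolve hosts.after  outman00   # -> 192.168.233.132
```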
  • Restart Hadoop
[hadoop@outman00 ~]$ stop-all.sh
[hadoop@outman00 ~]$ start-all.sh
  • Check port 9000 on the NameNode again (it is now bound to the master's LAN address, so the other nodes can connect)
[root@outman00 datanode_data]# netstat -tunlp | grep 9000
tcp        0      0 192.168.233.132:9000    0.0.0.0:*               LISTEN      10843/java 
  • Access HDFS from the other nodes
[hadoop@outman01 ~]$ hadoop fs -ls /
Found 1 items
drwxr-xr-x   - hadoop supergroup          0 2019-06-06 00:34 /dyp
[hadoop@outman02 ~]$ hadoop fs -ls /
Found 1 items
drwxr-xr-x   - hadoop supergroup          0 2019-06-06 00:34 /dyp
  • Open the HDFS web UI

http://192.168.233.132:9870

Note

Before Hadoop 3.0 the web UI port was 50070;
from Hadoop 3.0 on it is 9870.


  • Start YARN (must be started on the YARN master node)
[hadoop@outman00 hadoop_data]$ start-yarn.sh
Starting resourcemanager
Starting nodemanagers

# Check processes (the master gains ResourceManager and NodeManager; the other nodes gain a NodeManager)
[hadoop@outman00 hadoop_data]$ jps
9811 Jps
8949 DataNode
9463 NodeManager
8840 NameNode
9353 ResourceManager
[hadoop@outman01 hadoop_data]$ jps
8227 NodeManager
8071 SecondaryNameNode
8327 Jps
7997 DataNode
  • Test YARN
[hadoop@outman00 mapreduce]$ hadoop jar hadoop-mapreduce-examples-3.2.0.jar wordcount  /dyp/test/test  /dyp/test/test_out

# Error
[2019-06-06 02:01:09.415]Container exited with a non-zero exit code 1. Error file: prelaunch.err.
Last 4096 bytes of prelaunch.err :
Last 4096 bytes of stderr :
Error: Could not find or load main class org.apache.hadoop.mapreduce.v2.app.MRAppMaster
# Fix: add the classpath used by MapReduce programs to mapred-site.xml, as follows
# /usr/local/my_app/hadoop/hadoop-3.2.0/ is the Hadoop installation path
<configuration>
<property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
</property>

<property>
    <name>mapreduce.application.classpath</name>
    <value>/usr/local/my_app/hadoop/hadoop-3.2.0/share/hadoop/mapreduce/*, /usr/local/my_app/hadoop/hadoop-3.2.0/share/hadoop/mapreduce/lib/*</value>
</property>
</configuration>
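An alternative fix shown in the Hadoop 3 setup docs (not the one this walkthrough used, so treat it as an untested option here) is to point the MapReduce ApplicationMaster and tasks at the install via HADOOP_MAPRED_HOME instead of listing jar directories:

```xml
<property>
    <name>yarn.app.mapreduce.am.env</name>
    <value>HADOOP_MAPRED_HOME=/usr/local/my_app/hadoop/hadoop-3.2.0</value>
</property>
<property>
    <name>mapreduce.map.env</name>
    <value>HADOOP_MAPRED_HOME=/usr/local/my_app/hadoop/hadoop-3.2.0</value>
</property>
<property>
    <name>mapreduce.reduce.env</name>
    <value>HADOOP_MAPRED_HOME=/usr/local/my_app/hadoop/hadoop-3.2.0</value>
</property>
```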
  • Test again
[hadoop@outman00 mapreduce]$ hadoop jar hadoop-mapreduce-examples-3.2.0.jar wordcount  /dyp/test/test  /dyp/test/test_out

[hadoop@outman00 mapreduce]$ hadoop fs -ls  /dyp/test/test_out
Found 2 items
-rw-r--r--   2 hadoop supergroup          0 2019-06-06 02:16 /dyp/test/test_out/_SUCCESS
-rw-r--r--   2 hadoop supergroup         29 2019-06-06 02:16 /dyp/test/test_out/part-r-00000
[hadoop@outman00 mapreduce]$ hadoop fs -cat  /dyp/test/test_out/part-r-00000
1|2|3   1
A|B|C   1
A|B|C1|2|3  1
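As a sanity check, the job's result can be reproduced locally: each token in the output above occurs once, so wordcount here degenerates to counting identical lines, which `sort | uniq -c` mimics. (The exact input file is not shown above; the three tokens below are taken from the part-r-00000 output.)

```shell
# Count occurrences of each token, like the wordcount output above.
printf '1|2|3\nA|B|C\nA|B|C1|2|3\n' | sort | uniq -c
```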

Installing MySQL

I use a MySQL instance running in Docker on a Tencent Cloud server, so the virtual machines only need the MySQL client, purely for access.

  • Install mysql-client
[root@outman00 ~]# yum  install mysql

# Connect to the Tencent Cloud MySQL
[root@outman00 ~]# mysql -h <Tencent Cloud MySQL IP> -u root -p

Installing Hive 3

  • Extract it into the target directory
[root@outman00 tar_gz]# tar -zxvf apache-hive-3.1.1-bin.tar.gz -C /usr/local/my_app/hive

# Place the MySQL JDBC driver jar into Hive's lib directory
-rw-r--r--. 1 root root  2293144 Jun   7 02:07 mysql-connector-java-8.0.16.jar
[root@outman00 lib]# pwd
/usr/local/my_app/hive/hive-3.1.1/lib
  • Configure the Hive environment variables
[root@outman00 lib]# sed -i '$a\export HIVE_HOME=/usr/local/my_app/hive/hive-3.1.1\nexport PATH=$PATH:$HIVE_HOME/bin' /etc/profile

# Reload to take effect
[root@outman00 lib]# source /etc/profile
  • Edit the configuration files
[root@outman00 conf]# cd /usr/local/my_app/hive/hive-3.1.1/conf
[root@outman00 conf]# cp hive-env.sh.template hive-env.sh
[root@outman00 conf]# cp hive-default.xml.template hive-site.xml
  • Add the following to hive-env.sh
export JAVA_HOME=/usr/local/my_app/jdk1.8.0_211

export HADOOP_HOME=/usr/local/my_app/hadoop/hadoop-3.2.0

export HIVE_HOME=/usr/local/my_app/hive/hive-3.1.1
  • Reload to take effect
[root@outman00 conf]# source hive-env.sh
  • Edit hive-site.xml

  • First create the directories
[root@outman00 conf]# mkdir -p /usr/local/my_app/hive/hive_data/warehouse
[root@outman00 conf]# mkdir -p /usr/local/my_app/hive/hive_data/tmp
[root@outman00 conf]# mkdir -p /usr/local/my_app/hive/hive_data/log
  • Edit hive-site.xml
<property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.jdbc.Driver</value>
    <description>Driver class name for a JDBC metastore</description>
</property>
<property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>username</value>
    <description>username to use against metastore database</description>
 </property>
<property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value>password</value>
    <description>password to use against metastore database</description>
</property>
<property>
    <name>hive.metastore.warehouse.dir</name>
    <value>/usr/local/my_app/hive/hive_data/warehouse</value>
    <description>location of default database for the warehouse</description>
</property>
<property>
    <name>hive.exec.scratchdir</name>
    <value>/usr/local/my_app/hive/hive_data/tmp</value>
    <description>HDFS root scratch dir for Hive jobs which gets created with write all (733) permission. For each connecting user, an HDFS scratch dir: ${hive.exec.scratchdir}/&lt;username&gt; is created, with ${hive.scratch.dir.permission}.</description>
</property>
<property>
    <name>hive.querylog.location</name>
    <value>/usr/local/my_app/hive/hive_data/log</value>
    <description>Location of Hive run time structured log file</description>
</property>

# We replace the ${system:java.io.tmpdir} variable with our temporary data directory /usr/local/my_app/hive/hive_data/tmp
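That replacement can be scripted with sed; the snippet below demonstrates the substitution on a scratch file (`hive-site.demo`, a hypothetical name) rather than the real hive-site.xml:

```shell
# Replace ${system:java.io.tmpdir} with the fixed tmp directory (demo file).
FILE=hive-site.demo
printf '<value>${system:java.io.tmpdir}/${system:user.name}</value>\n' > "$FILE"
sed -i 's|\${system:java\.io\.tmpdir}|/usr/local/my_app/hive/hive_data/tmp|g' "$FILE"
cat "$FILE"   # -> <value>/usr/local/my_app/hive/hive_data/tmp/${system:user.name}</value>
```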
  • Edit the hive-log4j2.properties file

  • Hive fails to start with the following error

Caused by: com.ctc.wstx.exc.WstxParsingException: Illegal character entity: expansion character (code 0x8
at [row,col,system-id]: [3186,96,"file:/usr/local/my_app/hive/hive-3.1.1/conf/hive-site.xml"

The character at line 3186, column 96 of hive-site.xml is illegal; remove it (in the 3.1.1 hive-default.xml.template this is a stray `&#8;` entity inside a <description>).

# Full stack trace
Exception in thread "main" java.lang.RuntimeException: com.ctc.wstx.exc.WstxParsingException: Illegal character entity: expansion character (code 0x8
 at [row,col,system-id]: [3186,96,"file:/usr/local/my_app/hive/hive-3.1.1/conf/hive-site.xml"]
    at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:2981)
    at org.apache.hadoop.conf.Configuration.loadResources(Configuration.java:2930)
    at org.apache.hadoop.conf.Configuration.getProps(Configuration.java:2805)
    at org.apache.hadoop.conf.Configuration.get(Configuration.java:1459)
    at org.apache.hadoop.hive.conf.HiveConf.getVar(HiveConf.java:4990)
    at org.apache.hadoop.hive.conf.HiveConf.getVar(HiveConf.java:5063)
    at org.apache.hadoop.hive.conf.HiveConf.initialize(HiveConf.java:5150)
    at org.apache.hadoop.hive.conf.HiveConf.<init>(HiveConf.java:5093)
    at org.apache.hadoop.hive.common.LogUtils.initHiveLog4jCommon(LogUtils.java:97)
    at org.apache.hadoop.hive.common.LogUtils.initHiveLog4j(LogUtils.java:81)
    at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:699)
    at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:683)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.hadoop.util.RunJar.run(RunJar.java:323)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:236)
Caused by: com.ctc.wstx.exc.WstxParsingException: Illegal character entity: expansion character (code 0x8
 at [row,col,system-id]: [3186,96,"file:/usr/local/my_app/hive/hive-3.1.1/conf/hive-site.xml"]
    at com.ctc.wstx.sr.StreamScanner.constructWfcException(StreamScanner.java:621)
    at com.ctc.wstx.sr.StreamScanner.throwParseError(StreamScanner.java:491)
    at com.ctc.wstx.sr.StreamScanner.reportIllegalChar(StreamScanner.java:2456)
    at com.ctc.wstx.sr.StreamScanner.validateChar(StreamScanner.java:2403)
    at com.ctc.wstx.sr.StreamScanner.resolveCharEnt(StreamScanner.java:2369)
    at com.ctc.wstx.sr.StreamScanner.fullyResolveEntity(StreamScanner.java:1515)
    at com.ctc.wstx.sr.BasicStreamReader.nextFromTree(BasicStreamReader.java:2828)
    at com.ctc.wstx.sr.BasicStreamReader.next(BasicStreamReader.java:1123)
    at org.apache.hadoop.conf.Configuration$Parser.parseNext(Configuration.java:3277)
    at org.apache.hadoop.conf.Configuration$Parser.parse(Configuration.java:3071)
    at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:2964)
    ... 17 more
  • Starting the metastore fails

Error message

[hadoop@outman00 hive-3.1.1]$  hive --service metastore 

2019-06-07 23:25:27: Starting Hive Metastore Server
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/local/my_app/hive/hive-3.1.1/lib/log4j-slf4j-impl-2.10.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/local/my_app/hadoop/hadoop-3.2.0/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
MetaException(message:Version information not found in metastore.)
    at org.apache.hadoop.hive.metastore.RetryingHMSHandler.<init>(RetryingHMSHandler.java:84)
    at org.apache.hadoop.hive.metastore.RetryingHMSHandler.getProxy(RetryingHMSHandler.java:93)
    at org.apache.hadoop.hive.metastore.HiveMetaStore.newRetryingHMSHandler(HiveMetaStore.java:8661)
    at org.apache.hadoop.hive.metastore.HiveMetaStore.newRetryingHMSHandler(HiveMetaStore.java:8656)
    at org.apache.hadoop.hive.metastore.HiveMetaStore.startMetaStore(HiveMetaStore.java:8926)
    at org.apache.hadoop.hive.metastore.HiveMetaStore.main(HiveMetaStore.java:8843)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.hadoop.util.RunJar.run(RunJar.java:323)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:236)

Analyzing the error

  • A jar conflict; the key lines:
[jar:file:/usr/local/my_app/hive/hive-3.1.1/lib/log4j-slf4j-impl-2.10.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/local/my_app/hadoop/hadoop-3.2.0/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class
  • Delete the jar under HIVE_HOME/lib
  • Do not delete the jar on the Hadoop side (HADOOP_HOME/share/hadoop/common/lib), or start-all.sh will fail with a missing log4j error when it starts Hadoop remotely.
rm -rf /usr/local/my_app/hive/hive-3.1.1/lib/log4j-slf4j-impl-2.10.0.jar
  • Starting the metastore still fails, as follows
2019-06-07 23:41:43: Starting Hive Metastore Server
MetaException(message:Version information not found in metastore.)
    at org.apache.hadoop.hive.metastore.RetryingHMSHandler.<init>(RetryingHMSHandler.java:84)
    at org.apache.hadoop.hive.metastore.RetryingHMSHandler.getProxy(RetryingHMSHandler.java:93)
    at org.apache.hadoop.hive.metastore.HiveMetaStore.newRetryingHMSHandler(HiveMetaStore.java:8661)
    at org.apache.hadoop.hive.metastore.HiveMetaStore.newRetryingHMSHandler(HiveMetaStore.java:8656)
    at org.apache.hadoop.hive.metastore.HiveMetaStore.startMetaStore(HiveMetaStore.java:8926)
    at org.apache.hadoop.hive.metastore.HiveMetaStore.main(HiveMetaStore.java:8843)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.hadoop.util.RunJar.run(RunJar.java:323)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:236)
  • Attempted fix in hive-site.xml (the error persists afterwards)

# Turn off metadata validation
<property>     
    <name>datanucleus.metadata.validate</name>    
    <value>false</value>    
</property> 
# Turn off metastore schema verification
<property>  
    <name>hive.metastore.schema.verification</name> 
    <value>false</value> 
</property>
<property>
    <name>datanucleus.schema.autoCreateAll</name>
    <value>true</value>
</property>
# hive.metastore.schema.verification guards metastore operations against incompatible schema versions. Consider setting it to "true" to reduce the chance of schema corruption during metastore operations

  • On first use, run the initialization command: schematool -dbType mysql -initSchema
[hadoop@outman00 hive-3.1.1]$ schematool -dbType mysql -initSchema

# The hive database has been created in MySQL
MySQL [(none)]> show databases;
+--------------------+
| Database           |
+--------------------+
| dyp                |
| hive               |
| information_schema |
| mysql              |
| performance_schema |
| sys                |
+--------------------+

After the metastore starts normally, entering the hive shell gives this error:

hive> show databases;
OK
Failed with exception java.io.IOException:java.lang.IllegalArgumentException: java.net.URISyntaxException: Relative path in absolute URI: ${system:user.name%7D
  • Fix the configuration
  <property>
    <name>hive.exec.local.scratchdir</name>
    <value>/usr/local/my_app/hive/hive_data/tmp/${user.name}</value>
    <description>Local scratch space for Hive jobs</description>
  </property>

Processes on each node after startup

outman00: DataNode, NameNode, NodeManager, ResourceManager
outman01: DataNode, NodeManager, SecondaryNameNode
outman02: DataNode, NodeManager

Concepts

  • NameNode

Manages the metadata of the whole HDFS file system: it applies the replica policy, keeps the mapping of stored data blocks (Block), manages the HDFS namespace, and handles client read and write requests.

  • SecondaryNameNode

Assists the NameNode by periodically merging the fsimage and edits files, and can help recover the NameNode.

  • DataNode

Manages the users' file data blocks, carrying out the commands the NameNode issues (it stores the actual blocks and performs block reads and writes).
Files are split into fixed-size blocks (blocksize) and stored distributed across a number of DataNodes.
Each block can have multiple replicas, placed on different DataNodes. DataNodes periodically report the blocks they hold to the NameNode, which maintains each file's replica count (when DataNodes drop out, these reports are how the NameNode learns the current replica state and restores the replica level).
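As a quick illustration of the splitting, assuming Hadoop 3's default block size of 128 MB (this walkthrough never changes it), a 300 MB file becomes ceil(300/128) = 3 blocks:

```shell
# Integer ceiling division: number of blocks for a file of size_mb megabytes.
size_mb=300
block_mb=128
blocks=$(( (size_mb + block_mb - 1) / block_mb ))
echo "$blocks"   # -> 3
```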

  • YARN allocates system resources to the applications running in the Hadoop cluster and schedules the tasks to be executed on the various cluster nodes.
  • The YARN component ResourceManager

The ResourceManager is the ultimate authority that arbitrates resources among all the applications in the system.

  • The YARN component NodeManager

The NodeManager is the per-machine framework agent, responsible for containers: it monitors their resource usage (CPU, memory, disk, network) and reports it to the ResourceManager.
