1. Linux Environment Preparation
- Disable the firewall
yum install iptables-services
service iptables stop
Disable it permanently:
chkconfig iptables off
or
systemctl disable iptables.service
- Disable SELinux
yum install -y vim*
Edit the SELinux config file and set SELINUX to disabled:
vim /etc/sysconfig/selinux
SELINUX=disabled
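The config change takes full effect only after a reboot. To switch SELinux off for the current session and confirm the mode, the standard commands can be used:
setenforce 0 #put SELinux into permissive mode immediately (no reboot needed)
getenforce #prints Enforcing, Permissive, or Disabled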
- Modify the hostname and the IP-to-hostname mappings
This step is optional; the mappings can be skipped.
vim /etc/hosts
192.168.100.21 hadoop01
192.168.100.22 hadoop02
192.168.100.23 hadoop03
A reboot is required for the hostname change to take effect; reboot with caution:
reboot
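As a quick sanity check after the reboot, each mapped hostname should be reachable, for example:
ping -c 1 hadoop01
ping -c 1 hadoop02
ping -c 1 hadoop03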
- Install the JDK
Check whether Java is already installed:
rpm -qa|grep jdk
rpm -qa|grep java
Install the file upload/download utility:
yum install -y lrzsz
Upload the JDK archive: rz
tar -zxvf jdk-8u141-linux-x64.tar.gz
Edit the environment variables:
vi /etc/profile
# jdk1.8
export JAVA_HOME=/usr/local/jdk1.8.0_141
export PATH=$PATH:$JAVA_HOME/bin
source /etc/profile
Verify: java -version
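If java -version reports the wrong version or is not found, checking the variables usually pinpoints the problem (a small sketch; note that a pre-installed system JDK earlier on the PATH may shadow this one):
echo $JAVA_HOME #should print /usr/local/jdk1.8.0_141
which java #should resolve to $JAVA_HOME/bin/java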
- Configure sudo privileges
groupadd hadoop #create a new group
useradd hadoop -m -g hadoop #create the user with a home directory
passwd hadoop #set the hadoop user's password
123456
Switch user: su hadoop
Check the current user: whoami
Switch back to root:
su root
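To confirm the account was created with the intended group, id can be used (the numeric ids will vary per system):
id hadoop #e.g. uid=1000(hadoop) gid=1000(hadoop) groups=1000(hadoop)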
- Grant write permission on the /etc/sudoers file so it can be edited:
chmod u+w /etc/sudoers
vim /etc/sudoers
- Add sudo privileges for the hadoop user
hadoop01 is used as the example here
root ALL=(ALL) ALL
hadoop ALL=(ALL) ALL
Test that sudo works:
sudo ls /root
sudo su hadoop
- Configure passwordless SSH login between the servers; run these steps once on each server
Switch to the hadoop user:
su hadoop
ssh-keygen -t rsa #press Enter at every prompt; this generates the public/private key pair
ssh-copy-id hadoop@hadoop01
ssh-copy-id hadoop@hadoop02
ssh-copy-id hadoop@hadoop03
Note: the steps above must be performed as the hadoop user.
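Once the keys are distributed, a short loop verifies that every host is reachable without a password (a minimal sketch, run as the hadoop user):
for h in hadoop01 hadoop02 hadoop03; do ssh hadoop@$h hostname; done
Each iteration should print the remote hostname with no password prompt.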
2. Installation and Deployment
Download the Hadoop package hadoop-2.9.2.tar.gz
Official archive: https://archive.apache.org/dist/hadoop/common/hadoop-2.9.2/
(1) Create the Hadoop installation directories
mkdir -p /home/hadoop/app/hadoop/{tmp,hdfs/{data,name}}
(2) Download and extract the package into /home/hadoop/app/hadoop
cd /home/hadoop/app/hadoop
wget --no-check-certificate https://archive.apache.org/dist/hadoop/common/hadoop-2.9.2/hadoop-2.9.2.tar.gz
tar -zxvf hadoop-2.9.2.tar.gz -C /home/hadoop/app/hadoop
(3) Configure the Hadoop environment variables
Switch back to the root user first:
exit
vim /etc/profile
# set Hadoop path
export HADOOP_HOME=/home/hadoop/app/hadoop/hadoop-2.9.2
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
export HADOOP_HOME_WARN_SUPPRESS=1
(4) Reload the environment variables
source /etc/profile
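To confirm the variables took effect, the hadoop command should now resolve:
hadoop version #should report Hadoop 2.9.2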
Configure Hadoop
(1) Configure core-site.xml
vim /home/hadoop/app/hadoop/hadoop-2.9.2/etc/hadoop/core-site.xml
<configuration>
<property>
<!-- The node that runs the HDFS NameNode -->
<name>fs.defaultFS</name>
<value>hdfs://hadoop01:9000</value>
</property>
<property>
<!-- Hadoop temporary directory -->
<name>hadoop.tmp.dir</name>
<value>/home/hadoop/app/hadoop/tmp</value>
</property>
</configuration>
Default configuration reference: http://hadoop.apache.org/docs/r2.9.2/hadoop-project-dist/hadoop-common/core-default.xml
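After saving the file, the effective value can be read back as a quick sanity check:
hdfs getconf -confKey fs.defaultFS #should print hdfs://hadoop01:9000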
(2) Configure hdfs-site.xml, including the SecondaryNameNode address
vim /home/hadoop/app/hadoop/hadoop-2.9.2/etc/hadoop/hdfs-site.xml
<configuration>
<property>
<!-- Number of HDFS block replicas -->
<name>dfs.replication</name>
<value>1</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>/home/hadoop/app/hadoop/hdfs/name</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>/home/hadoop/app/hadoop/hdfs/data</value>
</property>
<property>
<!-- HDFS permission checking (disabled here) -->
<name>dfs.permissions.enabled</name>
<value>false</value>
</property>
<property>
<!-- Address of the SecondaryNameNode -->
<name>dfs.namenode.secondary.http-address</name>
<value>hadoop02:50090</value>
</property>
</configuration>
Default configuration reference: http://hadoop.apache.org/docs/r2.9.2/hadoop-project-dist/hadoop-hdfs/hdfs-default.xml
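The same getconf check applies to these keys, for example:
hdfs getconf -confKey dfs.replication #should print 1
hdfs getconf -confKey dfs.namenode.secondary.http-address #should print hadoop02:50090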
(3) Configure mapred-site.xml
cp /home/hadoop/app/hadoop/hadoop-2.9.2/etc/hadoop/mapred-site.xml.template /home/hadoop/app/hadoop/hadoop-2.9.2/etc/hadoop/mapred-site.xml
vim /home/hadoop/app/hadoop/hadoop-2.9.2/etc/hadoop/mapred-site.xml
<configuration>
<property>
<!-- Run MapReduce on YARN -->
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>
(4) Configure yarn-site.xml
vim /home/hadoop/app/hadoop/hadoop-2.9.2/etc/hadoop/yarn-site.xml
<configuration>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<!-- The node that runs the ResourceManager -->
<name>yarn.resourcemanager.hostname</name>
<value>hadoop01</value>
</property>
<property>
<name>yarn.resourcemanager.address</name>
<value>hadoop01:8032</value>
</property>
<property>
<name>yarn.resourcemanager.webapp.address</name>
<value>hadoop01:8088</value>
</property>
</configuration>
Default configuration reference: http://hadoop.apache.org/docs/r2.9.2/hadoop-yarn/hadoop-yarn-common/yarn-default.xml
(5) Configure slaves
vim /home/hadoop/app/hadoop/hadoop-2.9.2/etc/hadoop/slaves
hadoop03
The slaves file lists the nodes that run the DataNode (and NodeManager) services.
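For reference only, a hypothetical layout with more DataNodes would simply list one hostname per line; this is not part of the deployment described here:
hadoop02
hadoop03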
(6) Configure hadoop-env
Set the JAVA_HOME variable in hadoop-env.sh:
vim /home/hadoop/app/hadoop/hadoop-2.9.2/etc/hadoop/hadoop-env.sh
export JAVA_HOME=/usr/local/jdk1.8.0_141
(7) Configure yarn-env
Set the JAVA_HOME variable in yarn-env.sh:
vim /home/hadoop/app/hadoop/hadoop-2.9.2/etc/hadoop/yarn-env.sh
Add a new line at the top of the file:
export JAVA_HOME=/usr/local/jdk1.8.0_141
(8) Configure mapred-env
Set the JAVA_HOME variable in mapred-env.sh:
vim /home/hadoop/app/hadoop/hadoop-2.9.2/etc/hadoop/mapred-env.sh
Add a new line at the top of the file:
export JAVA_HOME=/usr/local/jdk1.8.0_141
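A quick grep confirms that all three env scripts now point at the same JDK:
grep -n "JAVA_HOME=" /home/hadoop/app/hadoop/hadoop-2.9.2/etc/hadoop/{hadoop,yarn,mapred}-env.sh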
(9) Copy the configured Hadoop to the other servers
Switch to the hadoop user:
su hadoop
sudo chmod -R 777 /home/hadoop/app/hadoop/hdfs/name
scp -r /home/hadoop/app/hadoop hadoop@hadoop02:/home/hadoop/app/
scp -r /home/hadoop/app/hadoop hadoop@hadoop03:/home/hadoop/app/
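To confirm the copies landed, the remote directories can be listed over SSH (a minimal check):
ssh hadoop@hadoop02 ls /home/hadoop/app/hadoop
ssh hadoop@hadoop03 ls /home/hadoop/app/hadoop
Both should list hadoop-2.9.2, hdfs and tmp.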
3. Startup and Testing
(1) On hadoop01, format the HDFS file system (first run only):
hdfs namenode -format
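A successful format writes the NameNode metadata; a simple confirmation is to check for the VERSION file (a small sketch):
ls /home/hadoop/app/hadoop/hdfs/name/current #should contain a VERSION file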
(2) Start the Hadoop cluster
start-dfs.sh
This starts the NameNode on hadoop01, the SecondaryNameNode on hadoop02, and the DataNode on hadoop03.
start-yarn.sh
This starts the ResourceManager on hadoop01 and the NodeManager on hadoop03.
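jps shows which daemons are actually running on each node (run as the hadoop user; if jps is not on the PATH over SSH, log in to that node and run it directly):
jps #on hadoop01: NameNode, ResourceManager
ssh hadoop@hadoop02 jps #expect SecondaryNameNode
ssh hadoop@hadoop03 jps #expect DataNode, NodeManager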
(3) Verify the cluster
- Check the listening services on hadoop01:
yum -y install net-tools
netstat -tnlp
- Disable the firewall if firewalld is running:
systemctl stop firewalld.service
- Visit port 50070 on hadoop01 in a browser (http://hadoop01:50070); if the NameNode web UI loads, the HDFS deployment succeeded.
- Visit port 8088 on hadoop01 in a browser (http://hadoop01:8088); if the ResourceManager web UI loads, the YARN deployment succeeded.
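As a final end-to-end check, the example jar shipped with Hadoop 2.9.2 exercises both HDFS and YARN (a minimal smoke test):
hadoop fs -mkdir -p /user/hadoop
hadoop jar $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.9.2.jar pi 2 5
The job should complete and print an estimated value of Pi.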