Hadoop Cluster Environment Setup
1. Server IP addresses and hostnames (mapped in /etc/hosts, as sketched below)
192.168.10.128 hufu01
192.168.10.129 hufu02
192.168.10.130 hufu03
192.168.10.131 hufu04
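These mappings need to resolve on every node. A minimal sketch of the /etc/hosts setup (run as root on each of the four servers; the entries mirror the list above):
## Append the hostname mappings to /etc/hosts (as root, on all four servers)
cat >> /etc/hosts << 'EOF'
192.168.10.128 hufu01
192.168.10.129 hufu02
192.168.10.130 hufu03
192.168.10.131 hufu04
EOF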
2. Operating system information for each server
## Check the current OS version
[root@hufu02 ~]# cat /proc/version
Linux version 3.10.0-957.el7.x86_64 (mockbuild@kbuilder.bsys.centos.org) (gcc version 4.8.5 20150623 (Red Hat 4.8.5-36) (GCC) ) #1 SMP Thu Nov 8 23:39:32 UTC 2018
## Check the current OS kernel
[root@hufu02 ~]# uname -a
Linux hufu02 3.10.0-957.el7.x86_64 #1 SMP Thu Nov 8 23:39:32 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
## Check the current OS release
[root@hufu02 ~]# cat /etc/centos-release
CentOS Linux release 7.6.1810 (Core)
3. Set up passwordless SSH login
## Generate a key pair; run on each of the four servers
[root@hufu01 ~]# ssh-keygen -t rsa    ## press Enter at every prompt
## Copy the public key to the other nodes; run on hufu01
[root@hufu01 ~]# ssh-copy-id hufu02    ## type yes, then enter the root password
[root@hufu01 ~]# ssh-copy-id hufu03    ## type yes, then enter the root password
[root@hufu01 ~]# ssh-copy-id hufu04    ## type yes, then enter the root password
## Test that it works:
[root@hufu01 ~]# ssh hufu02
Last login: Fri Aug 16 02:43:39 2019 from 192.168.10.1
[root@hufu02 ~]# exit
logout
Connection to hufu02 closed.
[root@hufu01 ~]# ssh hufu03
Last login: Fri Aug 16 00:47:27 2019 from 192.168.10.1
[root@hufu03 ~]# exit
logout
Connection to hufu03 closed.
[root@hufu01 ~]# ssh hufu04
Last login: Fri Aug 16 00:47:27 2019 from 192.168.10.1
[root@hufu04 ~]# exit
logout
Connection to hufu04 closed.
## Passwordless login configured successfully
4. Create a hadoop user and configure passwordless login for it
## Create the user and set its password; run on each of the four servers
[root@hufu01 ~]# useradd hadoop
[root@hufu01 ~]# passwd hadoop
Changing password for user hadoop.
New password: ## the password is set to hadoop here
BAD PASSWORD: The password is shorter than 8 characters
Retype new password:
passwd: all authentication tokens updated successfully.
## Configure passwordless login for the hadoop user (switch to the hadoop user first, then configure); a sketch of the steps follows
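A minimal sketch of the omitted steps, mirroring section 3 but run as the hadoop user (assumes the hadoop password set above; repeat on each server as needed):
[root@hufu01 ~]# su - hadoop
[hadoop@hufu01 ~]$ ssh-keygen -t rsa    ## press Enter at every prompt
[hadoop@hufu01 ~]$ ssh-copy-id hufu02   ## type yes, then enter the hadoop password
[hadoop@hufu01 ~]$ ssh-copy-id hufu03
[hadoop@hufu01 ~]$ ssh-copy-id hufu04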
5. Install the JDK
## 1. Installation steps omitted; the JDK path and version information are as follows
[hadoop@hufu01 app]$ echo $JAVA_HOME
/app/jdk
[hadoop@hufu01 app]$ java -version
java version "1.8.0_221"
Java(TM) SE Runtime Environment (build 1.8.0_221-b11)
Java HotSpot(TM) 64-Bit Server VM (build 25.221-b11, mixed mode)
## 2. Copy the JDK installed on hufu01 to hufu02, hufu03, and hufu04, as sketched below
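A minimal sketch of the copy, assuming the JDK sits at /app/jdk on every node and root-to-root passwordless login from section 3 (JAVA_HOME must also be exported on the target nodes):
[root@hufu01 ~]# scp -r /app/jdk hufu02:/app/
[root@hufu01 ~]# scp -r /app/jdk hufu03:/app/
[root@hufu01 ~]# scp -r /app/jdk hufu04:/app/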
6. Install Hadoop (as the hadoop user)
6.1 Extract hadoop-2.7.7.tar.gz
[hadoop@hufu01 ~]$ tar -zxvf hadoop-2.7.7.tar.gz -C /home/hadoop
6.2 Configure Hadoop (the files below live in $HADOOP_HOME/etc/hadoop)
<!-- 1. core-site.xml -->
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://hufu01:9000</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>file:/home/hadoop/hadoop/tmp</value>
</property>
<property>
<name>hadoop.proxyuser.hadoop.hosts</name>
<value>*</value>
</property>
<property>
<name>hadoop.proxyuser.hadoop.groups</name>
<value>hadoop</value>
</property>
</configuration>
<!-- 2. hdfs-site.xml -->
<configuration>
<property>
<name>dfs.replication</name>
<value>2</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>file:/home/hadoop/hadoop/tmp/dfs/name</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>file:/home/hadoop/hadoop/tmp/dfs/data</value>
</property>
<property>
<name>dfs.namenode.secondary.http-address</name>
<value>hufu01:9001</value>
</property>
</configuration>
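Hadoop creates the name and data directories on format/first start, but creating the base hadoop.tmp.dir up front as the hadoop user avoids ownership surprises; a minimal sketch:
[hadoop@hufu01 ~]$ mkdir -p /home/hadoop/hadoop/tmp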
<!-- 3. yarn-site.xml -->
<configuration>
<!-- Site specific YARN configuration properties -->
<property>
<name>yarn.resourcemanager.hostname</name>
<value>hufu01</value>
</property>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.log-aggregation-enable</name>
<value>true</value>
</property>
<property>
<name>yarn.log-aggregation.retain-seconds</name>
<value>604800</value>
</property>
</configuration>
<!-- 4. mapred-site.xml -->
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>
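Note that Hadoop 2.7.7 ships only a template for this file; if etc/hadoop/mapred-site.xml does not exist yet, create it from the template before editing:
[hadoop@hufu01 hadoop]$ cp mapred-site.xml.template mapred-site.xml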
<!-- 5. slaves -->
hufu02
hufu03
hufu04
6.3 Set the JDK path that Hadoop depends on
[hadoop@hufu01 hadoop]$ vim hadoop-env.sh
# The java implementation to use.
export JAVA_HOME=/app/jdk
6.4 Configure the environment variables (two options: in /etc/profile or in ~/.bashrc) and make them take effect
## Set the variables
[root@hufu01 hadoop]# vim /etc/profile
#hadoop environment vars
export HADOOP_HOME=/home/hadoop/hadoop-2.7.7
export HADOOP_INSTALL=$HADOOP_HOME
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export YARN_HOME=$HADOOP_HOME
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export PATH=$PATH:$HADOOP_HOME/sbin:$HADOOP_HOME/bin
export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib/native"
## Make them take effect
[root@hufu01 hadoop]# source /etc/profile
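The ~/.bashrc alternative mentioned in 6.4 works the same way, just scoped to the hadoop user; a minimal sketch that appends the same exports:
[hadoop@hufu01 ~]$ echo 'export HADOOP_HOME=/home/hadoop/hadoop-2.7.7' >> ~/.bashrc
[hadoop@hufu01 ~]$ echo 'export PATH=$PATH:$HADOOP_HOME/sbin:$HADOOP_HOME/bin' >> ~/.bashrc
[hadoop@hufu01 ~]$ source ~/.bashrc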
7. Copy the configured Hadoop directory and the environment-variable file to hufu02, hufu03, and hufu04, and disable the firewall on all four servers (a sketch follows)
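A minimal sketch of this step, assuming identical paths on every node (firewalld is the CentOS 7 default; run the firewall commands as root on all four servers):
[hadoop@hufu01 ~]$ scp -r /home/hadoop/hadoop-2.7.7 hufu02:/home/hadoop/
[hadoop@hufu01 ~]$ scp -r /home/hadoop/hadoop-2.7.7 hufu03:/home/hadoop/
[hadoop@hufu01 ~]$ scp -r /home/hadoop/hadoop-2.7.7 hufu04:/home/hadoop/
[root@hufu01 ~]# scp /etc/profile hufu02:/etc/profile    ## repeat for hufu03 and hufu04
[root@hufu01 ~]# systemctl stop firewalld && systemctl disable firewalld    ## on each server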
8. Format the HDFS filesystem (run once, on hufu01 only)
hdfs namenode -format    ## the older 'hadoop namenode -format' still works but is deprecated
9. Start the cluster
## 1. Start HDFS
[hadoop@hufu01 root]$ start-dfs.sh
Starting namenodes on [hufu01]
hufu01: starting namenode, logging to /home/hadoop/hadoop-2.7.7/logs/hadoop-hadoop-namenode-hufu01.out
hufu04: starting datanode, logging to /home/hadoop/hadoop-2.7.7/logs/hadoop-hadoop-datanode-hufu04.out
hufu02: starting datanode, logging to /home/hadoop/hadoop-2.7.7/logs/hadoop-hadoop-datanode-hufu02.out
hufu03: starting datanode, logging to /home/hadoop/hadoop-2.7.7/logs/hadoop-hadoop-datanode-hufu03.out
Starting secondary namenodes [hufu01]
hufu01: starting secondarynamenode, logging to /home/hadoop/hadoop-2.7.7/logs/hadoop-hadoop-secondarynamenode-hufu01.out
## 2. Start YARN
[hadoop@hufu01 root]$ start-yarn.sh
starting yarn daemons
starting resourcemanager, logging to /home/hadoop/hadoop-2.7.7/logs/yarn-hadoop-resourcemanager-hufu01.out
hufu03: starting nodemanager, logging to /home/hadoop/hadoop-2.7.7/logs/yarn-hadoop-nodemanager-hufu03.out
hufu02: starting nodemanager, logging to /home/hadoop/hadoop-2.7.7/logs/yarn-hadoop-nodemanager-hufu02.out
hufu04: starting nodemanager, logging to /home/hadoop/hadoop-2.7.7/logs/yarn-hadoop-nodemanager-hufu04.out
10. Verify that the daemons started on each node
##hufu01
[hadoop@hufu01 hadoop]$ jps
9409 Jps
8117 ResourceManager
7560 SecondaryNameNode
7357 NameNode
##hufu02
[root@hufu02 ~]# jps
8160 NodeManager
10020 Jps
9926 DataNode
##hufu03
[root@hufu03 ~]# jps
7714 NodeManager
9222 DataNode
9304 Jps
##hufu04
[hadoop@hufu04 root]$ jps
8533 DataNode
7710 NodeManager
8623 Jps
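Beyond jps, the cluster can be sanity-checked with an HDFS report and the default Hadoop 2.x web UIs (NameNode on port 50070, ResourceManager on 8088):
[hadoop@hufu01 ~]$ hdfs dfsadmin -report    ## should list 3 live datanodes
## Web UIs: http://hufu01:50070 (HDFS), http://hufu01:8088 (YARN)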