I. Preparing the Environment
1. Required Tools
- jdk-8u91-linux-x64.rpm
- hadoop-2.7.3.tar.gz
- Xshell, Xftp, VirtualBox
- Guest OS: CentOS 7 virtual machines; host OS: Windows 10
2. Setting Up the Environment
1. Install the master VM. Choose the minimal server installation, and set the VirtualBox network adapter to Host-only. (Software updates were needed later on, so the adapter was temporarily switched to bridged mode for those.)
2. Start master, then set the hostname, configure a static IP, and stop the firewall:
hostnamectl set-hostname master
vim /etc/sysconfig/network-scripts/ifcfg-enp0s3
IPADDR=192.168.56.100
NETMASK=255.255.255.0
vim /etc/sysconfig/network
NETWORKING=yes
GATEWAY=192.168.56.1
service network restart
vi /etc/resolv.conf
nameserver 8.8.8.8
systemctl start firewalld      # start the firewall
systemctl stop firewalld       # stop the firewall
systemctl status firewalld     # check its status
systemctl disable firewalld    # disable it at boot
systemctl enable firewalld     # enable it at boot
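A quick sanity check after the network changes (note: for the static address to persist reliably, the ifcfg file usually also needs BOOTPROTO=static and ONBOOT=yes):

ip addr show enp0s3              # confirm 192.168.56.100 is assigned
systemctl is-active firewalld    # should print "inactive" after stopping it
ping -c 3 8.8.8.8                # confirm outbound connectivity (bridged mode)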
3. Verify that the host and the VM can ping each other, connect with Xshell and Xftp, then upload the JDK and Hadoop packages to /usr/local on master with Xftp.
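For example, before attempting the Xshell connection:

ping 192.168.56.100    # from the Windows host, reach master
ping 192.168.56.1      # from master, reach the host's host-only address (per the gateway setting above)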
4. Install the JDK and Hadoop
cd /usr/local
rpm -ivh jdk-8u91-linux-x64.rpm    # i = install, v = verbose output, h = print hash marks as a progress bar
tar -zxvf hadoop-2.7.3.tar.gz
mv hadoop-2.7.3 hadoop
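Step 7 below invokes start-dfs.sh and the other control scripts by bare name, which assumes Hadoop's bin and sbin directories are on the PATH. A minimal sketch of the environment setup, assuming the install locations from this step (append to /etc/profile, then run source /etc/profile):

export JAVA_HOME=/usr/java/default
export HADOOP_HOME=/usr/local/hadoop
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin

Afterwards, java -version and hadoop version should both print their version banners.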
5. Edit the Hadoop configuration files
- 1. Edit hadoop-env.sh
cd hadoop/etc/hadoop
vim hadoop-env.sh
export JAVA_HOME=/usr/java/default
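/usr/java/default is a symlink that the Oracle JDK RPM creates; it can be verified before relying on it:

readlink -f /usr/java/default    # should resolve to the jdk1.8.0_91 install directory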
- 2. Edit slaves, listing the DataNode hostnames one per line:
slave1
slave2
slave3
- 3. Edit core-site.xml
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://master:9000</value>
    </property>
    <property>
        <name>io.file.buffer.size</name>
        <value>131072</value>
    </property>
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/var/hadoop</value>
    </property>
    <property>
        <name>mapreduce.app-submission.cross-platform</name>
        <value>true</value>
    </property>
</configuration>
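fs.defaultFS is the NameNode RPC address that all clients and slaves will use, and hadoop.tmp.dir is the base directory for Hadoop's local working data on every node. Creating it up front (here and, later, on each slave) is a cheap precaution against startup permission errors:

mkdir -p /var/hadoop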
- 4. Edit hdfs-site.xml
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
    <property>
        <name>dfs.namenode.secondary.http-address</name>
        <value>master:50090</value>
    </property>
    <property>
        <name>dfs.replication</name>
        <value>2</value>
    </property>
    <property>
        <name>dfs.namenode.name.dir</name>
        <value>/home/hadoop/hdfs/name</value>
    </property>
    <property>
        <name>dfs.datanode.data.dir</name>
        <value>/home/hadoop/hdfs/data</value>
    </property>
    <property>
        <name>dfs.namenode.heartbeat.recheck-interval</name>
        <value>10000</value>
    </property>
    <property>
        <name>dfs.permissions.enabled</name>
        <value>false</value>
    </property>
</configuration>
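Lowering dfs.namenode.heartbeat.recheck-interval to 10000 ms makes dead-DataNode detection much faster than the 300000 ms default: the NameNode declares a node dead after 2 × recheck-interval + 10 × heartbeat interval, i.e. 2 × 10 s + 10 × 3 s = 50 s with the default 3 s dfs.heartbeat.interval, instead of the usual 10.5 minutes. (Disabling dfs.permissions.enabled is convenient on a test cluster but should not be carried into production.)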
- 5. Edit mapred-site.xml
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
    <property>
        <name>mapreduce.jobhistory.address</name>
        <value>master:10020</value>
    </property>
    <property>
        <name>mapreduce.jobhistory.webapp.address</name>
        <value>master:19888</value>
    </property>
</configuration>
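Once the history server is running (step 7), these addresses can be probed directly; a quick check against its REST endpoint:

curl http://master:19888/ws/v1/history/info    # JobHistory REST API; returns server info as JSON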
- 6. Edit yarn-site.xml
<?xml version="1.0"?>
<configuration>
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
    <property>
        <name>yarn.resourcemanager.hostname</name>
        <value>master</value>
    </property>
    <property>
        <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
        <value>org.apache.hadoop.mapred.ShuffleHandler</value>
    </property>
    <property>
        <name>yarn.resourcemanager.address</name>
        <value>master:8032</value>
    </property>
    <property>
        <name>yarn.resourcemanager.scheduler.address</name>
        <value>master:8030</value>
    </property>
    <property>
        <name>yarn.resourcemanager.resource-tracker.address</name>
        <value>master:8031</value>
    </property>
    <property>
        <name>yarn.resourcemanager.admin.address</name>
        <value>master:8033</value>
    </property>
    <property>
        <name>yarn.resourcemanager.webapp.address</name>
        <value>master:8088</value>
    </property>
</configuration>
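yarn.nodemanager.aux-services registers the shuffle service that NodeManagers provide to MapReduce; org.apache.hadoop.mapred.ShuffleHandler is already the default implementation bound to mapreduce_shuffle in yarn-default.xml, so the explicit class property above is optional but harmless.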
6. Clone three slave VMs; on each clone, change the IP address and hostname, then add every node's hostname mapping to /etc/hosts on all four machines:
vim /etc/hosts
192.168.56.100 master # namenode
192.168.56.101 slave1 # datanode
192.168.56.102 slave2 # datanode
192.168.56.103 slave3 # datanode
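start-dfs.sh and start-yarn.sh in the next step log in to each slave over SSH, so passwordless SSH from master to every node avoids repeated password prompts. A minimal sketch, assuming everything runs as root (as the steps above suggest):

ssh-keygen -t rsa -P "" -f ~/.ssh/id_rsa
for h in master slave1 slave2 slave3; do ssh-copy-id root@$h; done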
7. Format HDFS, then start Hadoop (on master):
hdfs namenode -format                          # run once, on master only
start-dfs.sh                                   # starts the NameNode, SecondaryNameNode, and DataNodes
start-yarn.sh                                  # starts the ResourceManager and NodeManagers
mr-jobhistory-daemon.sh start historyserver    # starts the MapReduce JobHistory server
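A quick verification pass once everything is up (ports are the Hadoop 2.x defaults):

jps                      # master should show NameNode, SecondaryNameNode, ResourceManager, JobHistoryServer
jps                      # each slave should show DataNode and NodeManager
hdfs dfsadmin -report    # should report 3 live datanodes

The web UIs are at http://master:50070 (HDFS), http://master:8088 (YARN), and http://master:19888 (job history). As a final end-to-end check, the bundled example job exercises HDFS, YARN, and the history server together:

hadoop jar /usr/local/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.3.jar pi 2 5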