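Installing Hadoop on CentOS 7
Three hosts are used, acting as master, slave1, and slave2 respectively (hostnames hadoop-master, hadoop-slave1, and hadoop-slave2 in the configuration files below).
1. Set the hostname and edit the hosts file on all three hosts
hostnamectl set-hostname <hostname>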
vi /etc/hosts
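The hosts file needs one line per host so that the names used in the later configuration files resolve on every machine. A minimal sketch, assuming the hostnames hadoop-master, hadoop-slave1, and hadoop-slave2 that appear in those files; the IP addresses and the function name cluster_hosts are placeholders of mine, so substitute your own addresses:

```shell
# Print the /etc/hosts entries for the cluster.
# The IP addresses are placeholders, not values taken from this guide.
cluster_hosts() {
    cat <<'EOF'
192.168.1.101 hadoop-master
192.168.1.102 hadoop-slave1
192.168.1.103 hadoop-slave2
EOF
}

# Preview the entries; to apply them, run on each host as root:
#   cluster_hosts >> /etc/hosts
cluster_hosts
```

Printing first and appending second keeps the step repeatable: the same three lines go onto every host unchanged.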
2. Install the JDK. This guide installs java-1.8.0 with yum (a manually installed JDK can be used instead).
yum -y install epel-release
yum -y install java-1.8.0 java-1.8.0-devel
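3. Set up passwordless SSH login among the three hosts
Hadoop's start and stop scripts use SSH to launch the daemons on the master and the slaves, so every host in the Hadoop environment needs passwordless SSH login. Note: each host must also be able to log in to itself without a password.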
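The passwordless login can be sketched with ssh-keygen and ssh-copy-id. This is one possible way, assuming the root account (the user named in hadoop-env.sh later in this guide) and the hostnames from the configuration files; the DRY_RUN switch and the helper names run and setup_passwordless_ssh are my own additions so the commands can be previewed before anything is changed:

```shell
# Hosts to authorize; the names follow the configuration files in this guide.
HOSTS="hadoop-master hadoop-slave1 hadoop-slave2"

# With DRY_RUN=1 (the default here) commands are only printed, not executed.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

setup_passwordless_ssh() {
    # Create an RSA key pair with an empty passphrase if none exists yet.
    if [ ! -f "$HOME/.ssh/id_rsa" ]; then
        run ssh-keygen -t rsa -N "" -f "$HOME/.ssh/id_rsa"
    fi
    # Copy the public key to every host, including the local one.
    for host in $HOSTS; do
        run ssh-copy-id "root@$host"
    done
}

setup_passwordless_ssh
```

Run it with DRY_RUN=0 on each of the three hosts to perform the steps for real; ssh-copy-id asks for the root password of each target once.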
4. Download Hadoop
Download Hadoop from an Apache mirror (http://mirrors.hust.edu.cn/apache/); this guide uses hadoop-3.1.1.tar.gz.
5. Deploy Hadoop (on all three hosts)
tar -xzvf hadoop-3.1.1.tar.gz
mv hadoop-3.1.1 /opt/hadoop
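The test commands at the end of this guide call hadoop without a full path, which only works if /opt/hadoop/bin is on PATH. A small sketch of a profile snippet for that; the file name /etc/profile.d/hadoop.sh and the function name hadoop_profile are assumptions of mine, not part of the original steps:

```shell
# Print a profile snippet that puts the Hadoop commands on PATH.
# /opt/hadoop matches the install location used in this guide.
hadoop_profile() {
    cat <<'EOF'
export HADOOP_HOME=/opt/hadoop
export PATH="$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin"
EOF
}

# Preview; to apply, run as root on each host and then log in again:
#   hadoop_profile > /etc/profile.d/hadoop.sh
hadoop_profile
```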
6. Edit the configuration files
vi /opt/hadoop/etc/hadoop/core-site.xml ## edit on all three hosts
<configuration>
<property>
<name>fs.defaultFS</name>
<!-- URI of the NameNode (hostname or IP plus port); use the same value on every node -->
<value>hdfs://hadoop-master:9000</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<!-- location to store temporary files -->
<value>/tmp</value>
</property>
</configuration>
vi /opt/hadoop/etc/hadoop/hadoop-env.sh ## edit on all three hosts
export JAVA_HOME=/usr
export HADOOP_HOME=/opt/hadoop
export HDFS_NAMENODE_USER=root
export HDFS_DATANODE_USER=root
export HDFS_SECONDARYNAMENODE_USER=root
export YARN_RESOURCEMANAGER_USER=root
export YARN_NODEMANAGER_USER=root
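Once core-site.xml and hadoop-env.sh are in place, the value Hadoop actually reads can be checked with hdfs getconf -confKey, which catches typos in the XML before the cluster is started. A sketch, assuming the hdfs command is on PATH (otherwise call /opt/hadoop/bin/hdfs instead); check_conf is a hypothetical helper name:

```shell
# Compare the effective value of a configuration key with the expected one.
check_conf() {
    key="$1"
    want="$2"
    got="$(hdfs getconf -confKey "$key")" || return 1
    [ "$got" = "$want" ]
}

# Run the check only where the hdfs command is available.
if command -v hdfs >/dev/null 2>&1; then
    if check_conf fs.defaultFS hdfs://hadoop-master:9000; then
        echo "fs.defaultFS OK"
    else
        echo "fs.defaultFS mismatch"
    fi
fi
```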
vi /opt/hadoop/etc/hadoop/hdfs-site.xml ## edit on the master only
<configuration>
<property>
<name>dfs.namenode.http-address</name>
<!-- hostname and port of the NameNode web UI, opened in a browser to check system details -->
<value>hadoop-master:50070</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>/var/hadoop/name</value>
</property>
<property>
<name>dfs.replication</name>
<!-- number of replicas kept for each block -->
<value>2</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>/var/hadoop/data</value>
</property>
</configuration>
vi /opt/hadoop/etc/hadoop/mapred-site.xml ## edit on the master only
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>
vi /opt/hadoop/etc/hadoop/workers ## edit on the master only
Note: you can list only slave1 and slave2 here, in which case the master will not run as a DataNode.
hadoop-master
hadoop-slave1
hadoop-slave2
vi /opt/hadoop/etc/hadoop/yarn-site.xml ## edit on the master only
<configuration>
<!-- Site specific YARN configuration properties -->
<property>
<name>yarn.resourcemanager.hostname</name>
<!-- hostname or ip -->
<value>hadoop-master</value>
</property>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
<property>
<name>yarn.nodemanager.resource.cpu-vcores</name>
<!-- number of CPU cores YARN may allocate on this node; set according to your system -->
<value>1</value>
</property>
</configuration>
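7. Initialize and start the services
/opt/hadoop/bin/hdfs namenode -format ## initialize HDFS by formatting the NameNode
/opt/hadoop/sbin/start-all.sh ## start the services
If http://hadoop-master:50070 opens in a browser, Hadoop has been installed successfully.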
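Besides the web page, the JDK's jps tool lists the running Java processes, which gives a quick per-host check that the daemons came up. The sketch below reads jps-style output and reports any expected daemon that is missing. The expected list for the master (NameNode, SecondaryNameNode, and ResourceManager, plus DataNode and NodeManager because hadoop-master also appears in the workers file) is my reading of the configuration above rather than something this guide states, and missing_daemons is a hypothetical helper name:

```shell
# Print the names in "$1" (space-separated) that do not appear as a
# process name in jps-style output ("<pid> <name>" per line) on stdin.
missing_daemons() {
    expected="$1"
    running="$(awk '{print $2}')"
    for name in $expected; do
        # -x matches whole lines, so NameNode does not match SecondaryNameNode.
        printf '%s\n' "$running" | grep -qx "$name" || echo "$name"
    done
}

MASTER_DAEMONS="NameNode SecondaryNameNode ResourceManager DataNode NodeManager"

# Run the check only where jps is available.
if command -v jps >/dev/null 2>&1; then
    jps | missing_daemons "$MASTER_DAEMONS"
fi
```

No output means every expected daemon was found. On the slaves, pass "DataNode NodeManager" as the expected list instead.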
Testing:
Once the Hadoop cluster is installed, a few commands give a simple availability check.
Check the cluster status
hdfs dfsadmin -report
List the HDFS root directory
hadoop fs -ls /
Upload a file to HDFS
hadoop fs -put /root/ceshi.txt /
Download a file from HDFS (into the current local directory)
hadoop fs -get /ceshi.txt
Create a directory in HDFS
hadoop fs -mkdir -p /books/txt
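The individual commands above can be combined into one round-trip check: upload a file, download it again, and compare the two copies. A sketch, assuming the hadoop command is on PATH; the scratch directory /smoke-test and the function name hdfs_roundtrip are hypothetical and not part of the original test steps:

```shell
# Upload a small file to HDFS, fetch it back, and compare the two copies.
# HDFS_DIR is a hypothetical scratch directory, not one used by this guide.
HDFS_DIR="/smoke-test"

hdfs_roundtrip() {
    workdir="$(mktemp -d)" || return 1
    echo "hello hdfs" > "$workdir/in.txt"

    hadoop fs -mkdir -p "$HDFS_DIR" &&
    hadoop fs -put -f "$workdir/in.txt" "$HDFS_DIR/in.txt" &&
    hadoop fs -get "$HDFS_DIR/in.txt" "$workdir/out.txt" &&
    cmp -s "$workdir/in.txt" "$workdir/out.txt"
    status=$?

    # Remove the local scratch files whatever the outcome.
    rm -rf "$workdir"
    return "$status"
}

# Run the check only where the hadoop command is available.
if command -v hadoop >/dev/null 2>&1; then
    if hdfs_roundtrip; then
        echo "HDFS round trip OK"
    else
        echo "HDFS round trip FAILED"
    fi
fi
```

A failure at any step (directory creation, upload, download, or comparison) makes the function return non-zero, so the result can also be used in other scripts.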