System preparation
- Three machines in total: hadoop-01, hadoop-02, hadoop-03
- hadoop-01 runs the NameNode (active); hadoop-02 runs the NameNode (standby)
- hadoop-02 runs the ResourceManager (active); hadoop-03 runs the ResourceManager (standby)
- All three machines act as DataNodes and NodeManagers
- All three machines act as JournalNodes
- Install the snappy compression library on all three machines:
yum install -y snappy
Hadoop installation and configuration
- Hadoop installation
Upload the software and extract it into the ~/app directory: tar -zxvf hadoop-{versionid}.tar.gz
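The upload-and-unpack step can be sketched as follows. "2.7.3" is an assumed version string (substitute the version of the tarball you actually uploaded), and HADOOP_HOME is exported so the ${HADOOP_HOME}/bin and ${HADOOP_HOME}/sbin scripts used later resolve:

```shell
# Sketch of the unpack step; "2.7.3" is an assumed version string --
# substitute the version of the tarball you uploaded.
VERSION=2.7.3
mkdir -p ~/app
# Unpack if the tarball is present in the home directory.
if [ -f ~/hadoop-${VERSION}.tar.gz ]; then
  tar -zxf ~/hadoop-${VERSION}.tar.gz -C ~/app
fi
# Export HADOOP_HOME so the bin/ and sbin/ scripts referenced below can
# be found conveniently (add these lines to ~/.bashrc to persist them).
export HADOOP_HOME=~/app/hadoop-${VERSION}
export PATH="$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin"
```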
- Hadoop configuration
After Hadoop is installed, the following main configuration files need to be edited:
${HADOOP_HOME}/etc/hadoop/core-site.xml
${HADOOP_HOME}/etc/hadoop/hdfs-site.xml
${HADOOP_HOME}/etc/hadoop/yarn-site.xml
${HADOOP_HOME}/etc/hadoop/slaves
[Note: the data directory paths referenced in the configuration values must be created manually in advance.]
For core-site.xml:
<configuration>
<!-- #################### need modify #################### -->
<property>
<name>hadoop.tmp.dir</name>
<value>/path/to/hadoop/tmp</value>
<description>A base for other temporary directories.</description>
</property>
<!-- #################### need modify #################### -->
<property>
<name>fs.defaultFS</name>
<value>hdfs://hyz</value> <!-- used in HA mode -->
</property>
<property>
<name>io.compression.codecs</name>
<value>org.apache.hadoop.io.compress.GzipCodec,org.apache.hadoop.io.compress.DefaultCodec,org.apache.hadoop.io.compress.BZip2Codec,org.apache.hadoop.io.compress.SnappyCodec</value>
</property>
<property>
<name>ipc.client.connect.timeout</name>
<value>120000</value>
</property>
<property>
<name>ha.failover-controller.cli-check.rpc-timeout.ms</name>
<value>120000</value>
</property>
</configuration>
For hdfs-site.xml:
Hadoop officially provides two ways to configure HDFS HA: one based on QJM, and one based on NFS.
QJM (the Quorum Journal Manager): EditLog data is shared through a set of JournalNodes, using a Paxos-like quorum protocol (the same family of algorithms that ZooKeeper's consensus is built on) to keep the EditLogs of the active and standby NameNodes consistent.
NFS (Network File System), i.e. conventional shared storage: a network storage device (such as a NAS) is mounted on the servers; the active NameNode writes EditLog changes to the NFS mount, and the standby NameNode reads them in as soon as it detects a modification, so the two NameNodes stay consistent. In terms of deployment, QJM only requires starting a few JournalNode processes, while NFS requires mounting a shared storage device.
In terms of configuration, the only difference in hdfs-site.xml is the value of the dfs.namenode.shared.edits.dir property.
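To make the difference concrete, here is the same property in both forms; the QJM value matches the cluster used in this article, while the NFS mount path is purely hypothetical:

```xml
<!-- QJM form (used in this article): a quorum URI listing the JournalNodes -->
<property>
<name>dfs.namenode.shared.edits.dir</name>
<value>qjournal://hadoop-01:8485;hadoop-02:8485;hadoop-03:8485/hyz</value>
</property>
<!-- NFS form (for comparison only): a locally mounted shared-storage path;
     /mnt/filer/hadoop/ha is a hypothetical mount point -->
<property>
<name>dfs.namenode.shared.edits.dir</name>
<value>file:///mnt/filer/hadoop/ha</value>
</property>
```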
This article uses the QJM approach to configure HDFS HA; the hdfs-site.xml content is as follows:
<configuration>
<!-- #################### need modify #################### -->
<property>
<name>dfs.datanode.data.dir</name>
<value>file:///path/to/hadoop/data/datanode</value>
</property>
<!-- #################### need modify #################### -->
<!-- #################### namenode ha #################### -->
<property>
<name>dfs.nameservices</name>
<value>hyz</value>
</property>
<property>
<name>dfs.ha.namenodes.hyz</name>
<value>nn1,nn2</value>
</property>
<property>
<name>dfs.namenode.rpc-address.hyz.nn1</name>
<value>hadoop-01:8020</value>
</property>
<property>
<name>dfs.namenode.rpc-address.hyz.nn2</name>
<value>hadoop-02:8020</value>
</property>
<property>
<name>dfs.namenode.http-address.hyz.nn1</name>
<value>hadoop-01:50070</value>
</property>
<property>
<name>dfs.namenode.http-address.hyz.nn2</name>
<value>hadoop-02:50070</value>
</property>
<property>
<name>dfs.namenode.servicerpc-address.hyz.nn1</name>
<value>hadoop-01:53310</value>
</property>
<property>
<name>dfs.namenode.servicerpc-address.hyz.nn2</name>
<value>hadoop-02:53310</value>
</property>
<property>
<name>dfs.namenode.shared.edits.dir</name>
<value>qjournal://hadoop-01:8485;hadoop-02:8485;hadoop-03:8485/hyz</value>
</property>
<property>
<name>dfs.journalnode.edits.dir</name>
<value>/path/to/data/ha/journal</value>
</property>
<property>
<name>dfs.journalnode.rpc-address</name>
<value>0.0.0.0:8485</value>
</property>
<property>
<name>dfs.journalnode.http-address</name>
<value>0.0.0.0:8482</value>
</property>
<property>
<name>dfs.client.failover.proxy.provider.hyz</name>
<value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>
<property>
<name>dfs.ha.fencing.methods</name>
<value>sshfence</value>
</property>
<property>
<name>dfs.ha.fencing.ssh.private-key-files</name>
<value>/path/to/.ssh/id_rsa</value>
</property>
<property>
<name>dfs.ha.fencing.ssh.connect-timeout</name>
<value>10000</value>
</property>
<property>
<name>dfs.ha.automatic-failover.enabled</name>
<value>true</value>
</property>
<property>
<name>ha.zookeeper.quorum</name>
<value>zkF01:2181,zkF02:2181,zkF03:2181</value>
</property>
<!-- #################### namenode ha #################### -->
<!-- #################### namenode #################### -->
<property>
<name>dfs.namenode.name.dir</name>
<value>file:///path/to/hadoop/data/namenode</value>
</property>
<property>
<name>dfs.namenode.acls.enabled</name>
<value>true</value>
</property>
<property>
<name>dfs.namenode.handler.count</name>
<value>50</value>
<description>The number of server threads for the namenode.</description>
</property>
<!-- #################### namenode #################### -->
<!-- #################### datanode #################### -->
<property>
<name>dfs.datanode.handler.count</name>
<value>20</value>
<description>The number of server threads for the datanode.</description>
</property>
<property>
<name>dfs.datanode.max.xcievers</name>
<value>8192</value>
</property>
<property>
<name>dfs.datanode.socket.write.timeout</name>
<value>480000</value>
</property>
<property>
<name>dfs.datanode.hdfs-blocks-metadata.enabled</name>
<value>true</value>
</property>
<property>
<name>dfs.datanode.failed.volumes.tolerated</name>
<value>0</value>
</property>
<!-- #################### datanode #################### -->
<property>
<name>dfs.permissions.enabled</name>
<value>true</value>
<description>If "true", enable permission checking in HDFS. If "false", permission checking is turned off, but all other behavior is unchanged. Switching from one parameter value to the other does not change the mode, owner or group of files or directories.</description>
</property>
<property>
<name>dfs.permissions</name>
<value>false</value>
</property>
<property>
<name>dfs.replication.min</name>
<value>3</value>
<description>Minimal block replication.</description>
</property>
<property>
<name>dfs.replication</name>
<value>3</value>
</property>
<property>
<name>dfs.support.append</name>
<value>true</value>
<description>Whether HDFS supports appending to existing files.</description>
</property>
<property>
<name>fs.checkpoint.period</name>
<value>60</value>
<description>The number of seconds between two periodic checkpoints.</description>
</property>
<property>
<name>dfs.balance.bandwidthPerSec</name>
<value>10485760</value>
<description>Specifies the maximum bandwidth that each datanode can utilize for the balancing purpose, in terms of the number of bytes per second.</description>
</property>
<property>
<name>fs.hdfs.impl.disable.cache</name>
<value>false</value>
</property>
<property>
<name>dfs.socket.timeout</name>
<value>1800000</value>
</property>
<property>
<name>fs.trash.interval</name>
<value>1440</value>
<description>Number of minutes between trash checkpoints. If zero, the trash feature is disabled.</description>
</property>
<property>
<name>dfs.blocksize</name>
<value>134217728</value>
</property>
</configuration>
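Some of the large numeric values above are easier to sanity-check as products; for example, dfs.blocksize is 128 MB and dfs.balance.bandwidthPerSec is 10 MB/s, both expressed in bytes:

```shell
# 128 MB block size and 10 MB/s balancer bandwidth, in bytes
echo $((128 * 1024 * 1024))   # dfs.blocksize
echo $((10 * 1024 * 1024))    # dfs.balance.bandwidthPerSec
```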
For yarn-site.xml:
<configuration>
<!-- #################### need modify #################### -->
<property>
<name>yarn.nodemanager.resource.cpu-vcores</name>
<value>number of CPU cores</value>
</property>
<property>
<name>yarn.nodemanager.resource.memory-mb</name>
<value>memory size in MB</value>
</property>
<property>
<name>yarn.nodemanager.local-dirs</name>
<value>/path/to/yarn/yarnlocal</value>
</property>
<property>
<name>yarn.nodemanager.log-dirs</name>
<value>/path/to/yarn/yarnlog</value>
</property>
<property>
<name>yarn.nodemanager.remote-app-log-dir</name>
<value>/path/to/yarn/remote-app-logs</value>
</property>
<!-- #################### need modify #################### -->
<!-- #################### resource manager ha #################### -->
<property>
<name>yarn.resourcemanager.ha.enabled</name>
<value>true</value>
</property>
<property>
<name>yarn.resourcemanager.ha.automatic-failover.enabled</name>
<value>true</value>
</property>
<property>
<name>yarn.resourcemanager.cluster-id</name>
<value>yarn-cluster</value>
</property>
<property>
<name>yarn.resourcemanager.ha.rm-ids</name>
<value>rm1,rm2</value>
</property>
<property>
<name>yarn.resourcemanager.ha.id</name>
<value>rm1</value> <!-- note: change this to rm2 on the rm2 machine -->
</property>
<property>
<name>yarn.resourcemanager.store.class</name>
<value>org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore</value>
</property>
<property>
<name>yarn.resourcemanager.zk-address</name>
<value>zkF01:2181,zkF02:2181,zkF03:2181</value>
</property>
<property>
<name>yarn.resourcemanager.zk.state-store.address</name>
<value>zkF01:2181,zkF02:2181,zkF03:2181</value>
</property>
<property>
<name>ha.zookeeper.quorum</name>
<value>zkF01:2181,zkF02:2181,zkF03:2181</value>
</property>
<property>
<name>yarn.resourcemanager.recovery.enabled</name>
<value>true</value>
</property>
<property>
<name>yarn.app.mapreduce.am.scheduler.connection.wait.interval-ms</name>
<value>5000</value>
</property>
<property>
<name>mapreduce.jobhistory.address</name>
<value>hadoop-02:10020</value>
</property>
<property>
<name>mapreduce.jobhistory.webapp.address</name>
<value>hadoop-02:19888</value>
</property>
<property>
<name>yarn.log-aggregation-enable</name>
<value>true</value>
</property>
<!-- RM1 configs -->
<property>
<name>yarn.resourcemanager.address.rm1</name>
<value>hadoop-02:8032</value>
</property>
<property>
<name>yarn.resourcemanager.scheduler.address.rm1</name>
<value>hadoop-02:8030</value>
</property>
<property>
<name>yarn.resourcemanager.webapp.address.rm1</name>
<value>hadoop-02:8088</value>
</property>
<property>
<name>yarn.resourcemanager.resource-tracker.address.rm1</name>
<value>hadoop-02:8031</value>
</property>
<property>
<name>yarn.resourcemanager.admin.address.rm1</name>
<value>hadoop-02:8033</value>
</property>
<property>
<name>yarn.resourcemanager.ha.admin.address.rm1</name>
<value>hadoop-02:23142</value>
</property>
<!-- RM2 configs -->
<property>
<name>yarn.resourcemanager.address.rm2</name>
<value>hadoop-03:8032</value>
</property>
<property>
<name>yarn.resourcemanager.scheduler.address.rm2</name>
<value>hadoop-03:8030</value>
</property>
<property>
<name>yarn.resourcemanager.webapp.address.rm2</name>
<value>hadoop-03:8088</value>
</property>
<property>
<name>yarn.resourcemanager.resource-tracker.address.rm2</name>
<value>hadoop-03:8031</value>
</property>
<property>
<name>yarn.resourcemanager.admin.address.rm2</name>
<value>hadoop-03:8033</value>
</property>
<property>
<name>yarn.resourcemanager.ha.admin.address.rm2</name>
<value>hadoop-03:23142</value>
</property>
<!-- #################### resource manager ha #################### -->
<!-- #################### node manager #################### -->
<!-- Node Manager Configs -->
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
<property>
<description>Address where the localizer IPC is.</description>
<name>yarn.nodemanager.localizer.address</name>
<value>0.0.0.0:8040</value>
</property>
<property>
<description>NM Webapp address.</description>
<name>yarn.nodemanager.webapp.address</name>
<value>0.0.0.0:8042</value>
</property>
<property>
<name>yarn.nodemanager.log.retain-seconds</name>
<value>10800</value>
</property>
<property>
<name>yarn.log-aggregation.retain-seconds</name>
<value>43200</value>
</property>
<property>
<name>yarn.log-aggregation.retain-check-interval-seconds</name>
<value>7200</value>
</property>
<!-- #################### node manager #################### -->
</configuration>
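One file not listed above is ${HADOOP_HOME}/etc/hadoop/mapred-site.xml; on a YARN cluster, MapReduce jobs generally also need mapreduce.framework.name set there, and the jobhistory addresses are conventionally placed in that file as well. A minimal sketch (the jobhistory values mirror those configured above):

```xml
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
<property>
<name>mapreduce.jobhistory.address</name>
<value>hadoop-02:10020</value>
</property>
<property>
<name>mapreduce.jobhistory.webapp.address</name>
<value>hadoop-02:19888</value>
</property>
</configuration>
```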
For the slaves file:
hadoop-01
hadoop-02
hadoop-03
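The data directory paths referenced in the configuration files must exist before the first start. A sketch that creates them under a hypothetical base path (run on every node, substituting the real paths used in your configs):

```shell
# BASE is a hypothetical stand-in for the "/path/to" placeholders in the
# configuration files above -- replace it with your actual base path.
BASE=${BASE:-/tmp/hadoop-base}
mkdir -p "$BASE/hadoop/tmp" \
         "$BASE/hadoop/data/datanode" \
         "$BASE/hadoop/data/namenode" \
         "$BASE/data/ha/journal" \
         "$BASE/yarn/yarnlocal" \
         "$BASE/yarn/yarnlog"
ls "$BASE"
```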
Hadoop startup
- In the ${HADOOP_HOME}/etc/hadoop/hadoop-env.sh file, manually set export JAVA_HOME=/opt/jdk (an absolute path is required)
- On nn1 (below, nn1/nn2/nn3 refer to hadoop-01/hadoop-02/hadoop-03):
${HADOOP_HOME}/bin/hdfs zkfc -formatZK
- On nn1/nn2/nn3:
${HADOOP_HOME}/sbin/hadoop-daemon.sh start journalnode
- On nn1:
${HADOOP_HOME}/bin/hdfs namenode -format
${HADOOP_HOME}/sbin/hadoop-daemon.sh start namenode
- On nn2:
${HADOOP_HOME}/bin/hdfs namenode -bootstrapStandby
${HADOOP_HOME}/sbin/hadoop-daemon.sh start namenode
- On nn1:
${HADOOP_HOME}/sbin/hadoop-daemons.sh start datanode
- On nn1/nn2:
${HADOOP_HOME}/sbin/hadoop-daemon.sh start zkfc
- On nn2/nn3:
${HADOOP_HOME}/sbin/start-yarn.sh
- On nn2:
${HADOOP_HOME}/sbin/mr-jobhistory-daemon.sh start historyserver
Hadoop verification
- Check the process status on each machine with the jps command:
(screenshot: jps output on hadoop-01)
(screenshot: jps output on hadoop-02)
(screenshot: jps output on hadoop-03)
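Matching the role assignment from the preparation section, jps should show roughly the following daemons per host (a sketch; JobHistoryServer appears on hadoop-02 only after the historyserver step):

```shell
# Expected daemon layout per host, derived from the role assignment above;
# written to a scratch file so it is easy to diff against real jps output.
LAYOUT='hadoop-01: NameNode DataNode NodeManager JournalNode DFSZKFailoverController
hadoop-02: NameNode DataNode NodeManager JournalNode DFSZKFailoverController ResourceManager JobHistoryServer
hadoop-03: DataNode NodeManager JournalNode ResourceManager'
printf '%s\n' "$LAYOUT" | tee /tmp/expected-daemons.txt
```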
- In a browser, open http://hadoop-01:50070 to check the NameNode status (the HA state can also be queried with ${HADOOP_HOME}/bin/hdfs haadmin -getServiceState nn1)
(screenshot: NameNode status)



