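Once Hadoop HA is configured, the NameNode is no longer a single point of failure, but a new problem follows: the load on the NameNode. As the volume of data grows, the pressure on the NameNode rises and timeouts become more frequent. Running several NameNodes that share the work relieves that pressure, which is why Hadoop introduced the Federation mechanism.
If you run an HBase cluster on top of your Hadoop cluster, HBase is the first workload you will want to split onto its own NameNode. The nameservices then become hadoop and hbase.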
Start by editing hdfs-site.xml on the existing HDFS cluster and adding the new nameservice, hbase, to dfs.nameservices:
<property>
<name>dfs.nameservices</name>
<value>hadoop,hbase</value>
<description>Comma-separated list of nameservices.</description>
</property>
Then add the configuration specific to the hbase nameservice:
<property>
<name>dfs.ha.namenodes.hbase</name>
<value>nn4,nn5</value>
<description>
The prefix for a given nameservice, contains a comma-separated
list of namenodes for a given nameservice (eg EXAMPLENAMESERVICE).
</description>
</property>
<property>
<name>dfs.namenode.rpc-address.hbase.nn4</name>
<value>nn4ss:8020</value>
</property>
<property>
<name>dfs.namenode.rpc-address.hbase.nn5</name>
<value>nn5ss:8020</value>
</property>
<property>
<name>dfs.namenode.http-address.hbase.nn4</name>
<value>nn4ss:50070</value>
</property>
<property>
<name>dfs.namenode.http-address.hbase.nn5</name>
<value>nn5ss:50070</value>
</property>
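HA clients also need a way to find the active NameNode of the new nameservice. The property below is a sketch that is not part of the original configuration. It assumes the hbase nameservice uses the same stock failover proxy provider that an HA setup for the hadoop nameservice would normally already declare.
<!-- Assumption: not shown in the original post; uses Hadoop's stock failover proxy provider -->
<property>
<name>dfs.client.failover.proxy.provider.hbase</name>
<value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>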
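Once a single NameNode has been split into several, a unified viewFS should be placed on top of them; otherwise every command has to name the nameservice explicitly. So that a command such as ls works without naming a nameservice, mount all nameservices under one viewFS and change fs.defaultFS in core-site.xml so that it points to that viewFS by default.
Change fs.defaultFS in core-site.xml to viewfs://Federation/: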
<property>
<name>fs.defaultFS</name>
<value>viewfs://Federation/</value>
<description>Specifies the NameNode and the default file system,
in the form hdfs://namenode-host:namenode-port/. The default value is file:///.
The default file system is used to resolve relative paths; for example, if fs.default.name or fs.defaultFS
is set to hdfs://mynamenode/, the relative URI /mydir/myfile resolves to hdfs://mynamenode/mydir/myfile.
Note: for the cluster to function correctly, the namenode part of the string
must be the hostname (for example mynamenode) not the IP address.</description>
</property>
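Once the unified viewFS is in place, the paths under it have to correspond to real paths in the nameservices. Create a mountTable.xml file that states how each path under the viewFS maps to a path in a nameservice.
Add the following to core-site.xml: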
<xi:include href="/etc/hadoop/conf/mountTable.xml" />
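For the include to be resolved, the xi prefix has to be declared somewhere. The original post does not show that part, so the following is only a sketch of how the surrounding core-site.xml might look, assuming the XInclude namespace is declared on the root configuration element.
<!-- Assumption: the xmlns:xi declaration is not shown in the original post -->
<configuration xmlns:xi="http://www.w3.org/2001/XInclude">
<xi:include href="/etc/hadoop/conf/mountTable.xml" />
<!-- other core-site.xml properties, such as fs.defaultFS, go here -->
</configuration>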
Create mountTable.xml and define the mapping between paths under the viewFS and paths in the nameservices:
<property>
<name>fs.viewfs.mounttable.Federation.homedir</name>
<value>/user</value>
</property>
<property>
<name>fs.viewfs.mounttable.Federation.link./user</name>
<value>hdfs://hadoop/user</value>
</property>
<property>
<name>fs.viewfs.mounttable.Federation.link./data</name>
<value>hdfs://hadoop/data</value>
</property>
<property>
<name>fs.viewfs.mounttable.Federation.link./tmp</name>
<value>hdfs://hadoop/tmp</value>
</property>
<property>
<name>fs.viewfs.mounttable.Federation.link./hbase</name>
<value>hdfs://hbase/hbase</value>
</property>
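fs.viewfs.mounttable.Federation.link./hbase specifies where HBase lives on HDFS. Because HBase has been split off from the original single NameNode and given its own NameNodes (nn4, nn5), /hbase under the viewFS is mapped to a path in the hbase nameservice.
At this point the hadoop nameservice, that is, the first NameNode pair, is fully configured. To bring up the second nameservice, first install Hadoop on its hosts (nn4, nn5) and copy over the configuration files prepared above. Then change the settings that are specific to the new NameNodes.
In this setup each nameservice has its own QJM and its own HA ZooKeeper quorum, so both have to be set separately for each nameservice.
For example, the configuration on the hbase nameservice is changed as follows: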
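The mount table properties are listed above without an enclosing element. Assuming mountTable.xml follows the usual Hadoop configuration file layout, the file as a whole would be shaped roughly like this sketch, which repeats only the /hbase link:
<!-- Assumption: standard Hadoop configuration root element, not shown in the original post -->
<configuration>
<property>
<name>fs.viewfs.mounttable.Federation.link./hbase</name>
<value>hdfs://hbase/hbase</value>
</property>
<!-- the remaining mount table properties go here unchanged -->
</configuration>
With such a table, a client path like /hbase/data (a hypothetical example path) would resolve to hdfs://hbase/hbase/data, while /user/alice (also hypothetical) would resolve to hdfs://hadoop/user/alice.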
<property>
<name>dfs.namenode.shared.edits.dir</name>
<value>qjournal://nn4ss:8485;nn5ss:8485;nn6ss:8485/hbaseQjournal</value>
<description>A directory on shared storage between the multiple namenodes
in an HA cluster. This directory will be written by the active and read
by the standby in order to keep the namespaces synchronized. This directory
does not need to be listed in dfs.namenode.edits.dir above. It should be
left empty in a non-HA cluster.
</description>
</property>
<property>
<name>ha.zookeeper.quorum</name>
<value>nn4ss:2181,nn5ss:2181,nn6ss:2181</value>
<description>
A list of ZooKeeper server addresses, separated by commas, that are
to be used by the ZKFailoverController in automatic failover.
</description>
</property>
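The ZooKeeper quorum is only consulted when automatic failover is switched on. The original post does not show that switch, so the property below is a sketch that assumes automatic failover is intended for the hbase nameservice and is enabled in hdfs-site.xml.
<!-- Assumption: automatic failover is wanted; this property is not shown in the original post -->
<property>
<name>dfs.ha.automatic-failover.enabled</name>
<value>true</value>
</property>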
References:
https://hadoop.apache.org/docs/r2.7.2/hadoop-project-dist/hadoop-hdfs/Federation.html
http://blog.csdn.net/skywalker_only/article/details/40373643