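I. What the JobHistory server is for
- View the history of every finished job in a browser, including the number of map tasks, the number of reduce tasks, and the task logs;
- Logs from all nodes are aggregated into HDFS, so they can be browsed from one place;
II. How to configure it
1. Configure mapred-site.xml
- Location of the configuration file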
[root@hadoop003 hadoop]# ll
total 152
-rw-r--r-- 1 root root 4436 Jan 31 14:54 capacity-scheduler.xml
-rw-r--r-- 1 root root 1335 Jan 31 14:54 configuration.xsl
-rw-r--r-- 1 root root 318 Jan 31 14:54 container-executor.cfg
-rw-r--r-- 1 root root 1886 Feb 5 09:35 core-site.xml
-rw-r--r-- 1 root root 3589 Jan 31 14:54 hadoop-env.cmd
-rw-r--r-- 1 root root 4238 Jan 31 14:54 hadoop-env.sh
-rw-r--r-- 1 root root 2598 Jan 31 14:54 hadoop-metrics2.properties
-rw-r--r-- 1 root root 2490 Jan 31 14:54 hadoop-metrics.properties
-rw-r--r-- 1 root root 9683 Jan 31 14:54 hadoop-policy.xml
-rw-r--r-- 1 root root 3547 Feb 5 10:22 hdfs-site.xml
-rw-r--r-- 1 root root 1449 Jan 31 14:54 httpfs-env.sh
-rw-r--r-- 1 root root 1657 Jan 31 14:54 httpfs-log4j.properties
-rw-r--r-- 1 root root 21 Jan 31 14:54 httpfs-signature.secret
-rw-r--r-- 1 root root 620 Jan 31 14:54 httpfs-site.xml
-rw-r--r-- 1 root root 3518 Jan 31 14:54 kms-acls.xml
-rw-r--r-- 1 root root 1527 Jan 31 14:54 kms-env.sh
-rw-r--r-- 1 root root 1631 Jan 31 14:54 kms-log4j.properties
-rw-r--r-- 1 root root 5511 Jan 31 14:54 kms-site.xml
-rw-r--r-- 1 root root 11237 Jan 31 14:54 log4j.properties
-rw-r--r-- 1 root root 931 Jan 31 14:54 mapred-env.cmd
-rw-r--r-- 1 root root 1383 Jan 31 14:54 mapred-env.sh
-rw-r--r-- 1 root root 4113 Jan 31 14:54 mapred-queues.xml.template
-rw-r--r-- 1 root root 1479 Feb 2 10:41 mapred-site.xml
-rw-r--r-- 1 root root 30 Jan 31 14:54 slaves
-rw-r--r-- 1 root root 2316 Jan 31 14:54 ssl-client.xml.example
-rw-r--r-- 1 root root 2268 Jan 31 14:54 ssl-server.xml.example
-rw-r--r-- 1 root root 2191 Jan 31 14:54 yarn-env.cmd
-rw-r--r-- 1 root root 4567 Jan 31 14:54 yarn-env.sh
-rw-r--r-- 1 root root 2276 Feb 2 10:50 yarn-site.xml
[root@hadoop003 hadoop]# pwd
/opt/module/hadoop-2.7.3/etc/hadoop
- Edit the configuration
[root@hadoop003 hadoop]# cat mapred-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. See accompanying LICENSE file.
-->
<!-- Put site-specific property overrides in this file. -->
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
<!-- HDFS directory where MapReduce jobs write their history files before the history server moves them -->
<property>
<name>mapreduce.jobhistory.intermediate-done-dir</name>
<value>/hadoop/history/done_intermediate</value>
</property>
<!-- HDFS directory where the history files of finished jobs are kept -->
<property>
<name>mapreduce.jobhistory.done-dir</name>
<value>/hadoop/history/done</value>
</property>
<!-- Host and port of the JobHistory web UI -->
<property>
<name>mapreduce.jobhistory.webapp.address</name>
<value>hadoop003:19888</value>
</property>
<!-- Host and port of the JobHistory server itself (IPC) -->
<property>
<name>mapreduce.jobhistory.address</name>
<value>hadoop003:10020</value>
</property>
</configuration>
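To double-check what was actually written to the file, a small helper like the one below can print the value of a single property. This is a minimal sketch: `get_hadoop_prop` is a hypothetical name, and the function assumes the one-element-per-line layout shown above rather than parsing XML in general.

```shell
# Print the <value> that follows a given <name> in a Hadoop *-site.xml file.
# Assumption: <name> and <value> each sit on their own line, as in the file above.
get_hadoop_prop() {
    conf_file=$1
    prop_name=$2
    awk -v name="$prop_name" '
        # Match the full <name>...</name> element so similar keys do not collide.
        index($0, "<name>" name "</name>") { found = 1; next }
        found && /<value>/ {
            sub(/.*<value>/, "")
            sub(/<\/value>.*/, "")
            print
            exit
        }
    ' "$conf_file"
}
```

For example, `get_hadoop_prop mapred-site.xml mapreduce.jobhistory.webapp.address` run in the configuration directory should print the host and port of the web UI.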
- Copy the modified files to the other nodes
scp mapred-site.xml yarn-site.xml hadoop001:`pwd`
scp mapred-site.xml yarn-site.xml hadoop002:`pwd`
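With more than a couple of nodes, the copy step is easier to repeat as a loop. The sketch below assumes the same directory layout on every node and passwordless SSH; `push_conf` and the `COPY_CMD` variable are hypothetical names introduced only for this example.

```shell
# Copy one config file to the current directory's path on each listed node.
# Set COPY_CMD="echo scp" to preview the commands without copying anything.
push_conf() {
    conf_file=$1
    shift
    conf_dir=$(pwd)
    for node in "$@"; do
        # COPY_CMD is left unquoted on purpose so "echo scp" splits into two words.
        ${COPY_CMD:-scp} "$conf_file" "$node:$conf_dir/" || return 1
    done
}
```

Run from /opt/module/hadoop-2.7.3/etc/hadoop, `push_conf mapred-site.xml hadoop001 hadoop002` would copy the file to both nodes; call it once per file that changed.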
III. Start JobHistory
1. Restart YARN so that the updated configuration files take effect
[root@hadoop003 module]# stop-yarn.sh
stopping yarn daemons
stopping resourcemanager
hadoop002: stopping nodemanager
hadoop003: stopping nodemanager
hadoop001: stopping nodemanager
no proxyserver to stop
[root@hadoop003 module]# jps
16768 QuorumPeerMain
20163 Jps
17157 DataNode
17049 JournalNode
[root@hadoop003 module]# start-yarn.sh
starting yarn daemons
starting resourcemanager, logging to /opt/module/hadoop-2.7.3/logs/yarn-hadoop-resourcemanager-hadoop003.out
hadoop002: starting nodemanager, logging to /opt/module/hadoop-2.7.3/logs/yarn-root-nodemanager-hadoop002.out
hadoop001: starting nodemanager, logging to /opt/module/hadoop-2.7.3/logs/yarn-root-nodemanager-hadoop001.out
hadoop003: starting nodemanager, logging to /opt/module/hadoop-2.7.3/logs/yarn-root-nodemanager-hadoop003.out
[root@hadoop003 module]# jps
16768 QuorumPeerMain
20738 Jps
20291 ResourceManager
17157 DataNode
20405 NodeManager
17049 JournalNode
2. Start the JobHistory server
[root@hadoop003 sbin]# mr-jobhistory-daemon.sh start historyserver
starting historyserver, logging to /opt/module/hadoop-2.7.3/logs/mapred-hadoop-historyserver-hadoop003.out
- Check the Java processes; a new JobHistoryServer process has appeared
[root@hadoop003 sbin]# jps
25200 JobHistoryServer
25296 Jps
2291 QuorumPeerMain
11907 JournalNode
11796 DataNode
20824 Worker
23017 NodeManager
22907 ResourceManager
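Reading the jps listing by eye works on one node but gets tedious across a cluster. As a rough sketch, the check can be scripted by feeding jps output to a function that looks for the expected daemon names; `daemons_running` is a hypothetical helper, not part of Hadoop.

```shell
# Succeed only if every daemon named in the arguments appears in the
# jps-style output read from stdin (lines of the form "<pid> <main class>").
daemons_running() {
    jps_output=$(cat)
    for daemon in "$@"; do
        printf '%s\n' "$jps_output" |
            awk -v d="$daemon" '$2 == d { ok = 1 } END { exit !ok }' || {
                echo "missing: $daemon" >&2
                return 1
            }
    done
}
```

On the history server node this could be used as `jps | daemons_running JobHistoryServer ResourceManager NodeManager`; a non-zero exit status means at least one of them is not running.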
IV. View job and task logs
1. Open the YARN resource management web UI
2. Click the job ID
3. Click History
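The same history information is also available without a browser through the JobHistory server's REST interface under /ws/v1/history/mapreduce/jobs. The sketch below only builds the URL; `history_url` and the `JHS_WEBAPP` variable are hypothetical names, and the default host and port are taken from the mapreduce.jobhistory.webapp.address value configured earlier.

```shell
# Build the JobHistory REST URL for one job; with no argument, the URL lists all jobs.
# JHS_WEBAPP (hypothetical) overrides the web UI host:port used in this walkthrough.
history_url() {
    job_id=${1:-}
    base="http://${JHS_WEBAPP:-hadoop003:19888}/ws/v1/history/mapreduce/jobs"
    if [ -n "$job_id" ]; then
        printf '%s/%s\n' "$base" "$job_id"
    else
        printf '%s\n' "$base"
    fi
}

# Example (needs a running JobHistoryServer; the job ID is a placeholder):
# curl -s -H 'Accept: application/json' "$(history_url job_1234567890123_0001)"
```

For the container logs themselves, `yarn logs -applicationId <application ID>` is the command-line counterpart, assuming log aggregation is enabled in yarn-site.xml.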