Deploying and Submitting Jobs with Flink on YARN
1. Environment Preparation
Ubuntu
Hadoop 2.6.0 (downloaded from the official website)
Flink 1.12.2
JDK 8
2. Hadoop Fully Distributed Mode: YARN Configuration
Permanently disable the firewall
Set the hostname mapping
vim /etc/hosts
192.168.73.130 hadoop01
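Editing /etc/hosts by hand works, but on repeated setups it is easy to add the same line twice. As a minimal sketch (the helper name add_host_entry is hypothetical, not part of any tool), the entry could be appended only when the hostname is not yet mapped:

```shell
# Append "ip hostname" to a hosts file unless a non-comment line already maps that hostname.
add_host_entry() {
  hosts_file="$1"; ip="$2"; name="$3"
  if ! awk -v n="$name" '
         $1 !~ /^#/ { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
         END { exit !found }
       ' "$hosts_file" 2>/dev/null; then
    printf '%s %s\n' "$ip" "$name" >> "$hosts_file"
  fi
}
```

Run it as root, for example: add_host_entry /etc/hosts 192.168.73.130 hadoop01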
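Set the environment variables (for example in /etc/profile)
export JAVA_HOME=/usr/lib/jdk
export HADOOP_HOME=/home/ad/hadoop-2.6.0
export HADOOP_PREFIX=$HADOOP_HOME
export HADOOP_CLASSPATH=`${HADOOP_HOME}/bin/hadoop classpath`
# Flink install path as it appears in the Flink logs later in this guide
export FLINK_HOME=/home/ad/flink/flink-1.12.2
export PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$FLINK_HOME/bin
Apply the environment variables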
$ source /etc/profile
Verify the installation
$ hadoop version
Hadoop 2.6.0
Subversion https://git-wip-us.apache.org/repos/asf/hadoop.git -r e3496499ecb8d220fba99dc5ed4c99c8f9e33bb1
Compiled by jenkins on 2014-11-13T21:10Z
Compiled with protoc 2.5.0
From source with checksum 18e43357c8f927c0695f1e9522859d6a
This command was run using /home/ad/hadoop-2.6.0/share/hadoop/common/hadoop-common-2.6.0.jar
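The Flink client log in the Per-job section warns that neither HADOOP_CONF_DIR nor YARN_CONF_DIR is set. A small sketch of how that could be addressed alongside the variables above, assuming the Hadoop 2.x layout where the configuration lives under $HADOOP_HOME/etc/hadoop (report_unset_vars is a hypothetical helper name):

```shell
# Assumption: Hadoop 2.x keeps its configuration under $HADOOP_HOME/etc/hadoop.
export HADOOP_HOME="${HADOOP_HOME:-/home/ad/hadoop-2.6.0}"
export HADOOP_CONF_DIR="$HADOOP_HOME/etc/hadoop"

# Print the name of every listed variable that is unset or empty.
report_unset_vars() {
  for var_name in "$@"; do
    eval "var_value=\${$var_name:-}"
    [ -n "$var_value" ] || echo "$var_name"
  done
}
```

Calling report_unset_vars JAVA_HOME HADOOP_HOME HADOOP_CONF_DIR after sourcing /etc/profile prints nothing when all three are set.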
Passwordless SSH login
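The original gives no commands for this step. A common approach, sketched here under the assumption of a single node that logs in to itself as the same user (setup_passwordless_ssh is a hypothetical helper name), is to generate an RSA key pair without a passphrase and authorize its public key:

```shell
# Create a key pair (if missing) and authorize it for logins to this same account.
# The directory argument defaults to ~/.ssh; it exists only to make the sketch easy to try out.
setup_passwordless_ssh() {
  ssh_dir="${1:-$HOME/.ssh}"
  mkdir -p "$ssh_dir" && chmod 700 "$ssh_dir"
  # -N "" sets an empty passphrase; -q keeps ssh-keygen quiet.
  [ -f "$ssh_dir/id_rsa" ] || ssh-keygen -q -t rsa -N "" -f "$ssh_dir/id_rsa"
  touch "$ssh_dir/authorized_keys"
  # Append the public key only if it is not already authorized.
  grep -qxF -e "$(cat "$ssh_dir/id_rsa.pub")" "$ssh_dir/authorized_keys" ||
    cat "$ssh_dir/id_rsa.pub" >> "$ssh_dir/authorized_keys"
  chmod 600 "$ssh_dir/authorized_keys"
}
```

After running setup_passwordless_ssh, ssh hadoop01 should log in without asking for a password (the first connection still asks to confirm the host key).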
Edit the configuration files
core-site.xml
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://hadoop01:9000</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/opt/hadoop/tmp</value>
</property>
</configuration>
hadoop-env.sh mapred-env.sh yarn-env.sh
Set the JAVA_HOME path in each of these files
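Since the same change is needed in three files, it can be scripted. A minimal sketch, assuming the files live under $HADOOP_HOME/etc/hadoop and that placing the export at the top of each file is acceptable (set_java_home is a hypothetical helper name):

```shell
# Put "export JAVA_HOME=<path>" at the top of an env script and drop any
# existing uncommented JAVA_HOME export so it cannot override the new value.
set_java_home() {
  env_file="$1"; java_home="$2"
  tmp_file="$(mktemp)"
  {
    printf 'export JAVA_HOME=%s\n' "$java_home"
    grep -v '^[[:space:]]*export[[:space:]]\{1,\}JAVA_HOME=' "$env_file" || true
  } > "$tmp_file"
  # Write back through cat so the original file keeps its permissions.
  cat "$tmp_file" > "$env_file"
  rm -f "$tmp_file"
}
```

For example: for f in hadoop-env.sh mapred-env.sh yarn-env.sh; do set_java_home "$HADOOP_HOME/etc/hadoop/$f" /usr/lib/jdk; done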
hdfs-site.xml
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
</configuration>
yarn-site.xml
<configuration>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.resourcemanager.hostname</name>
<value>hadoop01</value>
</property>
</configuration>
cp mapred-site.xml.template mapred-site.xml
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>
slaves
hadoop01
- Format the NameNode on hadoop01
hadoop namenode -format
- Start the Hadoop cluster
$ start-all.sh
This script is Deprecated. Instead use start-dfs.sh and start-yarn.sh
Starting namenodes on [hadoop01]
hadoop01: starting namenode, logging to /home/ad/hadoop-2.6.0/logs/hadoop-root-namenode-ad-virtual-machine.out
hadoop01: starting datanode, logging to /home/ad/hadoop-2.6.0/logs/hadoop-root-datanode-ad-virtual-machine.out
Starting secondary namenodes [0.0.0.0]
0.0.0.0: starting secondarynamenode, logging to /home/ad/hadoop-2.6.0/logs/hadoop-root-secondarynamenode-ad-virtual-machine.out
starting yarn daemons
starting resourcemanager, logging to /home/ad/hadoop-2.6.0/logs/yarn-root-resourcemanager-ad-virtual-machine.out
hadoop01: starting nodemanager, logging to /home/ad/hadoop-2.6.0/logs/yarn-root-nodemanager-ad-virtual-machine.out
- Open hadoop01:8088 in a browser (the YARN ResourceManager web UI listens on port 8088 by default)
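Besides the web UI, the JDK's jps tool lists the running JVMs, which gives a quick check that all five daemons named in the startup log are up. A small sketch (missing_daemons is a hypothetical helper name) that reads jps output from stdin and prints whichever expected daemon is absent:

```shell
# Print each expected daemon that does not appear in jps-style output ("<pid> <name>") on stdin.
missing_daemons() {
  awk '
    BEGIN { n = split("NameNode DataNode SecondaryNameNode ResourceManager NodeManager", want, " ") }
    { seen[$2] = 1 }
    END { for (i = 1; i <= n; i++) if (!(want[i] in seen)) print want[i] }
  '
}
```

Usage: jps | missing_daemons. Empty output means every daemon was found.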
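3. Verifying Hadoop YARN
Create the HDFS data directories
Create a directory to hold the input files of the MapReduce job:
hadoop fs -mkdir -p /data/wordcount
Create a directory to hold the output files of the MapReduce job: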
hadoop fs -mkdir /output1
List the two directories just created:
hadoop fs -ls /
drwxr-xr-x - root supergroup 0 2017-09-01 20:34 /data
drwxr-xr-x - root supergroup 0 2017-09-01 20:35 /output1
(3) Create a word file and upload it to HDFS
The word file looks like this:
cat myword.txt
leaf yyh
yyh xpleaf
katy ling
yeyonghao leaf
xpleaf katy
Upload the file to HDFS:
hadoop fs -put myword.txt /data/wordcount
View the uploaded file and its contents in HDFS:
hadoop fs -ls /data/wordcount
-rw-r--r-- 1 root supergroup 57 2017-09-01 20:40 /data/wordcount/myword.txt
hadoop fs -cat /data/wordcount/myword.txt
leaf yyh
yyh xpleaf
katy ling
yeyonghao leaf
xpleaf katy
(4) Run the wordcount program
Run the following command:
hadoop jar /home/ad/hadoop-2.6.0/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.0.jar wordcount /data/wordcount /output/wordcount
...
17/09/01 20:48:14 INFO mapreduce.Job: Job job_local1719603087_0001 completed successfully
17/09/01 20:48:14 INFO mapreduce.Job: Counters: 38
File System Counters
FILE: Number of bytes read=585940
FILE: Number of bytes written=1099502
FILE: Number of read operations=0
FILE: Number of large read operations=0
FILE: Number of write operations=0
HDFS: Number of bytes read=114
HDFS: Number of bytes written=48
HDFS: Number of read operations=15
HDFS: Number of large read operations=0
HDFS: Number of write operations=4
Map-Reduce Framework
Map input records=5
Map output records=10
Map output bytes=97
Map output materialized bytes=78
Input split bytes=112
Combine input records=10
Combine output records=6
Reduce input groups=6
Reduce shuffle bytes=78
Reduce input records=6
Reduce output records=6
Spilled Records=12
Shuffled Maps =1
Failed Shuffles=0
Merged Map outputs=1
GC time elapsed (ms)=92
CPU time spent (ms)=0
Physical memory (bytes) snapshot=0
Virtual memory (bytes) snapshot=0
Total committed heap usage (bytes)=241049600
Shuffle Errors
BAD_ID=0
CONNECTION=0
IO_ERROR=0
WRONG_LENGTH=0
WRONG_MAP=0
WRONG_REDUCE=0
File Input Format Counters
Bytes Read=57
File Output Format Counters
Bytes Written=48
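The counters can be sanity-checked without Hadoop. The sketch below (local_wordcount is a hypothetical helper name) counts whitespace-separated words in a local copy of myword.txt and prints word, tab, count lines sorted by word, which is the layout the wordcount example writes. For the five-line file above it yields 6 lines totalling 48 bytes, in line with "Reduce output records=6" and "Bytes Written=48".

```shell
# Count whitespace-separated words in a local file; output is "word<TAB>count", sorted by word.
local_wordcount() {
  tr -s ' \t' '\n' < "$1" |
    grep -v '^$' |
    LC_ALL=C sort |
    uniq -c |
    awk '{ printf "%s\t%s\n", $2, $1 }'
}
```

Usage: local_wordcount myword.txt, then compare with hadoop fs -cat /output/wordcount/part-r-00000.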
4. Setting Up Flink on YARN
Flink Session
(omitted)
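Session mode is not covered in this guide. For orientation only, a rough sketch of what it usually involves in Flink 1.12: start a detached session with yarn-session.sh -d, then submit with flink run, which as far as I know locates the session through the YARN properties file the session client writes. The wrapper names (run_or_print, flink_session_demo) and the default FLINK_HOME are assumptions; by default the commands are only printed.

```shell
# Print a command, or execute it when RUN=1.
run_or_print() {
  if [ "${RUN:-0}" = 1 ]; then "$@"; else echo "$*"; fi
}

# Assumption: FLINK_HOME points at the Flink 1.12.2 installation.
flink_session_demo() {
  flink_home="${FLINK_HOME:-/home/ad/flink/flink-1.12.2}"
  # Start a long-running Flink cluster on YARN; -d detaches the client once it is up.
  run_or_print "$flink_home/bin/yarn-session.sh" -d
  # Submit a job to that running session.
  run_or_print "$flink_home/bin/flink" run "$flink_home/examples/batch/WordCount.jar"
}
```

Run flink_session_demo to preview the commands, or RUN=1 flink_session_demo to execute them on a machine where HADOOP_CLASSPATH is set.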
Flink Per-job
$ ./bin/flink run -m yarn-cluster ./examples/batch/WordCount.jar
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/home/ad/flink/flink-1.12.2/lib/log4j-slf4j-impl-2.12.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/ad/hadoop-2.6.0/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Executing WordCount example with default input data set.
Use --input to specify file input.
Printing result to stdout. Use --output to specify output path.
2021-08-14 11:17:35,074 WARN org.apache.flink.yarn.configuration.YarnLogConfigUtil [] - The configuration directory ('/home/ad/flink/flink-1.12.2/conf') already contains a LOG4J config file.If you want to use logback, then please delete or rename the log configuration file.
2021-08-14 11:17:35,122 INFO org.apache.hadoop.yarn.client.RMProxy [] - Connecting to ResourceManager at hadoop01/192.168.73.130:8032
2021-08-14 11:17:35,238 INFO org.apache.flink.yarn.YarnClusterDescriptor [] - No path for the flink jar passed. Using the location of class org.apache.flink.yarn.YarnClusterDescriptor to locate the jar
2021-08-14 11:17:35,339 WARN org.apache.flink.yarn.YarnClusterDescriptor [] - Neither the HADOOP_CONF_DIR nor the YARN_CONF_DIR environment variable is set. The Flink YARN Client needs one of these to be set to properly load the Hadoop configuration for accessing YARN.
2021-08-14 11:17:35,372 INFO org.apache.flink.yarn.YarnClusterDescriptor [] - The configured JobManager memory is 1600 MB. YARN will allocate 2048 MB to make up an integer multiple of its minimum allocation memory (1024 MB, configured via 'yarn.scheduler.minimum-allocation-mb'). The extra 448 MB may not be used by Flink.
2021-08-14 11:17:35,373 INFO org.apache.flink.yarn.YarnClusterDescriptor [] - The configured TaskManager memory is 1728 MB. YARN will allocate 2048 MB to make up an integer multiple of its minimum allocation memory (1024 MB, configured via 'yarn.scheduler.minimum-allocation-mb'). The extra 320 MB may not be used by Flink.
2021-08-14 11:17:35,374 INFO org.apache.flink.yarn.YarnClusterDescriptor [] - Cluster specification: ClusterSpecification{masterMemoryMB=1600, taskManagerMemoryMB=1728, slotsPerTaskManager=4}
2021-08-14 11:17:39,080 INFO org.apache.flink.yarn.YarnClusterDescriptor [] - Submitting application master application_1628910991546_0001
2021-08-14 11:17:39,472 INFO org.apache.hadoop.yarn.client.api.impl.YarnClientImpl [] - Submitted application application_1628910991546_0001
2021-08-14 11:17:39,472 INFO org.apache.flink.yarn.YarnClusterDescriptor [] - Waiting for the cluster to be allocated
2021-08-14 11:17:39,474 INFO org.apache.flink.yarn.YarnClusterDescriptor [] - Deploying cluster, current state ACCEPTED
2021-08-14 11:17:49,830 INFO org.apache.flink.yarn.YarnClusterDescriptor [] - YARN application has been deployed successfully.
2021-08-14 11:17:49,833 INFO org.apache.flink.yarn.YarnClusterDescriptor [] - Found Web Interface ad-virtual-machine:36059 of application 'application_1628910991546_0001'.
Job has been submitted with JobID addaa84fd2ee06164ba7d53a029a6342
Program execution finished
Job with JobID addaa84fd2ee06164ba7d53a029a6342 has finished.
Job Runtime: 12937 ms
Accumulator Results:
- 23a767877a2b6289cf181a8732c5d46a (java.util.ArrayList) [170 elements]
(a,5) (action,1) (after,1) (against,1) (all,2) (and,12) (arms,1) (arrows,1) (awry,1) (ay,1) (bare,1) (be,4) (bear,3) (bodkin,1) (bourn,1) (but,1) (by,2) (calamity,1) (cast,1) (coil,1) (come,1) (conscience,1) (consummation,1) (contumely,1) (country,1) (cowards,1) (currents,1) (d,4) (death,2) (delay,1) (despis,1) (devoutly,1) (die,2) (does,1) (dread,1) (dream,1) (dreams,1) (end,2) (enterprises,1) (er,1) (fair,1) (fardels,1) (flesh,1) (fly,1) (for,2) (fortune,1) (from,1) (give,1) (great,1) (grunt,1) (have,2) (he,1) (heartache,1) (heir,1) (himself,1) (his,1) (hue,1) (ills,1) (in,3) (insolence,1) (is,3) (know,1) (law,1) (life,2) (long,1) (lose,1) (love,1) (make,2) (makes,2) (man,1) (may,1) (merit,1) (might,1) (mind,1) (moment,1) (more,1) (mortal,1) (must,1) (my,1) (name,1) (native,1) (natural,1) (no,2) (nobler,1) (not,2) (now,1) (nymph,1) (o,1) (of,15) (off,1) (office,1) (ophelia,1) (opposing,1) (oppressor,1) (or,2) (orisons,1) (others,1) (outrageous,1) (pale,1) (pangs,1) (patient,1) (pause,1) (perchance,1) (pith,1) (proud,1) (puzzles,1) (question,1) (quietus,1) (rather,1) (regard,1) (remember,1) (resolution,1) (respect,1) (returns,1) (rub,1) (s,5) (say,1) (scorns,1) (sea,1) (shocks,1) (shuffled,1) (sicklied,1) (sins,1) (sleep,5) (slings,1) (so,1) (soft,1) (something,1) (spurns,1) (suffer,1) (sweat,1) (take,1) (takes,1) (than,1) (that,7) (the,22) (their,1) (them,1) (there,2) (these,1) (this,2) (those,1) (thought,1) (thousand,1) (thus,2) (thy,1) (time,1) (tis,2) (to,15) (traveller,1) (troubles,1) (turn,1) (under,1) (undiscover,1) (unworthy,1) (us,3) (we,4) (weary,1) (what,1) (when,2) (whether,1) (whips,1) (who,2) (whose,1) (will,1) (wish,1) (with,3) (would,2) (wrong,1) (you,1)
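The memory lines in the log above follow a simple rule: YARN rounds each container request up to the next multiple of yarn.scheduler.minimum-allocation-mb (1024 MB here). A small sketch of that arithmetic (yarn_alloc_mb is a hypothetical helper name, and the 1024 default is taken from the log, not from any Flink or YARN tool):

```shell
# Round a requested container size (MB) up to the next multiple of the minimum allocation (MB).
yarn_alloc_mb() {
  requested="$1"; min_alloc="${2:-1024}"
  echo $(( (requested + min_alloc - 1) / min_alloc * min_alloc ))
}

yarn_alloc_mb 1600   # JobManager request: 2048, so 448 MB extra
yarn_alloc_mb 1728   # TaskManager request: 2048, so 320 MB extra
```

Choosing Flink memory sizes that are already multiples of the minimum allocation avoids the unused remainder.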