Preface
The company wants to evaluate Flink CDC, and the deployment has to match the eventual production setup, so this post records the installation process. I'm a beginner here, so corrections are welcome~
I. Cluster Planning
hadoop01 (Master + Slave): JobManager + TaskManager
hadoop02 (Master + Slave): JobManager + TaskManager
hadoop03 (Slave): TaskManager
II. Deploying the Flink Cluster
1. Choose a version
Download page: https://flink.apache.org/zh/downloads.html
Version: flink-1.14.4-bin-scala_2.12.tgz
2. Upload the archive to hadoop01
Omitted.
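For completeness, a typical upload command (a sketch only, assuming the tarball was downloaded to the local machine and the target directory already exists on hadoop01):
scp flink-1.14.4-bin-scala_2.12.tgz hadoop@hadoop01:/home/hadoop/plat/flink/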
3. Extract the archive
cd /home/hadoop/plat/flink
tar -zxvf flink-1.14.4-bin-scala_2.12.tgz
4. Edit the configuration files
4.1 flink-conf.yaml
vi /home/hadoop/plat/flink/flink-1.14.4/conf/flink-conf.yaml
Basic settings:
# RPC address of the JobManager
jobmanager.rpc.address: hadoop01
# RPC port of the JobManager
jobmanager.rpc.port: 6123
# Total process memory of the JobManager JVM (not just heap)
jobmanager.memory.process.size: 1600m
# Total process memory of each TaskManager JVM (not just heap)
taskmanager.memory.process.size: 1728m
# Number of task slots per TaskManager (3 TaskManagers x 2 slots = 6 slots cluster-wide)
taskmanager.numberOfTaskSlots: 2
# Default parallelism for jobs
parallelism.default: 1
# Allow submitting applications through the web UI
web.submit.enable: true
High-availability settings:
# Use ZooKeeper for high availability
high-availability: zookeeper
# Storage path for JobManager metadata
high-availability.storageDir: hdfs:///flink/ha/
# ZooKeeper quorum addresses
high-availability.zookeeper.quorum: hadoop01:21001,hadoop02:21001,hadoop03:21001
# State backend used for checkpoints
state.backend: filesystem
# Default directory for checkpoint data files and metadata
state.checkpoints.dir: hdfs://hadoop01:9000/flink/flink-checkpoints
# Default directory for savepoint data files and metadata (optional)
state.savepoints.dir: hdfs://hadoop01:9000/flink/flink-savepoints
# Region-based local failover on task failure
jobmanager.execution.failover-strategy: region
HistoryServer settings:
# Directory to which completed jobs are archived
jobmanager.archive.fs.dir: hdfs://hadoop01:9000/flink/completed-jobs/
# Address of the web-based HistoryServer
historyserver.web.address: hadoop01
# Port of the web-based HistoryServer
historyserver.web.port: 8082
# Comma-separated list of directories to monitor for completed jobs
historyserver.archive.fs.dir: hdfs://hadoop01:9000/flink/completed-jobs/
# Refresh interval for the monitored directories, in milliseconds
historyserver.archive.fs.refresh-interval: 10000
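Flink creates most of these HDFS paths on demand, but I prefer to create them up front to catch permission problems early; a minimal sketch, assuming the NameNode at hadoop01:9000 is running and the hadoop client is on the PATH:
hdfs dfs -mkdir -p /flink/ha
hdfs dfs -mkdir -p /flink/flink-checkpoints
hdfs dfs -mkdir -p /flink/flink-savepoints
hdfs dfs -mkdir -p /flink/completed-jobs
Also note the HistoryServer is a separate process: it is started with historyserver.sh start on hadoop01, not by start-cluster.sh.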
4.2 masters
The plan calls for two JobManagers (hadoop01 and hadoop02), so both must be listed here:
vi /home/hadoop/plat/flink/flink-1.14.4/conf/masters
hadoop01:8081
hadoop02:8081
4.3 workers
vi /home/hadoop/plat/flink/flink-1.14.4/conf/workers
hadoop01
hadoop02
hadoop03
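start-cluster.sh reaches the worker hosts over SSH and expects the same installation at the same path on every node, so distribute the configured directory before starting; a sketch, assuming passwordless SSH is set up for the hadoop user:
scp -r /home/hadoop/plat/flink/flink-1.14.4 hadoop02:/home/hadoop/plat/flink/
scp -r /home/hadoop/plat/flink/flink-1.14.4 hadoop03:/home/hadoop/plat/flink/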
5. Start the Flink Cluster
5.1 Configure environment variables
Append the following to the shell profile (e.g. ~/.bashrc) on every node and source it:
### Flink
export FLINK_HOME=/home/hadoop/plat/flink/flink-1.14.4
export PATH=${FLINK_HOME}/bin:${PATH}
5.2 Start the cluster
On hadoop01, run:
start-cluster.sh
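A quick sanity check after startup (process names as printed by jps for Flink 1.14; the web UI answers on the port from the masters file):
jps
# expected on hadoop01 and hadoop02: StandaloneSessionClusterEntrypoint + TaskManagerRunner
# expected on hadoop03: TaskManagerRunner only
curl http://hadoop01:8081/overview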
III. Troubleshooting
1. jps shows no processes after startup; the cluster did not come up
The master log at /home/hadoop/plat/flink/flink-1.14.4/log/flink-hadoop-standalonesession-2-hadoop01.log shows why: Flink 1.14 no longer bundles Hadoop, so the HDFS-backed HA storage and state backend fail for lack of Hadoop classes on the classpath.
Solution: download flink-shaded-hadoop-3-uber-3.1.1.7.2.9.0-173-9.0.jar and place it in ${FLINK_HOME}/lib
Download: https://mvnrepository.com/artifact/org.apache.flink/flink-shaded-hadoop-3-uber?repo=cloudera-repos
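The commands I'd expect for this fix (assuming the jar sits in the current directory; it must land in lib/ on every node, followed by a restart):
cp flink-shaded-hadoop-3-uber-3.1.1.7.2.9.0-173-9.0.jar ${FLINK_HOME}/lib/
scp flink-shaded-hadoop-3-uber-3.1.1.7.2.9.0-173-9.0.jar hadoop02:${FLINK_HOME}/lib/
scp flink-shaded-hadoop-3-uber-3.1.1.7.2.9.0-173-9.0.jar hadoop03:${FLINK_HOME}/lib/
stop-cluster.sh && start-cluster.sh
An alternative that avoids the uber jar is the approach the Flink 1.14 docs recommend for Hadoop integration: export HADOOP_CLASSPATH=`hadoop classpath` in the profile of every node.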
2. Error in the log
Exception in thread "main" java.lang.NoSuchMethodError: org.apache.commons.cli.Option.builder(Ljava/lang/String;)Lorg/apache/commons/cli/Option$Builder;
Option.builder(String) only exists since commons-cli 1.3, so this usually means an older commons-cli (e.g. the 1.2 bundled with Hadoop) is shadowing it on the classpath.
Solution: download commons-cli-1.5.0.jar and place it in ${FLINK_HOME}/lib on every node, same as above
Download: https://mvnrepository.com/artifact/commons-cli/commons-cli
3. Error when submitting to YARN
Running flink run -m yarn-cluster ./examples/batch/WordCount.jar fails with:
Exception in thread "Thread-5" java.lang.IllegalStateException: Trying to access closed classloader. Please check if you store classloaders directly or indirectly in static fields. If the stacktrace suggests that the leak occurs in a third party library and cannot be fixed immediately, you can disable this check with the configuration 'classloader.check-leaked-classloader'.
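As the message itself suggests, the check can be disabled by adding one line to flink-conf.yaml (this silences the leak check, commonly triggered here by the Hadoop libraries, rather than fixing the underlying leak):
classloader.check-leaked-classloader: false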
References:
https://blog.csdn.net/wtl1992/article/details/121307695
https://blog.csdn.net/u013982921/article/details/96428258
https://blog.csdn.net/momentni/article/details/114637659