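KSQL is a great tool that relatively few people know about. I won't introduce it at length here: if it interests you, just pick it up and use it.
A KSQL cluster has quite a few components. This setup uses zookeeper + kafka + ksql + kafka connect + kafka schema registry.
a. zookeeper and kafka are the basic components of the message queue cluster.
b. ksql is a cluster that provides the stream processing engine.
c. kafka connect is a cluster that manages the input and output interfaces to external systems.
d. kafka schema registry is a cluster that manages the schemas of the data.
The overall process therefore consists of the following steps: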
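- Create three virtual machines and their user
- Package the service dependencies (confluent-oss-5.0.0 & postgres)
- Deploy (confluent-oss-5.0.0 & postgres) to the remote servers
- Start the remote services (zookeeper & kafka & schema-registry & ksql & kafka connect & postgres)
- ETL extract: generate input data with kafka-avro-console-producer (to be replaced later by real-time reading from files)
- ETL transform: apply some simple processing to the data with ksql
- ETL load: write the output data to postgres with kafka connect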
For the related code, see: https://github.com/clojurians-org/my-env/tree/master/run.sh.d/ksql-example
Step 0 - Preparation: create three virtual machines and the user op
[larluo@larluo-nixos:~/my-env]$ cat run.sh.d/ksql-example/createvm.sh
set -e
my=$(cd -P -- "$(dirname -- "${BASH_SOURCE-$0}")" > /dev/null && pwd -P) && cd $my/../..
echo -e "\n==== bash nix.sh create-vm nixos-ksql-001" && bash nix.sh create-vm nixos-ksql-001
echo -e "\n==== bash nix.sh create-vm nixos-ksql-002" && bash nix.sh create-vm nixos-ksql-002
echo -e "\n==== bash nix.sh create-vm nixos-ksql-003" && bash nix.sh create-vm nixos-ksql-003
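Step 1: Package the service dependencies (confluent-oss-5.0.0 & postgres)
KSQL has not yet been integrated into nixpkgs, so we can either integrate it through nix-build or package the dependencies ourselves. Here we take the latter approach, which is easier for readers without a nix programming background; a nix-build integration will be provided later.
The gettext dependency is the package that provides the envsubst command, which the service start scripts call.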
[larluo@larluo-nixos:~/my-env]$ cat run.sh.d/ksql-example/package.sh
set -e
my=$(cd -P -- "$(dirname -- "${BASH_SOURCE-$0}")" > /dev/null && pwd -P) && cd $my/../..
echo -e "\n==== bash nix.sh export tgz.nix-2.0.4" && bash nix.sh export tgz.nix-2.0.4
echo -e "\n==== bash nix.sh export tgz.confluent-oss-5.0.0" && bash nix.sh export tgz.confluent-oss-5.0.0
echo -e "\n==== bash nix.sh export nix.gettext-0.19.8.1" && bash nix.sh export nix.gettext-0.19.8.1
echo -e "\n==== bash nix.sh export nix.openjdk-8u172b11" && bash nix.sh export nix.openjdk-8u172b11
echo -e "\n==== bash nix.sh export nix.postgresql-10.4" && bash nix.sh export nix.postgresql-10.4
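Since gettext is exported only for its envsubst command, here is a minimal sketch of what envsubst does when a start script renders a config file. The template text, the function name render_config, and the variable names BROKER_ID and ZK_CONNECT are my own assumptions for illustration; the real templates are in the my-env repository and may look different.

```shell
# Hypothetical template; the real templates live in the my-env repository.
template='broker.id=${BROKER_ID}
zookeeper.connect=${ZK_CONNECT}'

render_config() {
  # envsubst replaces ${VAR} references with values from its environment.
  # Listing the variables explicitly leaves any other $-expressions untouched.
  printf '%s\n' "$template" |
    BROKER_ID=$1 ZK_CONNECT=$2 envsubst '${BROKER_ID} ${ZK_CONNECT}'
}

if command -v envsubst >/dev/null 2>&1; then
  render_config 1 "192.168.56.101:2181,192.168.56.102:2181,192.168.56.103:2181"
else
  echo "envsubst not found; install gettext first" >&2
fi
```

Passing the variable list to envsubst is optional, but it keeps literal dollar signs elsewhere in a template from being replaced by empty strings.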
Step 2: Deploy (confluent-oss-5.0.0 & postgres) to the remote servers
[larluo@larluo-nixos:~/my-env]$ cat run.sh.d/ksql-example/deploy.sh
set -e
my=$(cd -P -- "$(dirname -- "${BASH_SOURCE-$0}")" > /dev/null && pwd -P) && cd $my/../..
export my_rhome=/home/op/my-env
export my_user=op
echo -e "\n==== bash nix.sh create-user 192.168.56.101" && bash nix.sh create-user 192.168.56.101
echo -e "\n==== bash nix.sh create-user 192.168.56.102" && bash nix.sh create-user 192.168.56.102
echo -e "\n==== bash nix.sh create-user 192.168.56.103" && bash nix.sh create-user 192.168.56.103
echo -e "\n==== bash nix.sh install 192.168.56.101 tgz.nix-2.0.4" && bash nix.sh install 192.168.56.101 tgz.nix-2.0.4
echo -e "\n==== bash nix.sh install 192.168.56.102 tgz.nix-2.0.4" && bash nix.sh install 192.168.56.102 tgz.nix-2.0.4
echo -e "\n==== bash nix.sh install 192.168.56.103 tgz.nix-2.0.4" && bash nix.sh install 192.168.56.103 tgz.nix-2.0.4
echo -e "\n==== bash nix.sh import 192.168.56.101 tgz.confluent-oss-5.0.0" && bash nix.sh import 192.168.56.101 tgz.confluent-oss-5.0.0
echo -e "\n==== bash nix.sh import 192.168.56.102 tgz.confluent-oss-5.0.0" && bash nix.sh import 192.168.56.102 tgz.confluent-oss-5.0.0
echo -e "\n==== bash nix.sh import 192.168.56.103 tgz.confluent-oss-5.0.0" && bash nix.sh import 192.168.56.103 tgz.confluent-oss-5.0.0
echo -e "\n==== bash nix.sh install 192.168.56.101 nix.gettext-0.19.8.1" && bash nix.sh install 192.168.56.101 nix.gettext-0.19.8.1
echo -e "\n==== bash nix.sh install 192.168.56.102 nix.gettext-0.19.8.1" && bash nix.sh install 192.168.56.102 nix.gettext-0.19.8.1
echo -e "\n==== bash nix.sh install 192.168.56.103 nix.gettext-0.19.8.1" && bash nix.sh install 192.168.56.103 nix.gettext-0.19.8.1
echo -e "\n==== bash nix.sh install 192.168.56.101 nix.openjdk-8u172b11" && bash nix.sh install 192.168.56.101 nix.openjdk-8u172b11
echo -e "\n==== bash nix.sh install 192.168.56.102 nix.openjdk-8u172b11" && bash nix.sh install 192.168.56.102 nix.openjdk-8u172b11
echo -e "\n==== bash nix.sh install 192.168.56.103 nix.openjdk-8u172b11" && bash nix.sh install 192.168.56.103 nix.openjdk-8u172b11
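The script above repeats every command once per host. As a sketch only, the same sequence could be produced by loops over a host list. The names HOSTS, RUN, run_step and deploy_all are hypothetical helpers of mine, not part of nix.sh, and by default the sketch just prints the commands instead of running them. The my_rhome and my_user exports from deploy.sh would still be needed before a real run.

```shell
# Dry-run sketch: print the nix.sh commands that deploy.sh runs, one loop per step.
# RUN is a hypothetical switch; set RUN=1 to execute instead of only printing.
HOSTS="192.168.56.101 192.168.56.102 192.168.56.103"

run_step() {
  # Print the command as a banner, then execute it only when RUN=1.
  printf '\n==== bash nix.sh %s\n' "$*"
  if [ "${RUN:-0}" = "1" ]; then
    bash nix.sh "$@"
  fi
}

deploy_all() {
  # Same order as deploy.sh: each step is applied to all hosts before the next step.
  for host in $HOSTS; do run_step create-user "$host"; done
  for host in $HOSTS; do run_step install "$host" tgz.nix-2.0.4; done
  for host in $HOSTS; do run_step import "$host" tgz.confluent-oss-5.0.0; done
  for pkg in nix.gettext-0.19.8.1 nix.openjdk-8u172b11; do
    for host in $HOSTS; do run_step install "$host" "$pkg"; done
  done
}

deploy_all
```

Adding a fourth machine then means changing one line, at the cost of a script that is a little less obvious to read than the flat listing.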
Step 3: Start the remote services (zookeeper & kafka & schema-registry & ksql & kafka connect & postgres)
[larluo@larluo-nixos:~/my-env]$ cat run.sh.d/ksql-example/start.sh
my=$(cd -P -- "$(dirname -- "${BASH_SOURCE-$0}")" > /dev/null && pwd -P) && cd $my/../..
export ZK_ALL="192.168.56.101:2181,192.168.56.102:2181,192.168.56.103:2181"
# start confluent-oss-5.0.0:zookeeper
echo -e "\n==== bash nix.sh start 192.168.56.101:2181 confluent-oss-5.0.0:zookeeper --all ${ZK_ALL}"
bash nix.sh start 192.168.56.101:2181 confluent-oss-5.0.0:zookeeper --all ${ZK_ALL}
echo -e "\n==== bash nix.sh start 192.168.56.102:2181 confluent-oss-5.0.0:zookeeper --all ${ZK_ALL}"
bash nix.sh start 192.168.56.102:2181 confluent-oss-5.0.0:zookeeper --all ${ZK_ALL}
echo -e "\n==== bash nix.sh start 192.168.56.103:2181 confluent-oss-5.0.0:zookeeper --all ${ZK_ALL}"
bash nix.sh start 192.168.56.103:2181 confluent-oss-5.0.0:zookeeper --all ${ZK_ALL}
echo -e "\n==== sleep 10 " && sleep 10
# start confluent-oss-5.0.0:kafka
export KAFKA_ALL="192.168.56.101:9092,192.168.56.102:9092,192.168.56.103:9092"
echo -e "\n==== bash nix.sh start 192.168.56.101:9092 confluent-oss-5.0.0:kafka --zookeepers ${ZK_ALL} --cluster.id monitor"
bash nix.sh start 192.168.56.101:9092 confluent-oss-5.0.0:kafka --zookeepers ${ZK_ALL} --cluster.id monitor
echo -e "\n==== bash nix.sh start 192.168.56.102:9092 confluent-oss-5.0.0:kafka --zookeepers ${ZK_ALL} --cluster.id monitor"
bash nix.sh start 192.168.56.102:9092 confluent-oss-5.0.0:kafka --zookeepers ${ZK_ALL} --cluster.id monitor
echo -e "\n==== bash nix.sh start 192.168.56.103:9092 confluent-oss-5.0.0:kafka --zookeepers ${ZK_ALL} --cluster.id monitor"
bash nix.sh start 192.168.56.103:9092 confluent-oss-5.0.0:kafka --zookeepers ${ZK_ALL} --cluster.id monitor
echo -e "\n==== sleep 10 " && sleep 10
# start confluent-oss-5.0.0:schema-registry
echo -e "\n==== bash nix.sh start 192.168.56.101:8081 confluent-oss-5.0.0:schema-registry --kafkas ${KAFKA_ALL} --cluster.id monitor"
bash nix.sh start 192.168.56.101:8081 confluent-oss-5.0.0:schema-registry --kafkas ${KAFKA_ALL} --cluster.id monitor
echo -e "\n==== bash nix.sh start 192.168.56.102:8081 confluent-oss-5.0.0:schema-registry --kafkas ${KAFKA_ALL} --cluster.id monitor"
bash nix.sh start 192.168.56.102:8081 confluent-oss-5.0.0:schema-registry --kafkas ${KAFKA_ALL} --cluster.id monitor
echo -e "\n==== bash nix.sh start 192.168.56.103:8081 confluent-oss-5.0.0:schema-registry --kafkas ${KAFKA_ALL} --cluster.id monitor"
bash nix.sh start 192.168.56.103:8081 confluent-oss-5.0.0:schema-registry --kafkas ${KAFKA_ALL} --cluster.id monitor
echo -e "\n==== sleep 10 " && sleep 10
# start confluent-oss-5.0.0:ksql
echo -e "\n==== bash nix.sh start 192.168.56.101:8088 confluent-oss-5.0.0:ksql --kafkas ${KAFKA_ALL} --cluster.id monitor"
bash nix.sh start 192.168.56.101:8088 confluent-oss-5.0.0:ksql --kafkas ${KAFKA_ALL} --cluster.id monitor
echo -e "\n==== bash nix.sh start 192.168.56.102:8088 confluent-oss-5.0.0:ksql --kafkas ${KAFKA_ALL} --cluster.id monitor"
bash nix.sh start 192.168.56.102:8088 confluent-oss-5.0.0:ksql --kafkas ${KAFKA_ALL} --cluster.id monitor
echo -e "\n==== bash nix.sh start 192.168.56.103:8088 confluent-oss-5.0.0:ksql --kafkas ${KAFKA_ALL} --cluster.id monitor"
bash nix.sh start 192.168.56.103:8088 confluent-oss-5.0.0:ksql --kafkas ${KAFKA_ALL} --cluster.id monitor
# start confluent-oss-5.0.0:kafka-connect
echo -e "\n==== bash nix.sh start 192.168.56.101:8083 confluent-oss-5.0.0:kafka-connect --kafkas ${KAFKA_ALL} --cluster.id monitor"
bash nix.sh start 192.168.56.101:8083 confluent-oss-5.0.0:kafka-connect --kafkas ${KAFKA_ALL} --cluster.id monitor
echo -e "\n==== bash nix.sh start 192.168.56.102:8083 confluent-oss-5.0.0:kafka-connect --kafkas ${KAFKA_ALL} --cluster.id monitor"
bash nix.sh start 192.168.56.102:8083 confluent-oss-5.0.0:kafka-connect --kafkas ${KAFKA_ALL} --cluster.id monitor
echo -e "\n==== bash nix.sh start 192.168.56.103:8083 confluent-oss-5.0.0:kafka-connect --kafkas ${KAFKA_ALL} --cluster.id monitor"
bash nix.sh start 192.168.56.103:8083 confluent-oss-5.0.0:kafka-connect --kafkas ${KAFKA_ALL} --cluster.id monitor
# start postgresql-10.4
echo -e "\n==== bash nix.sh start 192.168.56.101:5432 postgresql-10.4" && bash nix.sh start 192.168.56.101:5432 postgresql-10.4
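The fixed `sleep 10` between service groups is simple, but it may be too short on a slow machine and needlessly long on a fast one. A possible alternative, sketched here under my own assumptions, is to poll the TCP ports until they accept connections. The functions wait_for_port and wait_for_all are hypothetical helpers, not part of nix.sh; they rely on bash's /dev/tcp pseudo-device and the coreutils `timeout` command.

```shell
# Wait until host:port accepts TCP connections, or give up after max_wait attempts
# (roughly max_wait seconds, since each attempt is followed by a 1-second pause).
wait_for_port() {
  local host=$1 port=$2 max_wait=${3:-30} waited=0
  while [ "$waited" -lt "$max_wait" ]; do
    # bash opens a TCP connection when redirecting to /dev/tcp/HOST/PORT.
    if timeout 1 bash -c "exec 3<>/dev/tcp/$host/$port" 2>/dev/null; then
      return 0
    fi
    sleep 1
    waited=$((waited + 1))
  done
  return 1
}

# Wait for every endpoint in a comma-separated list such as $ZK_ALL or $KAFKA_ALL.
wait_for_all() {
  local endpoints=$1 max_wait=${2:-30} endpoint
  for endpoint in $(printf '%s' "$endpoints" | tr ',' ' '); do
    # ${endpoint%:*} is the host part, ${endpoint#*:} is the port part.
    wait_for_port "${endpoint%:*}" "${endpoint#*:}" "$max_wait" || return 1
  done
}

# Possible use in start.sh, in place of a sleep:
#   wait_for_all "$ZK_ALL" 60 || { echo "zookeeper did not come up" >&2; exit 1; }
```

An open port only shows that the process is listening; it does not prove that, for example, the zookeeper quorum has formed, so a short extra pause may still be sensible.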