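Traditional big-data collection usually relies on Flume to pick up nginx logs and then passes the data on through Kafka.
With ngx_kafka_module, nginx can send the collected data (user behavior logs) straight to Kafka. Browsing GitHub regularly really does teach you a lot~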
nginx-kafka install script
Note that the dependency packages differ between CentOS and Ubuntu
install-nginx-kafka.sh
#!/bin/bash
# CentOS
#sudo yum update; sudo yum install -y gcc gcc-c++ pcre-devel zlib-devel make git wget curl vim
# Ubuntu
sudo apt-get update; sudo apt-get install -y gcc g++ libpcre3 libpcre3-dev zlib1g-dev libssl-dev make git wget curl vim
cd /tmp
git clone https://github.com/edenhill/librdkafka
git clone https://github.com/brg-liuwei/ngx_kafka_module
wget http://nginx.org/download/nginx-1.15.5.tar.gz
cd /tmp/librdkafka
./configure; make; sudo make install
# The tarball was downloaded into /tmp, so unpack it from there
cd /tmp
tar -zxvf nginx-1.15.5.tar.gz
cd /tmp/nginx-1.15.5
./configure --prefix=/usr/local/nginx_kafka --add-module=/tmp/ngx_kafka_module; make; sudo make install
sudo ln -s /usr/local/nginx_kafka/sbin/nginx /usr/local/bin/nginx-kafka
# "sudo echo ... >> file" runs the redirect as the calling user; sudo tee appends as root
echo "/usr/local/lib" | sudo tee -a /etc/ld.so.conf
sudo ldconfig
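- Update the package sources and install the dependency libraries and tools
- Download the librdkafka, ngx_kafka_module, and nginx sources
- Build and install librdkafka
- Unpack the nginx source, then build and install it with ngx_kafka_module
- For convenience, create an nginx-kafka symlink (so it does not clash with any other nginx)
- If nginx fails to start because librdkafka.so.1 cannot be found: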
error while loading shared libraries: librdkafka.so.1: cannot open shared object file: No such file or directory
- Register the library path and refresh the linker cache (run as root):
echo "/usr/local/lib" >> /etc/ld.so.conf; ldconfig
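The two failure modes above (module not compiled in, librdkafka not resolvable) can be checked after installation. This is a minimal sketch, not part of the original script: the helper names `has_kafka_module`, `missing_rdkafka`, and `check_install` are made up for illustration, and `NGINX_BIN` assumes the `--prefix` used in the install script.

```shell
# Sketch: post-install checks for the nginx-kafka build.
# NGINX_BIN is an assumption derived from --prefix=/usr/local/nginx_kafka.
NGINX_BIN="${NGINX_BIN:-/usr/local/nginx_kafka/sbin/nginx}"

# Reads `nginx -V` output on stdin; succeeds if ngx_kafka_module
# appears in an --add-module configure argument.
has_kafka_module() {
    grep -q -- '--add-module=[^ ]*ngx_kafka_module'
}

# Reads `ldd` output on stdin; prints librdkafka entries the
# dynamic loader could not resolve.
missing_rdkafka() {
    awk '/librdkafka/ && /not found/ { print $1 }'
}

# Hypothetical wrapper that runs both checks against the installed binary.
check_install() {
    [ -x "$NGINX_BIN" ] || { echo "nginx binary not found: $NGINX_BIN"; return 1; }
    # nginx -V writes its configure arguments to stderr.
    if "$NGINX_BIN" -V 2>&1 | has_kafka_module; then
        echo "ngx_kafka_module: compiled in"
    else
        echo "ngx_kafka_module: missing from configure arguments"
    fi
    missing=$(ldd "$NGINX_BIN" | missing_rdkafka)
    if [ -n "$missing" ]; then
        echo "unresolved: $missing (add /usr/local/lib to /etc/ld.so.conf, then run ldconfig)"
    else
        echo "librdkafka: resolved"
    fi
}
```

Calling `check_install` after the install script finishes should show whether the ldconfig step is still needed.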
nginx-kafka.conf
#user nobody;
worker_processes 1;
#error_log logs/error.log;
#error_log logs/error.log notice;
#error_log logs/error.log info;
#pid logs/nginx.pid;
events {
    worker_connections 1024;
}
http {
    include mime.types;
    default_type application/octet-stream;
    #log_format main '$remote_addr - $remote_user [$time_local] "$request" '
    #                '$status $body_bytes_sent "$http_referer" '
    #                '"$http_user_agent" "$http_x_forwarded_for"';
    #access_log logs/access.log main;
    sendfile on;
    #tcp_nopush on;
    #keepalive_timeout 0;
    keepalive_timeout 65;
    #gzip on;
    kafka;
    kafka_broker_list kafka-1:9092 kafka-2:9092 kafka-3:9092;
    server {
        listen 80;
        server_name localhost;
        #charset koi8-r;
        #access_log logs/host.access.log main;
        location = /kafka/log {
            kafka_topic log;
        }
        location = /kafka/user {
            kafka_topic user;
        }
        #error_page 404 /404.html;
        # redirect server error pages to the static page /50x.html
        #
        error_page 500 502 503 504 /50x.html;
        location = /50x.html {
            root html;
        }
    }
}
- kafka_broker_list specifies the Kafka cluster as space-separated ip|host:port entries
- Each location maps a URL to one topic, so URLs can be split by topic
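To illustrate the URL-to-topic mapping, the sketch below posts a message body to one of the locations with curl once nginx-kafka is running. The helper names `topic_url` and `send_message`, the `NGINX_HOST` variable, and the sample JSON payload are assumptions for illustration; only the `/kafka/log` and `/kafka/user` paths come from the config above.

```shell
# Sketch: send one message to a Kafka topic through nginx.
# NGINX_HOST is a placeholder for wherever nginx-kafka is listening.
NGINX_HOST="${NGINX_HOST:-localhost}"

# Builds the URL for a topic, mirroring the "location = /kafka/<topic>" layout.
topic_url() {
    printf 'http://%s/kafka/%s' "$NGINX_HOST" "$1"
}

# POSTs the message as the request body and prints only the HTTP status code.
send_message() {
    curl -s -o /dev/null -w '%{http_code}' --data "$2" "$(topic_url "$1")"
}

# Example (hypothetical payload):
#   send_message log '{"event":"click","uid":42}'
```

Because each location is an exact match (`location = ...`), a request to any other path under /kafka/ is not forwarded to Kafka.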
Start nginx
Start the ZooKeeper cluster and the Kafka cluster, and create the topics (details omitted here).
Test the configuration file
nginx-kafka -c nginx-kafka.conf -t
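Start nginx-kafka
nginx-kafka -c nginx-kafka.conf
If nginx-kafka is already running, reload the configuration instead
nginx-kafka -c nginx-kafka.conf -s reload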
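To confirm that messages actually reach Kafka, one option is to read a topic back with the console consumer that ships with Kafka. This is a sketch under assumptions: Kafka's bin/ directory is on PATH, the broker `kafka-1:9092` from the config is reachable from this machine, and `consume_topic`, `BROKER`, and `DRY_RUN` are illustrative names.

```shell
# Sketch: read messages back from a topic to verify delivery.
# BROKER defaults to the first entry of kafka_broker_list (assumed reachable).
BROKER="${BROKER:-kafka-1:9092}"

# consume_topic <topic> [count]
# Reads <count> messages (default 1) from the beginning of the topic.
# With DRY_RUN set, it only prints the command it would run.
consume_topic() {
    set -- kafka-console-consumer.sh --bootstrap-server "$BROKER" \
        --topic "$1" --from-beginning --max-messages "${2:-1}"
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Example: after posting a message to /kafka/log, run
#   consume_topic log
```

Setting `DRY_RUN=1` first is a cheap way to inspect the exact command before pointing it at a real cluster.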
Enjoy.