Querying MongoDB with Apache Drill (Part 1): Single-Node Deployment
1. What is Drill
Drill is an Apache open-source SQL query engine for Big Data exploration. Drill is designed from the ground up to support high-performance analysis on the semi-structured and rapidly evolving data coming from modern Big Data applications, while still providing the familiarity and ecosystem of ANSI SQL, the industry-standard query language. Drill provides plug-and-play integration with existing Apache Hive and Apache HBase deployments.
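In plain terms: Drill is Apache's open-source SQL query engine, built for big-data scenarios. It was designed from the start for high-performance queries over semi-structured data, offers a query language that follows standard SQL, and integrates seamlessly with Apache Hive and Apache HBase.
For MongoDB, Drill supports only the R (read) of CRUD; create, update, and delete are not supported.
2. Single-node Drill deployment
Test server: CentOS 6.5 x86_64
Prerequisite: Oracle JDK
- Install Oracle JDK
Download the latest JDK available at the time of writing: jdk-8u101-linux-x64.rpm
Install it: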
sudo rpm -ivh jdk-8u101-linux-x64.rpm
List the installed JDKs:
sudo update-alternatives --config java
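To use OpenJDK instead, enter the corresponding number at the prompt.
- Install and start Drill
Download the latest Drill release available at the time of writing: apache-drill-1.8.0.tar.gz
Then run the following commands: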
tar -zxvf apache-drill-1.8.0.tar.gz
cd apache-drill-1.8.0
bin/drill-embedded
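If everything is working, the Drill prompt appears.
However, startup may also fail with an error:
UnknownHostException: edit /etc/hosts so that the machine's hostname maps to its IP address.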
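A quick way to confirm the fix is to check that the hostname is actually mapped and resolvable. The sketch below uses only the Python standard library. The helper names `hosts_entry_for` and `resolves` are made up for illustration, and the resolver check is only an approximation of the lookup Drill's JVM performs at startup.

```python
import socket


def hosts_entry_for(hosts_text, hostname):
    """Return the IP mapped to hostname in hosts-file text, or None."""
    for line in hosts_text.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        ip, *names = line.split()
        if hostname in names:
            return ip
    return None


def resolves(hostname):
    """Return True if the system resolver can turn hostname into an address."""
    try:
        socket.gethostbyname(hostname)
        return True
    except OSError:
        return False
```

For example, `resolves(socket.gethostname())` should return True once /etc/hosts contains the mapping, and `hosts_entry_for(open("/etc/hosts").read(), socket.gethostname())` shows which IP the hostname is mapped to.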
3. Test data and Drill mongo plugin configuration
- Prepare two test databases
Insert test data into the test database:
db.categories.insert({ _id: "MongoDB", parent: "Databases" })
db.categories.insert({ _id: "dbm", parent: "Databases" })
db.categories.insert({ _id: "Databases", parent: "Programming" })
db.categories.insert({ _id: "Languages", parent: "Programming" })
db.categories.insert({ _id: "Programming", parent: "Books" })
db.categories.insert({ _id: "Books", parent: null })
Insert test data into the car database:
db.car.insert({deviceId:'111111',fuelConsum:5,mileage:1000,createTime:"2011-10-29 0:0:1"})
db.car.insert({deviceId:'222222',fuelConsum:6,mileage:2000,createTime:"2012-10-29 0:0:1"})
db.car.insert({deviceId:'333333',fuelConsum:7,mileage:3000,createTime:"2013-10-29 0:0:1"})
db.car.insert({deviceId:'444444',fuelConsum:8,mileage:4000,createTime:"2014-10-29 0:0:1"})
db.car.insert({deviceId:'555555',fuelConsum:9,mileage:5000,createTime:"2015-10-29 0:0:1"})
db.car.insert({deviceId:'666666',fuelConsum:10,mileage:6000,createTime:"2016-10-29 0:0:1"})
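- Configure Drill's Mongo storage plugin
Open the Drill Web Console at http://172.19.3.131:8047/ (change the IP address to match your environment).
Under the Storage tab, find mongo and configure it as follows: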
{
"type": "mongo",
"connection": "mongodb://172.19.3.132:27017,172.19.3.133:27017,172.19.3.134:27017/",
"enabled": true
}
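The same configuration can in principle be applied without the Web Console, through Drill's REST API, which accepts a POST to /storage/<name>.json. The sketch below only builds the request and does not send it. The function name and the host list passed in are illustrative assumptions, and the endpoint should be checked against the documentation for your Drill version.

```python
import json
import urllib.request


def mongo_plugin_request(drill_url, mongo_hosts, name="mongo"):
    """Build (but do not send) a POST that registers a mongo storage plugin."""
    config = {
        "type": "mongo",
        # Comma-separated host:port list, same shape as the console config.
        "connection": "mongodb://" + ",".join(mongo_hosts) + "/",
        "enabled": True,
    }
    body = json.dumps({"name": name, "config": config}).encode("utf-8")
    return urllib.request.Request(
        url=f"{drill_url.rstrip('/')}/storage/{name}.json",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Sending it is then a matter of passing the result to `urllib.request.urlopen(...)` against a running Drill instance.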
4. hosts configuration
sudo vim /etc/hosts
Add the following entries:
172.19.3.131 drill
172.19.3.132 mongodb_node01
172.19.3.133 mongodb_node02
172.19.3.134 mongodb_node03
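5. Query tests
Drill offers several clients: the shell, the web console, and JDBC/ODBC. All of them can run SQL queries. In my tests the shell and the web console worked well, but JDBC did not; Drill may improve this in later releases.
- List the databases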
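In the shell or the web console this is the single statement SHOW DATABASES. As a scripted alternative, the sketch below submits the same statement to Drill's REST endpoint /query.json. The helper names are hypothetical, and the response is assumed to carry its result under a "rows" key.

```python
import json
import urllib.request

SHOW_DATABASES = "SHOW DATABASES"


def query_request(drill_url, sql):
    """Build a POST for Drill's /query.json endpoint."""
    payload = {"queryType": "SQL", "query": sql}
    return urllib.request.Request(
        url=drill_url.rstrip("/") + "/query.json",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_query(drill_url, sql):
    """Send the query and return the result rows (needs a running Drill)."""
    with urllib.request.urlopen(query_request(drill_url, sql)) as resp:
        return json.load(resp)["rows"]
```

With the mongo plugin enabled, `run_query("http://172.19.3.131:8047", SHOW_DATABASES)` should list schemas such as mongo.test and mongo.car alongside Drill's built-in ones.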
- Query the mongo.test database
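In the shell this would typically be `USE mongo.test;` followed by `SHOW TABLES;` and a SELECT on the collection. The sketch below assembles a fully qualified SELECT instead, so the statement does not depend on a prior USE. The helper name `select_all` is hypothetical, and the backticks are Drill's identifier quoting.

```python
def select_all(database, collection, plugin="mongo"):
    """Return a SELECT over plugin.database.collection with quoted identifiers."""
    path = ".".join(f"`{part}`" for part in (plugin, database, collection))
    return f"SELECT * FROM {path}"


# Query for the categories collection inserted into the test database.
CATEGORIES_QUERY = select_all("test", "categories")
```

The resulting string can be pasted into the shell or the web console as-is.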
- Query the car collection in the mongo.car database for data from 2012 to 2015
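Because the sample createTime values are strings that begin with a four-digit year, a plain string comparison is enough to select a range of years. A minimal sketch, assuming the six rows inserted earlier: the half-open bounds '2012' and '2016' are my own choice, and the Python check only mirrors the WHERE clause locally; it is not output from Drill.

```python
CAR_RANGE_QUERY = (
    "SELECT deviceId, fuelConsum, mileage, createTime "
    "FROM mongo.car.car "
    "WHERE createTime >= '2012' AND createTime < '2016'"
)

# (deviceId, createTime) pairs copied from the sample inserts.
SAMPLE = [
    ("111111", "2011-10-29 0:0:1"),
    ("222222", "2012-10-29 0:0:1"),
    ("333333", "2013-10-29 0:0:1"),
    ("444444", "2014-10-29 0:0:1"),
    ("555555", "2015-10-29 0:0:1"),
    ("666666", "2016-10-29 0:0:1"),
]


def in_range(create_time, start="2012", end="2016"):
    """Mirror the WHERE clause using lexicographic string comparison."""
    return start <= create_time < end


# Device IDs the predicate keeps from the sample data.
selected = [device for device, created in SAMPLE if in_range(created)]
```

This only works while every createTime starts with the year; if the field were stored as a real date, a date comparison would be the safer choice.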
- Query the mongo.car database for the fuel consumption and mileage of Ford vehicles