Why InfluxDB
1. Easy to install and configure (written in Go)
2. Native HTTP API
3. SQL-like query language
Installation
wget https://dl.influxdata.com/influxdb/releases/influxdb-0.13.0.x86_64.rpm
sudo yum localinstall influxdb-0.13.0.x86_64.rpm
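Querying, inserting, and deleting data
Command line
select mean(value) from cpu where time > now() - 1h group by time(10m)
insert cpu,host=web1 value=0.9 [timestamp] (host is a tag and value is a field; the difference is explained below)
delete from cpu where time < now() - 1h (this hurts performance and is generally discouraged; let a retention policy (RP) expire old data automatically instead. RPs are covered below)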
HTTP APIs
/ping
$ curl -sl -I localhost:8086/ping
HTTP/1.1 204 No Content
Request-Id: 7d641f0b-e23b-11e5-8005-000000000000
X-Influxdb-Version: 1.0.x
Date: Fri, 04 Mar 2016 19:01:23 GMT
/query
$ curl -G 'http://localhost:8086/query?db=mydb&pretty=true' --data-urlencode 'q=SELECT * FROM "mymeas"'
{
"results": [
{
"series": [
{
"name": "mymeas",
"columns": [
"time",
"myfield",
"mytag1",
"mytag2"
],
"values": [
[
"2016-05-20T21:30:00Z",
12,
"1",
null
],
[
"2016-05-20T21:30:20Z",
11,
"2",
null
],
[
"2016-05-20T21:30:40Z",
18,
null,
"1"
],
[
"2016-05-20T21:31:00Z",
19,
null,
"3"
]
]
}
]
}
]
}
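The response above nests each series as a list of column names plus rows of values, which is awkward to consume directly. A minimal sketch of flattening it into one dict per point, assuming the JSON shape shown above (the helper name `rows_from_response` and the shortened `sample` body are illustrative, not part of InfluxDB):

```python
import json


def rows_from_response(body):
    """Flatten a /query JSON body into a list of dicts, one per point."""
    rows = []
    for result in json.loads(body).get("results", []):
        # A result with no "series" key means the statement matched no data.
        for series in result.get("series", []):
            columns = series["columns"]
            for values in series.get("values", []):
                # Pair each column name with the value at the same position.
                rows.append(dict(zip(columns, values)))
    return rows


# Shortened copy of the response shown above (two of the four points).
sample = """{"results": [{"series": [{"name": "mymeas",
  "columns": ["time", "myfield", "mytag1", "mytag2"],
  "values": [["2016-05-20T21:30:00Z", 12, "1", null],
             ["2016-05-20T21:30:40Z", 18, null, "1"]]}]}]}"""

for row in rows_from_response(sample):
    print(row["time"], row["myfield"], row["mytag1"], row["mytag2"])
```

Tags that a point does not carry come back as JSON null, so they appear as `None` in the resulting dicts.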
/write
$ curl -i -XPOST "http://localhost:8086/write?db=mydb&rp=myrp" --data-binary 'mymeas,mytag=1 myfield=90'
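The same write can be issued from a script. The sketch below only builds the request using the Python standard library, so it runs without a server; `build_write_request` is a hypothetical helper, and the second point is invented for illustration. It relies on the /write endpoint accepting several points in one body, one per line.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_write_request(base_url, db, lines, rp=None):
    """Build a POST to /write carrying one line-protocol point per line."""
    params = {"db": db}
    if rp is not None:
        params["rp"] = rp  # optional: target retention policy
    url = base_url.rstrip("/") + "/write?" + urlencode(params)
    body = "\n".join(lines).encode("utf-8")
    return Request(url, data=body, method="POST")


req = build_write_request(
    "http://localhost:8086",
    "mydb",
    ["mymeas,mytag=1 myfield=90", "mymeas,mytag=2 myfield=91"],
    rp="myrp",
)
print(req.get_method(), req.full_url)
# To send it against a running InfluxDB:
#   import urllib.request
#   urllib.request.urlopen(req)   # a successful write answers 204 No Content
```

Batching several points into one request is usually cheaper than one request per point.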
Terminology and concepts
InfluxDB | MySQL |
---|---|
database | database |
measurement | table |
point | row |
tag | index |
field | column |
retention policy | -- |
series | -- |
Differences between fields and tags
1. Fields are not indexed; a field value can be a string, float, integer, or boolean
2. Tags are indexed; a tag value can only be a string
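The type difference shows up when a point is serialized to line protocol: tag values are written bare and are always strings, while field values carry their type (a trailing `i` for integers, double quotes for strings). A minimal sketch, assuming keys and tag values contain no commas, spaces, or equals signs (escaping those is omitted); `format_field_value` and `to_line` are hypothetical helpers:

```python
def format_field_value(value):
    """Render a Python value as a line-protocol field value."""
    # bool must be tested before int because bool is a subclass of int.
    if isinstance(value, bool):
        return "true" if value else "false"
    if isinstance(value, int):
        return f"{value}i"  # integers are marked with a trailing "i"
    if isinstance(value, float):
        return repr(value)
    if isinstance(value, str):
        return '"' + value.replace('"', '\\"') + '"'  # strings are quoted
    raise TypeError(f"unsupported field type: {type(value).__name__}")


def to_line(measurement, tags, fields, timestamp=None):
    """Build one line: measurement,tag_set field_set [timestamp]."""
    if not fields:
        raise ValueError("a point needs at least one field")
    # Tags are written sorted by key; their values are always strings.
    tag_part = "".join(f",{k}={v}" for k, v in sorted(tags.items()))
    field_part = ",".join(f"{k}={format_field_value(v)}" for k, v in fields.items())
    line = f"{measurement}{tag_part} {field_part}"
    if timestamp is not None:
        line += f" {timestamp}"
    return line


print(to_line("cpu", {"host": "web1"}, {"value": 0.9}))
print(to_line("cpu", {"region": "us", "host": "web1"},
              {"value": 0.9, "cores": 4, "ok": True, "note": "hi"},
              1463779800000000000))
```

The first call reproduces the `insert` example from the command-line section without a timestamp.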
Retention Policy and Continuous Query
Retention Policy(RP): The part of InfluxDB’s data structure that describes for how long InfluxDB keeps data (duration), how many copies of those data are stored in the cluster (replication factor), and the time range covered by shard groups (shard group duration). RPs are unique per database and along with the measurement and tag set define a series.
Put simply, an RP is a component that periodically deletes old data, much like the equivalent features in RRD and Graphite.
Continuous Query: An InfluxQL query that runs automatically and periodically within a database.
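CQs are one of InfluxDB's distinctive features. In some scenarios the raw data has second-level resolution but queries only need minute-level aggregates. Reading the second-level data and aggregating it to minutes on every query is redundant and inefficient. A CQ runs an InfluxQL query on a schedule, precomputes the aggregates, and stores them elsewhere, so queries only need to read the precomputed measurement. This improves query response time considerably, and storage use as well once the raw data is expired.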
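As a rough sketch of how the two fit together, the helpers below compose the InfluxQL statements for an RP that expires raw data and a CQ that keeps minute-level means. The statement syntax follows InfluxQL as commonly documented for the 0.x/1.x line and should be checked against the installed version; the helper names and the names `two_weeks`, `cpu_1m_mean`, and `cpu_1m` are made up for illustration.

```python
def create_rp_statement(name, db, duration, replication=1, default=False):
    """Compose a CREATE RETENTION POLICY statement."""
    stmt = (f'CREATE RETENTION POLICY "{name}" ON "{db}" '
            f'DURATION {duration} REPLICATION {replication}')
    if default:
        stmt += " DEFAULT"  # make this RP the target of unqualified writes
    return stmt


def create_cq_statement(name, db, source, target, field, interval):
    """Compose a CQ that stores the mean of one field per time interval."""
    return (f'CREATE CONTINUOUS QUERY "{name}" ON "{db}" BEGIN '
            f'SELECT mean("{field}") INTO "{target}" FROM "{source}" '
            f'GROUP BY time({interval}) END')


# Keep raw second-level data for two weeks only.
print(create_rp_statement("two_weeks", "mydb", "2w", default=True))
# Precompute one-minute means of cpu.value into the measurement cpu_1m.
print(create_cq_statement("cpu_1m_mean", "mydb", "cpu", "cpu_1m", "value", "1m"))
```

Either statement can be run in the influx shell or sent as the `q` parameter of the /query endpoint. In practice the aggregates are often written into a second, longer-lived RP so that they outlive the raw data.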
series
The collection of data in InfluxDB’s data structure that share a measurement, tag set, and retention policy.
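The number of series in a database is (measurements) x (tag sets) x (retention policies).
For example, suppose a database has one measurement, cpu_load, two RPs (one_minute and five_minute), and two tag keys, host and server. host takes the values hostA and hostB, and server takes the values server1 and server2. If data is written for every combination, the database has 1 x 2 x 2 x 2 = 8 series.
The series count has a large impact on InfluxDB performance: the more series there are, the more CPU and memory InfluxDB needs. When storing data, consider carefully which values can be stored as fields and which really have to be tags.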
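The arithmetic in the cpu_load example can be checked by enumerating every combination of measurement, retention policy, and tag values. This is only a sketch of the counting rule, not a query against InfluxDB, and `series_keys` is a hypothetical helper:

```python
from itertools import product

measurements = ["cpu_load"]
retention_policies = ["one_minute", "five_minute"]
tags = {"host": ["hostA", "hostB"], "server": ["server1", "server2"]}


def series_keys(measurements, retention_policies, tags):
    """List every (measurement, retention policy, tag set) combination."""
    tag_keys = sorted(tags)
    # Each tag set picks one value per tag key.
    tag_sets = [tuple(zip(tag_keys, combo))
                for combo in product(*(tags[k] for k in tag_keys))]
    return list(product(measurements, retention_policies, tag_sets))


keys = series_keys(measurements, retention_policies, tags)
print(len(keys))  # 8
print(keys[0])
```

This gives the upper bound: a series only exists once a point with that exact combination has been written. Adding a third tag with 100 distinct values would multiply the bound to 800, which is why high-cardinality values are better stored as fields.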
TICK Stack
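InfluxData also offers other components that use InfluxDB as their storage backend. Telegraf is a collector similar to Flume or Heka, and Chronograf is a visualization front end similar to Grafana. Kapacitor is a special case: it is both an alerting component and an ETL component. It can take over the work of CQs, which reduces the load that running a large number of CQs places on InfluxDB itself.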