Elasticsearch 聚合查询
bucket:
概念:一个数据分组。按照某个字段就行bucket划分,那个字段的值相同的哪些数据,就会被划分到一个bucket中。
-
举例:
-
数据
city name 北京 小李 北京 小王 上海 小张 上海 小丽 上海 小陈 基于city划分bucket,划分出来两个bucket,一个是北京bucket,一个是上海bucket;北京bucket包含了2个人,小李和小王,上海bucket包含3个人小张、小李和小陈。
类比mysql:聚合的第一步就分组,对每个组内的数据进行聚合分析,分组。
-
metric
概念:对每个数据分组执行的统计。对一个bucket执行的某种聚合分析操作。比如平均值、最大值、最小值、求和等。
操作:当bucket之后,就可以对每个bucket中的数据进行聚合分词了。比如计算一个bucket内对所有数据的数量,或者计算一个bucket内所有数据的平均值、最大值和最小值等。
-
类比mysql:
select count(*) from access_log group by user_id;
- bucket: group by user_id -> 哪些user_id相同的数据,就会被划分到一个bucket中。
- metric:count(*),对每个user_id bucket中所有的数据,计算一个数量。
聚合语法
-
语法
GET /product/_search { "aggs": { "NAME": { #聚合名称 "AGG_TYPE": {} #聚合类型 } } }
-
聚合样例: 每个tag产品的数量
# size = 0 表示不显示元数据 GET /product/_search { "aggs": { "aggs_tag_group": { "terms": { "field": "tags.keyword" } } } , "size": 0 } # 聚合结果数据 { "took" : 1, "timed_out" : false, "_shards" : { "total" : 1, "successful" : 1, "skipped" : 0, "failed" : 0 }, "hits" : { "total" : { "value" : 5, "relation" : "eq" }, "max_score" : null, "hits" : [ ] }, "aggregations" : { "aggs_tag_group" : { "doc_count_error_upper_bound" : 0, "sum_other_doc_count" : 1, "buckets" : [ { "key" : "fashao", "doc_count" : 3 }, { "key" : "xingjiabi", "doc_count" : 3 }, { "key" : "bufangshui", "doc_count" : 1 }, { "key" : "buka", "doc_count" : 1 }, { "key" : "gongjiaoka", "doc_count" : 1 }, { "key" : "low", "doc_count" : 1 }, { "key" : "lowbee", "doc_count" : 1 }, { "key" : "menjinka", "doc_count" : 1 }, { "key" : "xuhangduan", "doc_count" : 1 }, { "key" : "yinzhicha", "doc_count" : 1 } ] } } }
-
价格大于1999的每个tag产品的数量
GET /product/_search { "query": { "bool": { "filter": [ {"range": { "price": { "gt": 1999 } }} ] } }, "aggs": { "aggs_tag_group": { "terms": { "field": "tags.keyword" } } } , "size": 0 } #聚合结果 { "took" : 0, "timed_out" : false, "_shards" : { "total" : 1, "successful" : 1, "skipped" : 0, "failed" : 0 }, "hits" : { "total" : { "value" : 3, "relation" : "eq" }, "max_score" : null, "hits" : [ ] }, "aggregations" : { "aggs_tag_group" : { "doc_count_error_upper_bound" : 0, "sum_other_doc_count" : 0, "buckets" : [ { "key" : "fashao", "doc_count" : 3 }, { "key" : "xingjiabi", "doc_count" : 3 }, { "key" : "buka", "doc_count" : 1 }, { "key" : "gongjiaoka", "doc_count" : 1 }, { "key" : "menjinka", "doc_count" : 1 } ] } } }
-
平均价格
#每个tag产品的平均价格 GET /product/_search { "aggs": { "tag_agg_avg": { "terms": { "field": "tags.keyword" , "order": { "avg_price": "desc" } } , "aggs": { "avg_price": { "avg": { "field": "price" } } } } } , "size": 0 } #聚合结果 { "took" : 3, "timed_out" : false, "_shards" : { "total" : 1, "successful" : 1, "skipped" : 0, "failed" : 0 }, "hits" : { "total" : { "value" : 5, "relation" : "eq" }, "max_score" : null, "hits" : [ ] }, "aggregations" : { "tag_agg_avg" : { "doc_count_error_upper_bound" : 0, "sum_other_doc_count" : 1, "buckets" : [ { "key" : "gongjiaoka", "doc_count" : 1, "avg_price" : { "value" : 4999.0 } }, { "key" : "buka", "doc_count" : 1, "avg_price" : { "value" : 3999.0 } }, { "key" : "fashao", "doc_count" : 3, "avg_price" : { "value" : 3999.0 } }, { "key" : "xingjiabi", "doc_count" : 3, "avg_price" : { "value" : 3999.0 } }, { "key" : "menjinka", "doc_count" : 1, "avg_price" : { "value" : 2999.0 } }, { "key" : "bufangshui", "doc_count" : 1, "avg_price" : { "value" : 999.0 } }, { "key" : "low", "doc_count" : 1, "avg_price" : { "value" : 999.0 } }, { "key" : "yinzhicha", "doc_count" : 1, "avg_price" : { "value" : 999.0 } }, { "key" : "lowbee", "doc_count" : 1, "avg_price" : { "value" : 399.0 } }, { "key" : "xuhangduan", "doc_count" : 1, "avg_price" : { "value" : 399.0 } } ] } } } ## 价格大于1999的每个tag产品的平均价格 GET /product/_search { "query": { "bool": { "filter": [ {"range": { "price": { "gt": 1999 } }} ] } }, "aggs": { "tag_agg_avg": { "terms": { "field": "tags.keyword" , "order": { "avg_price": "desc" } } , "aggs": { "avg_price": { "avg": { "field": "price" } } } } } , "size": 0 }
-
练习:#按照千元机 1000以下 中端机[2000-3000) 高端机 [3000,∞)
GET /product/_search { "aggs": { "tag_agg_group": { "range": { "field": "price", "ranges": [ { "from": 0, "to": 1000 }, { "from": 1000, "to": 3000 }, { "from": 3000 } ] } } }, "size": 0 } #聚合结果 { "took" : 0, "timed_out" : false, "_shards" : { "total" : 1, "successful" : 1, "skipped" : 0, "failed" : 0 }, "hits" : { "total" : { "value" : 5, "relation" : "eq" }, "max_score" : null, "hits" : [ ] }, "aggregations" : { "tag_agg_group" : { "buckets" : [ { "key" : "0.0-1000.0", "from" : 0.0, "to" : 1000.0, "doc_count" : 2 }, { "key" : "1000.0-3000.0", "from" : 1000.0, "to" : 3000.0, "doc_count" : 1 }, { "key" : "3000.0-*", "from" : 3000.0, "doc_count" : 2 } ] } } }