Introduction
A grouped count built with AggregationBuilders.terms("name").field("field"):
@Test
public void getData() {
    // Group by action and count the documents in each group
    AggregationBuilder actionAggregation = AggregationBuilders.terms("actionCount").field("action");
    // cardinality is an (approximate) distinct count
    CardinalityAggregationBuilder uniqueUser = AggregationBuilders.cardinality("uniqueUser").field("userid");
    actionAggregation.subAggregation(uniqueUser);
    // Build the search request
    SearchRequest searchRequest = new SearchRequest("test-index");
    searchRequest.source(SearchSourceBuilder.searchSource().aggregation(actionAggregation));
    try {
        // Send the search request with RestHighLevelClient
        SearchResponse searchResponse = restHighLevelClient.search(searchRequest, RequestOptions.DEFAULT);
        long totalHits = searchResponse.getHits().totalHits;
        log.info("total hits from es: {}", totalHits);
        Terms actionTerms = searchResponse.getAggregations().get("actionCount");
        for (Terms.Bucket bucket : actionTerms.getBuckets()) {
            String action = bucket.getKeyAsString();
            log.info("bucket key: {}", action);
            long docCount = bucket.getDocCount();
            log.info("bucket doc count: {}", docCount);
            Cardinality uniqueUser1 = bucket.getAggregations().get("uniqueUser");
            long value = uniqueUser1.getValue();
            log.info("distinct doc count: {}", value);
        }
    } catch (IOException e) {
        // Pass the exception as the last argument so the logger records the stack trace
        log.error("failed to aggregate data from es: {}", e.getMessage(), e);
    }
}
Running it produces the following error:
ElasticsearchStatusException[Elasticsearch exception [type=search_phase_execution_exception, reason=all shards failed]
]; nested: ElasticsearchException[Elasticsearch exception [type=illegal_argument_exception, reason=Fielddata is disabled on text fields by default. Set fielddata=true on [deviceid] in order to load fielddata in memory by uninverting the inverted index. Note that this can however use significant memory. Alternatively use a keyword field instead.]]; nested: ElasticsearchException[Elasticsearch exception [type=illegal_argument_exception, reason=Fielddata is disabled on text fields by default. Set fielddata=true on [userid] in order to load fielddata in memory by uninverting the inverted index. Note that this can however use significant memory. Alternatively use a keyword field instead.]];
at org.elasticsearch.rest.BytesRestResponse.errorFromXContent(BytesRestResponse.java:177)
at org.elasticsearch.client.RestHighLevelClient.parseEntity(RestHighLevelClient.java:2053)
at org.elasticsearch.client.RestHighLevelClient.parseResponseException(RestHighLevelClient.java:2030)
at org.elasticsearch.client.RestHighLevelClient.internalPerformRequest(RestHighLevelClient.java:1777)
at org.elasticsearch.client.RestHighLevelClient.performRequest(RestHighLevelClient.java:1734)
at org.elasticsearch.client.RestHighLevelClient.performRequestAndParseEntity(RestHighLevelClient.java:1696)
at org.elasticsearch.client.RestHighLevelClient.search(RestHighLevelClient.java:1092)
Caused by: ElasticsearchException[Elasticsearch exception [type=illegal_argument_exception, reason=Fielddata is disabled on text fields by default. Set fielddata=true on [deviceid] in order to load fielddata in memory by uninverting the inverted index. Note that this can however use significant memory. Alternatively use a keyword field instead.]]; nested: ElasticsearchException[Elasticsearch exception [type=illegal_argument_exception, reason=Fielddata is disabled on text fields by default. Set fielddata=true on [deviceid] in order to load fielddata in memory by uninverting the inverted index. Note that this can however use significant memory. Alternatively use a keyword field instead.]];
at org.elasticsearch.ElasticsearchException.innerFromXContent(ElasticsearchException.java:509)
at org.elasticsearch.ElasticsearchException.fromXContent(ElasticsearchException.java:420)
at org.elasticsearch.ElasticsearchException.innerFromXContent(ElasticsearchException.java:450)
at org.elasticsearch.ElasticsearchException.failureFromXContent(ElasticsearchException.java:616)
at org.elasticsearch.rest.BytesRestResponse.errorFromXContent(BytesRestResponse.java:169)
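The message Caused by: ElasticsearchException[Elasticsearch exception [type=illegal_argument_exception, reason=Fielddata is disabled on text fields by default. Set fielddata=true on [userid] in order to load fielddata in memory by uninverting the inverted index. Note that this can however use significant memory. Alternatively use a keyword field instead.]];
tells us that fielddata is disabled on text fields by default, so a text field cannot be used for a grouped count out of the box. You can set fielddata=true on the field, which loads fielddata into memory by uninverting the inverted index, but this can use a lot of memory; a keyword field should be used instead.
In short, aggregating or sorting on a text field in Elasticsearch produces the error above unless fielddata has been enabled on that field.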
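When several fields are involved, it helps to pull the offending field names out of the exception instead of reading the stack trace by eye. The following is a minimal sketch, assuming the message wording shown above; the class name FielddataErrors and the method offendingFields are hypothetical helpers, not part of the Elasticsearch client. Where exactly the client puts the nested message (the message itself, the cause chain, or suppressed exceptions) may vary by version, so the sketch checks all three.

```java
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: finds the fields named in "Fielddata is disabled" errors. */
final class FielddataErrors {
    // Matches the wording of the error shown above and captures the field name.
    private static final Pattern FIELD =
            Pattern.compile("Set fielddata=true on \\[([^\\]]+)\\]");

    // Guard against pathological or cyclic exception chains.
    private static final int MAX_DEPTH = 16;

    private FielddataErrors() {}

    /** Returns every field name mentioned in such an error, in order of first appearance. */
    static Set<String> offendingFields(Throwable error) {
        Set<String> fields = new LinkedHashSet<>();
        collect(error, fields, 0);
        return fields;
    }

    private static void collect(Throwable t, Set<String> out, int depth) {
        if (t == null || depth > MAX_DEPTH) {
            return;
        }
        String message = t.getMessage();
        if (message != null) {
            Matcher m = FIELD.matcher(message);
            while (m.find()) {
                out.add(m.group(1));
            }
        }
        // Look at both the cause and any suppressed exceptions.
        collect(t.getCause(), out, depth + 1);
        for (Throwable suppressed : t.getSuppressed()) {
            collect(suppressed, out, depth + 1);
        }
    }
}
```

Calling FielddataErrors.offendingFields(e) in a catch block and logging the result gives a short list of the fields that need a keyword variant.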
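Solution 1: append ".keyword" to the field being aggregated
Aggregate and sort on fieldname.keyword instead. This works when the field has a keyword sub-field, which Elasticsearch's default dynamic mapping adds to string fields. The code becomes: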
// Distinct count on the keyword sub-field
CardinalityAggregationBuilder uniqueUser = AggregationBuilders.cardinality("uniqueUser").field("userid.keyword");
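If many aggregations need the same treatment, the suffix can be applied in one place. This is only a sketch: the class AggFields and its keyword method are hypothetical names, and it assumes every field passed in really has a sub-field called keyword in the index mapping, which the helper cannot check.

```java
/** Hypothetical helper that maps a text field name to its keyword sub-field. */
final class AggFields {
    // Assumption: the sub-field uses the default name from dynamic mapping.
    static final String KEYWORD_SUFFIX = ".keyword";

    private AggFields() {}

    /** Returns the field name with ".keyword" appended, unless it already ends with it. */
    static String keyword(String field) {
        if (field == null || field.isEmpty()) {
            throw new IllegalArgumentException("field name must not be empty");
        }
        return field.endsWith(KEYWORD_SUFFIX) ? field : field + KEYWORD_SUFFIX;
    }
}
```

With this in place, the aggregation above could be written as AggregationBuilders.cardinality("uniqueUser").field(AggFields.keyword("userid")), and the terms aggregation on action can use the same helper if action is also mapped as text.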
Solution 2: set "fielddata=true" on the field (not recommended, since it can use a lot of memory)
Set it with the following request:
PUT /test-index/_mapping/doc
{
  "properties": {
    "userid": {
      "type": "text",
      "fielddata": true
    }
  }
}
A successful update returns:
{
  "acknowledged" : true
}