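In ES 5.4.2, an empty nested document causes an index_out_of_bounds_exception when fetching inner_hits
While tracking this problem down, I also got to learn how nested documents are indexed and how nested queries work.
Reproducing the problem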
mapping
{
  "mappings": {
    "test": {
      "dynamic": false,
      "_source": {"includes": ["data.f2.f3"]},
      "properties": {
        "data": {
          "type": "nested",
          "properties": {
            "f1": {"type": "keyword"},
            "f2": {
              "type": "nested",
              "properties": {
                "f3": {"type": "keyword"}
              }
            }
          }
        }
      }
    }
  }
}
index data
{
  "data": [
    {
      "f1": "tanghuan",
      "f2": [
        {
          "f5": "tanghuan3"
        }
      ]
    },
    {
      "f1": "tanghuan",
      "f2": [
        {
          "f3": "tanghuan4"
        }
      ]
    }
  ]
}
search
/_search?pretty -d '{"query":{"nested":{"path":"data", "query":{"match_all":{}}, "inner_hits":{}}}}'
response
{
  "took" : 23,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 4,
    "failed" : 1,
    "failures" : [
      {
        "shard" : 1,
        "index" : "test",
        "node" : "iq9Wp2asSN6ISlma2u8BmA",
        "reason" : {
          "type" : "index_out_of_bounds_exception",
          "reason" : "Index: 1, Size: 1"
        }
      }
    ]
  },
  "hits" : {
    "total" : 1,
    "max_score" : 1.0,
    "hits" : [ ]
  }
}
Indexing process
IndexResult org.elasticsearch.action.bulk.TransportShardBulkAction.executeIndexRequestOnPrimary(IndexRequest request, IndexShard primary, MappingUpdatedAction mappingUpdatedAction) throws Exception
In this function the index request is parsed into docs that Lucene understands; Lucene's index-document API can then be called to index them.
For the example above, parsing yields 5 docs:
doc1
[indexed,omitNorms,indexOptions=DOCS<_uid:test#225>,
indexed,omitNorms,indexOptions=DOCS<_type:__data.f2>,
indexed,omitNorms,indexOptions=DOCS<data.f2.f3:[74 61 6e 67 68 75 61 6e 34]>, docValuesType=SORTED_SET<data.f2.f3:[74 61 6e 67 68 75 61 6e 34]>,
indexed,omitNorms,indexOptions=DOCS<_field_names:_uid>, indexed,omitNorms,indexOptions=DOCS<_field_names:_type>,
indexed,omitNorms,indexOptions=DOCS<_field_names:data>, indexed,omitNorms,indexOptions=DOCS<_field_names:data.f2>,
indexed,omitNorms,indexOptions=DOCS<_field_names:data.f2.f3>,
indexed,omitNorms,indexOptions=DOCS<_field_names:data>,
indexed,omitNorms,indexOptions=DOCS<_field_names:data.f2>,
indexed,omitNorms,indexOptions=DOCS<_field_names:data.f2.f3>,
docValuesType=NUMERIC<_version:1>]
doc2
[indexed,omitNorms,indexOptions=DOCS<_uid:test#225>,
indexed,omitNorms,indexOptions=DOCS<_type:__data>,
indexed,omitNorms,indexOptions=DOCS<data.f1:[74 61 6e 67 68 75 61 6e]>,
docValuesType=SORTED_SET<data.f1:[74 61 6e 67 68 75 61 6e]>,
indexed,omitNorms,indexOptions=DOCS<_field_names:_uid>,
indexed,omitNorms,indexOptions=DOCS<_field_names:_type>,
indexed,omitNorms,indexOptions=DOCS<_field_names:data>,
indexed,omitNorms,indexOptions=DOCS<_field_names:data.f1>,
indexed,omitNorms,indexOptions=DOCS<_field_names:data>,
indexed,omitNorms,indexOptions=DOCS<_field_names:data.f1>,
docValuesType=NUMERIC<_version:1>]
doc3
[indexed,omitNorms,indexOptions=DOCS<_uid:test#225>,
indexed,omitNorms,indexOptions=DOCS<_type:__data.f2>,
indexed,omitNorms,indexOptions=DOCS<_field_names:_uid>,
indexed,omitNorms,indexOptions=DOCS<_field_names:_type>,
docValuesType=NUMERIC<_version:1>]
doc4
[indexed,omitNorms,indexOptions=DOCS<_uid:test#225>,
indexed,omitNorms,indexOptions=DOCS<_type:__data>,
indexed,omitNorms,indexOptions=DOCS<data.f1:[74 61 6e 67 68 75 61 6e]>,
docValuesType=SORTED_SET<data.f1:[74 61 6e 67 68 75 61 6e]>,
indexed,omitNorms,indexOptions=DOCS<_field_names:_uid>,
indexed,omitNorms,indexOptions=DOCS<_field_names:_type>,
indexed,omitNorms,indexOptions=DOCS<_field_names:data>,
indexed,omitNorms,indexOptions=DOCS<_field_names:data.f1>,
indexed,omitNorms,indexOptions=DOCS<_field_names:data>,
indexed,omitNorms,indexOptions=DOCS<_field_names:data.f1>,
docValuesType=NUMERIC<_version:1>]
doc5
[stored<_source:[7b 22 64 61 74 61 22 3a 5b 7b 22 66 32 22 3a 5b 7b 22 66 33 22 3a 22 74 61 6e 67 68 75 61 6e 34 22 7d 5d 7d 5d 7d]>,
indexed,omitNorms,indexOptions=DOCS<_type:test>,
docValuesType=SORTED_SET<_type:[74 65 73 74]>,
stored,indexed,omitNorms,indexOptions=DOCS<_uid:test#225>,
docValuesType=NUMERIC<_version:1>,
indexed,tokenized<_all:tanghuan>,
indexed,tokenized<_all:tanghuan>,
indexed,tokenized<_all:tanghuan4>,
indexed,omitNorms,indexOptions=DOCS<_field_names:_source>,
indexed,omitNorms,indexOptions=DOCS<_field_names:_type>,
indexed,omitNorms,indexOptions=DOCS<_field_names:_type>,
indexed,omitNorms,indexOptions=DOCS<_field_names:_uid>,
indexed,omitNorms,indexOptions=DOCS<_field_names:_version>,
indexed,omitNorms,indexOptions=DOCS<_field_names:_all>,
indexed,omitNorms,indexOptions=DOCS<_field_names:_all>,
indexed,omitNorms,indexOptions=DOCS<_field_names:_all>]
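To understand which fields each doc contains and what each of them means, you need to know Lucene's field-indexing API. For example, indexed,omitNorms,indexOptions=DOCS<data.f1:[74 61 6e 67 68 75 61 6e]> reads as: indexed (the field goes into the inverted index), omitNorms (norms are not stored), indexOptions=DOCS (other possible values include DOCS_AND_FREQS_AND_POSITIONS; see index_options), data.f1 (field name), [74 61 6e 67 68 75 61 6e] (content in bytes). Another example: docValuesType=SORTED_SET<_type:[74 65 73 74]> reads as docValuesType=SORTED_SET, _type (field name), [74 65 73 74] (content in bytes).
Note the order in which ES produces these docs; Lucene indexes them in exactly this order. Every doc indexes the field _uid:test#225, which means all of them belong to the document with type=test, id=225 that we asked ES to create. Also note the field _type:__data.f2 indexed by doc3, the field _type:__data indexed by doc4, and the field _type:test indexed by doc5. These are the key to nested queries.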
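The block layout described above can be pictured with a small toy model. This is only a sketch: the class and method names are hypothetical, it uses java.util.BitSet rather than Lucene's own bit set, and it assumes that "root" simply means a _type value that does not start with "__".

```java
import java.util.Arrays;
import java.util.BitSet;
import java.util.List;

// Toy model, not ES code: each list entry is the _type value of one Lucene doc,
// in the order the docs were indexed (nested children first, root last).
class BlockJoinLayout {

    // Assumption: nested docs are recognizable by a _type starting with "__".
    static boolean isRootType(String type) {
        return !type.startsWith("__");
    }

    // Builds the "parent bits": one bit per docid, set only for root docs.
    static BitSet parentBits(List<String> typesInIndexOrder) {
        BitSet bits = new BitSet(typesInIndexOrder.size());
        for (int docId = 0; docId < typesInIndexOrder.size(); docId++) {
            if (isRootType(typesInIndexOrder.get(docId))) {
                bits.set(docId);
            }
        }
        return bits;
    }

    public static void main(String[] args) {
        // doc1..doc5 from the example, i.e. docids 0..4
        List<String> types = Arrays.asList("__data.f2", "__data", "__data.f2", "__data", "test");
        System.out.println(parentBits(types)); // prints {4}
    }
}
```

Only docid 4 (doc5) is a parent, so everything before it in the block is treated as one of its nested children.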
Query process
{"query":{"nested":{"inner_hits":{}, "path":"data", "query":{"term":{"data.f1":"tanghuan"}}}}}
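ConjunctionDISI
- ConstantScore(_type:test)
- ToParentBlockJoinQuery (data.f1:tanghuan)
  - childQuery (data.f1:tanghuan)
  - parentFilter (_type: [^_]*)
First, the condition _type:test yields the first document, doc5. The second clause, ToParentBlockJoinQuery, is then advanced with doc5 as its target. Inside ToParentBlockJoinQuery, the postings list of childQuery (data.f1:tanghuan) is pulled, and each docid is checked to see whether it lies between doc5 and the previous docid whose type is test.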
public int advance(int target) throws IOException { // target = 4, i.e. doc5
    if (target >= parentBits.length()) {
        return doc = NO_MORE_DOCS;
    }
    // parentBits.prevSetBit(target - 1) is -1 here: there is no earlier docid whose type is test
    final int firstChildTarget = target == 0 ? 0 : parentBits.prevSetBit(target - 1) + 1;
    int childDoc = childApproximation.docID();
    if (childDoc < firstChildTarget) {
        // firstChildTarget = 0, so the search starts at 0 and lands on the first doc
        // matching data.f1:tanghuan, i.e. childDoc = 1 (doc2); doc4 (docid 3) also matches
        childDoc = childApproximation.advance(firstChildTarget);
    }
    if (childDoc >= parentBits.length() - 1) {
        return doc = NO_MORE_DOCS;
    }
    return doc = parentBits.nextSetBit(childDoc + 1); // 4, i.e. doc5
}
This yields doc5.
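To make the walk-through easier to follow, here is a minimal standalone simulation of the same steps. It is a sketch under assumptions, not the Lucene implementation: the class and helper names are hypothetical, the child postings are a plain sorted int array, the child iterator is assumed to be unpositioned, and java.util.BitSet.previousSetBit stands in for Lucene's prevSetBit.

```java
import java.util.BitSet;

// Toy re-creation of the parent lookup; all names here are hypothetical.
class BlockJoinAdvance {
    static final int NO_MORE_DOCS = Integer.MAX_VALUE;

    // Stand-in for childApproximation.advance: smallest child docid >= target.
    static int advanceChild(int[] sortedChildDocs, int target) {
        for (int docId : sortedChildDocs) {
            if (docId >= target) {
                return docId;
            }
        }
        return NO_MORE_DOCS;
    }

    // maxDoc plays the role of parentBits.length() in the original code.
    static int advanceToParent(BitSet parentBits, int maxDoc, int[] childDocs, int target) {
        if (target >= maxDoc) {
            return NO_MORE_DOCS;
        }
        // First docid after the previous parent; 0 when there is no previous parent.
        int firstChildTarget = target == 0 ? 0 : parentBits.previousSetBit(target - 1) + 1;
        int childDoc = advanceChild(childDocs, firstChildTarget);
        if (childDoc >= maxDoc - 1) {
            return NO_MORE_DOCS;
        }
        // The parent of a matching child is the next set bit after it.
        return parentBits.nextSetBit(childDoc + 1);
    }

    public static void main(String[] args) {
        BitSet parentBits = new BitSet();
        parentBits.set(4);                       // doc5 is the only root doc
        int[] childDocs = {1, 3};                // doc2 and doc4 match data.f1:tanghuan
        System.out.println(advanceToParent(parentBits, 5, childDocs, 4)); // prints 4
    }
}
```

With no matching children at all, the same call returns NO_MORE_DOCS, which is how a root doc without a matching nested doc is filtered out.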
Fetching inner_hits
For the example above, the searchHit is constructed first, i.e. doc4; its source comes from the _source field of doc5.
The fetch steps in InnerHitsFetchSubPhase
void org.elasticsearch.search.fetch.subphase.InnerHitsFetchSubPhase.hitExecute(SearchContext context, HitContext hitContext)
- Look up the docs with _type:__data, which gives doc2 and doc4; the root is doc5
- Get the source of doc5
- Get the nestedObjectMapper {data: f1, f2}
- XContentHelper parses the source into data=[{f2=[{f3=tanghuan4}]}]. Because the mapping sets _source.includes, this source has already lost the first nested document
- Extract the inner_hits for doc2 and doc4 from the source; this is where the exception is thrown
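The failure in the last step can be reproduced in isolation. The sketch below is not the ES code path: the class and helper names are hypothetical, and it only assumes that each nested hit is located by its array offset inside the parsed source. The filtered source holds one entry under data, so offset 0 resolves and offset 1 does not.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy reproduction; all names here are hypothetical.
class InnerHitOffsetLookup {

    // The filtered _source from the post: data=[{f2=[{f3=tanghuan4}]}]
    static Map<String, Object> filteredSource() {
        Map<String, Object> f3 = new HashMap<>();
        f3.put("f3", "tanghuan4");
        List<Object> f2 = new ArrayList<>();
        f2.add(f3);
        Map<String, Object> entry = new HashMap<>();
        entry.put("f2", f2);
        List<Object> data = new ArrayList<>();
        data.add(entry);
        Map<String, Object> source = new HashMap<>();
        source.put("data", data);
        return source;
    }

    // Picks the nested object at `offset` under `path`; throws if the array is too short.
    @SuppressWarnings("unchecked")
    static Map<String, Object> nestedSourceAt(Map<String, Object> source, String path, int offset) {
        List<Object> entries = (List<Object>) source.get(path);
        return (Map<String, Object>) entries.get(offset);
    }

    static boolean throwsOutOfBounds(int offset) {
        try {
            nestedSourceAt(filteredSource(), "data", offset);
            return false;
        } catch (IndexOutOfBoundsException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(throwsOutOfBounds(0)); // prints false
        System.out.println(throwsOutOfBounds(1)); // prints true
    }
}
```

Two nested docs were indexed under data, but only one entry survives source filtering, which matches the "Index: 1, Size: 1" reason in the response.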
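Fix
The root cause is that once _source.includes is set, the stored source keeps only the second nested source. One option is to patch the ES source code so that empty nested sources are preserved.
The issue has been fixed in ES 5.6.2: https://github.com/elastic/elasticsearch/issues/25315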
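The idea of preserving empty nested sources can be illustrated with a small sketch. It is not the actual ES patch: the class and method names are hypothetical, and the filter is flattened to a single level of keys for brevity. The point is only that an entry left empty by filtering keeps its slot, so array offsets still line up with the indexed nested docs.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Toy include-filter; all names here are hypothetical.
class KeepEmptyNestedSource {

    // Keeps only allowedKeys in each entry. With keepEmpty=false, entries that end up
    // empty are dropped (shifting later offsets); with keepEmpty=true they stay as {}.
    static List<Map<String, Object>> project(List<Map<String, Object>> entries,
                                             Set<String> allowedKeys, boolean keepEmpty) {
        List<Map<String, Object>> out = new ArrayList<>();
        for (Map<String, Object> entry : entries) {
            Map<String, Object> kept = new LinkedHashMap<>();
            for (Map.Entry<String, Object> field : entry.entrySet()) {
                if (allowedKeys.contains(field.getKey())) {
                    kept.put(field.getKey(), field.getValue());
                }
            }
            if (!kept.isEmpty() || keepEmpty) {
                out.add(kept);
            }
        }
        return out;
    }

    // The two f2 objects from the example: only the second has the included field f3.
    static List<Map<String, Object>> sample() {
        Map<String, Object> first = new LinkedHashMap<>();
        first.put("f5", "tanghuan3");
        Map<String, Object> second = new LinkedHashMap<>();
        second.put("f3", "tanghuan4");
        return Arrays.asList(first, second);
    }

    public static void main(String[] args) {
        Set<String> includes = Collections.singleton("f3");
        System.out.println(project(sample(), includes, false).size()); // prints 1
        System.out.println(project(sample(), includes, true).size());  // prints 2
    }
}
```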