Elasticsearch Plugin Development: A Custom payload_score Query

When a term's weight needs to be stored in the index, it has to be saved as a payload.

Source code: https://github.com/limingnihao/elasticsearch-reference/tree/master/Examples
Official documentation: https://www.elastic.co/guide/en/elasticsearch/reference/7.10/analysis-delimited-payload-tokenfilter.html

The indexed text looks something like this:

the|0 brown|3 fox|4 is|0 quick|10
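
To make the format concrete, here is a minimal plain-Java sketch of what happens to such a string: each token is split on the delimiter, and the weight is turned into a 4-byte float payload. The class and method names are hypothetical and exist only for this illustration; in a real index the delimited payload token filter from the official documentation above does this work, and the big-endian float encoding is an assumption about the default float encoding.

```java
import java.nio.ByteBuffer;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration only; not part of Lucene or Elasticsearch.
final class DelimitedPayloadSketch {

    // Splits whitespace-separated "token|weight" pairs and returns token -> weight
    // in input order. A repeated token keeps its last weight in this sketch.
    static Map<String, Float> parse(String text, char delimiter) {
        Map<String, Float> weights = new LinkedHashMap<>();
        for (String part : text.trim().split("\\s+")) {
            int idx = part.lastIndexOf(delimiter);
            if (idx < 0) {
                // No delimiter: the token carries no payload, so this sketch skips it.
                continue;
            }
            weights.put(part.substring(0, idx), Float.parseFloat(part.substring(idx + 1)));
        }
        return weights;
    }

    // Encodes a weight as 4 big-endian bytes holding its IEEE 754 bit pattern.
    static byte[] encodeFloat(float weight) {
        return ByteBuffer.allocate(Float.BYTES).putFloat(weight).array();
    }

    // Reverses encodeFloat.
    static float decodeFloat(byte[] payload) {
        return ByteBuffer.wrap(payload).getFloat();
    }
}
```

Calling `DelimitedPayloadSketch.parse("the|0 brown|3 fox|4 is|0 quick|10", '|')` yields five tokens, each with the weight that would be stored next to it as a payload.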

At query time, using the stored value requires Lucene's PayloadScoreQuery or PayloadCheckQuery.

PayloadScoreQuery:

First, take a look at the constructor of Lucene's PayloadScoreQuery:


  /**
   * Creates a new PayloadScoreQuery
   * @param wrappedQuery the query to wrap
   * @param function a PayloadFunction to use to modify the scores
   * @param decoder a PayloadDecoder to convert payloads into float values
   * @param includeSpanScore include both span score and payload score in the scoring algorithm
   */
  public PayloadScoreQuery(SpanQuery wrappedQuery, PayloadFunction function, PayloadDecoder decoder, boolean includeSpanScore) {
    this.wrappedQuery = Objects.requireNonNull(wrappedQuery);
    this.function = Objects.requireNonNull(function);
    this.decoder = Objects.requireNonNull(decoder);
    this.includeSpanScore = includeSpanScore;
  }

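The constructor takes four parameters:

  • SpanQuery wrappedQuery: the query that performs the matching (recall); it must be a SpanQuery.
  • PayloadFunction function: how the payload scores are combined when several terms match, for example max, min, or sum.
  • PayloadDecoder decoder: how the stored value is decoded, as an int or a float.
  • boolean includeSpanScore: whether the span score is included in the final score alongside the payload score.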
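
How the four parameters interact can be modeled in a few lines of plain Java. This is a simplified sketch, not Lucene's implementation: the real PayloadFunction and PayloadDecoder have different signatures, the names below are made up for illustration, and multiplying the span score by the payload score when includeSpanScore is true is my reading of Lucene's behavior rather than something stated in this article.

```java
import java.nio.ByteBuffer;
import java.util.List;

// Simplified model for illustration only; Lucene's real classes look different.
final class PayloadScoreSketch {

    // Stand-in for the PayloadFunction choices mentioned above.
    enum Function { MAX, MIN, SUM }

    // Stand-in for a float PayloadDecoder: 4 big-endian bytes -> float.
    static float decode(byte[] payload) {
        return ByteBuffer.wrap(payload).getFloat();
    }

    // Combines the payloads of all matched terms into one payload score.
    static float payloadScore(List<byte[]> payloads, Function function) {
        if (payloads.isEmpty()) {
            throw new IllegalArgumentException("at least one payload is required");
        }
        float result = decode(payloads.get(0));
        for (int i = 1; i < payloads.size(); i++) {
            float value = decode(payloads.get(i));
            switch (function) {
                case MAX: result = Math.max(result, value); break;
                case MIN: result = Math.min(result, value); break;
                case SUM: result += value; break;
            }
        }
        return result;
    }

    // Assumption: with includeSpanScore the two scores are multiplied;
    // without it only the payload score counts.
    static float score(float spanScore, float payloadScore, boolean includeSpanScore) {
        return includeSpanScore ? spanScore * payloadScore : payloadScore;
    }
}
```

With payloads 3, 4 and 10, SUM gives 17, MAX gives 10 and MIN gives 3; the span score only matters when includeSpanScore is true.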

Now for the implementation. Two classes are needed: a plugin and a query builder.

PayloadScoreQParserPlugin

It registers the query builder with Elasticsearch.

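import org.elasticsearch.plugins.Plugin;
import org.elasticsearch.plugins.SearchPlugin;

import java.util.Collections;
import java.util.List;

public class PayloadScoreQParserPlugin extends Plugin implements SearchPlugin {

    // Registers the query name, its stream reader, and its request-body parser.
    @Override
    public List<QuerySpec<?>> getQueries() {
        return Collections.singletonList(
            new QuerySpec<>(PayloadScoreQueryBuilder.NAME, PayloadScoreQueryBuilder::new, PayloadScoreQueryBuilder::fromXContent)
        );
    }
}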

PayloadScoreQueryBuilder

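Start with the fromXContent method, which parses the request body.

It reads our custom parameters: query, func, calc (reserved for a later extension that combines weights across terms), and includeSpanScore.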

public static QueryBuilder fromXContent(XContentParser parser) throws IOException {
    String currentFieldName = null;
    XContentParser.Token token;
    QueryBuilder iqb = null;

    String func = null;
    String calc = null;
    boolean includeSpanScore = false;
    // boost and _name are written by doXContent, so they must be accepted here as well.
    float boost = AbstractQueryBuilder.DEFAULT_BOOST;
    String queryName = null;
    while ((token = parser.nextToken()) != XContentParser.Token.END_OBJECT) {
        if (token == XContentParser.Token.FIELD_NAME) {
            currentFieldName = parser.currentName();
        } else if (token == XContentParser.Token.START_OBJECT) {
            // The only object-valued field is the wrapped inner query.
            if (QUERY_FIELD.match(currentFieldName, parser.getDeprecationHandler())) {
                iqb = parseInnerQueryBuilder(parser);
            } else {
                throw new ParsingException(parser.getTokenLocation(),
                    "[" + NAME + "] query does not support [" + currentFieldName + "]");
            }
        } else if (token.isValue()) {
            if (FUNC_FIELD.match(currentFieldName, parser.getDeprecationHandler())) {
                func = parser.text();
            } else if (CALC_FIELD.match(currentFieldName, parser.getDeprecationHandler())) {
                calc = parser.text();
            } else if (INCLUDE_SPAN_SCORE_FIELD.match(currentFieldName, parser.getDeprecationHandler())) {
                includeSpanScore = parser.booleanValue();
            } else if (AbstractQueryBuilder.BOOST_FIELD.match(currentFieldName, parser.getDeprecationHandler())) {
                boost = parser.floatValue();
            } else if (AbstractQueryBuilder.NAME_FIELD.match(currentFieldName, parser.getDeprecationHandler())) {
                queryName = parser.text();
            } else {
                throw new ParsingException(parser.getTokenLocation(),
                    "[" + NAME + "] query does not support [" + currentFieldName + "]");
            }
        }
    }
    PayloadScoreQueryBuilder builder = new PayloadScoreQueryBuilder(iqb, func, calc, includeSpanScore);
    builder.boost(boost);
    builder.queryName(queryName);
    return builder;
}

Next comes the doToQuery method, which builds the PayloadScoreQuery.

Its main job is to construct the four parameters that Lucene's PayloadScoreQuery class requires:

protected Query doToQuery(SearchExecutionContext context) throws IOException {
    // Build the inner query; PayloadScoreQuery only accepts a SpanQuery.
    Query innerQuery = query.toQuery(context);
    if (!(innerQuery instanceof SpanQuery)) {
        throw new IllegalArgumentException("[" + NAME + "] requires the inner query to be a span query");
    }
    SpanQuery spanQuery = (SpanQuery) innerQuery;

    // PayloadUtils is a helper class that is not shown in this article.
    PayloadFunction payloadFunction = PayloadUtils.getPayloadFunction(this.func);
    if (payloadFunction == null) {
        throw new IllegalArgumentException("Unknown payload function: " + func);
    }
    // The stored payload is always decoded as a float here.
    PayloadDecoder payloadDecoder = PayloadUtils.getPayloadDecoder("float");

    return new PayloadScoreQuery(spanQuery, payloadFunction, payloadDecoder, this.includeSpanScore);
}
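
The PayloadUtils helper is not shown in this article, so the following is only a hypothetical stand-in that illustrates why doToQuery checks for null: a name-based lookup returns nothing when the request asks for an unknown function. It uses plain Java operators instead of Lucene's PayloadFunction classes, and the case-insensitive matching is an assumption of this sketch.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.function.BinaryOperator;

// Hypothetical stand-in for a name -> function lookup; not the article's PayloadUtils.
final class PayloadFunctionLookupSketch {

    private static final Map<String, BinaryOperator<Float>> FUNCTIONS = new HashMap<>();

    static {
        FUNCTIONS.put("max", (a, b) -> Math.max(a, b));
        FUNCTIONS.put("min", (a, b) -> Math.min(a, b));
        FUNCTIONS.put("sum", (a, b) -> a + b);
    }

    // Returns the function registered under the given name, or null when the
    // name is missing or unknown, so the caller can reject the request.
    static BinaryOperator<Float> lookup(String name) {
        if (name == null) {
            return null;
        }
        return FUNCTIONS.get(name.toLowerCase(Locale.ROOT));
    }
}
```

Returning null for an unknown name keeps the lookup simple and leaves the error message to the caller, which knows the query name and the offending value.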

Complete PayloadScoreQueryBuilder code

package org.elasticsearch.plugins.payload;

import org.apache.lucene.queries.payloads.PayloadDecoder;
import org.apache.lucene.queries.payloads.PayloadFunction;
import org.apache.lucene.queries.payloads.PayloadScoreQuery;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.spans.SpanQuery;
import org.elasticsearch.common.ParseField;
import org.elasticsearch.common.ParsingException;
import org.elasticsearch.common.io.stream.StreamInput;
import org.elasticsearch.common.io.stream.StreamOutput;
import org.elasticsearch.common.xcontent.XContentBuilder;
import org.elasticsearch.common.xcontent.XContentParser;
import org.elasticsearch.index.query.*;

import java.io.IOException;
import java.util.Objects;

public class PayloadScoreQueryBuilder extends AbstractQueryBuilder<PayloadScoreQueryBuilder> {
    public static final String NAME = "payload_score";

    private static final ParseField QUERY_FIELD = new ParseField("query");
    private static final ParseField FUNC_FIELD = new ParseField("func");
    private static final ParseField CALC_FIELD = new ParseField("calc");
    private static final ParseField INCLUDE_SPAN_SCORE_FIELD = new ParseField("includeSpanScore");

    private final QueryBuilder query;
    private final String func;
    private final String calc;
    private final boolean includeSpanScore;

    public PayloadScoreQueryBuilder(QueryBuilder query, String func, String calc, boolean includeSpanScore) {
        this.query = requireValue(query, "[" + NAME + "] requires '" + QUERY_FIELD.getPreferredName() + "' field");
        this.func = func;
        this.calc = calc;
        this.includeSpanScore = includeSpanScore;
    }

    public PayloadScoreQueryBuilder(StreamInput in) throws IOException {
        super(in);
        this.query = in.readNamedWriteable(QueryBuilder.class);
        // func and calc may be null, so they are read as optional strings.
        this.func = in.readOptionalString();
        this.calc = in.readOptionalString();
        this.includeSpanScore = in.readBoolean();
    }

    @Override
    protected void doWriteTo(StreamOutput out) throws IOException {
        out.writeNamedWriteable(query);
        // Must mirror the read order and types in the StreamInput constructor.
        out.writeOptionalString(this.func);
        out.writeOptionalString(this.calc);
        out.writeBoolean(this.includeSpanScore);
    }

    @Override
    protected void doXContent(XContentBuilder builder, Params params) throws IOException {
        builder.startObject(NAME);
        builder.field(QUERY_FIELD.getPreferredName());
        query.toXContent(builder, params);

        builder.field(FUNC_FIELD.getPreferredName(), this.func);
        builder.field(CALC_FIELD.getPreferredName(), this.calc);
        builder.field(INCLUDE_SPAN_SCORE_FIELD.getPreferredName(), this.includeSpanScore);
        printBoostAndQueryName(builder);
        builder.endObject();
    }

    public static QueryBuilder fromXContent(XContentParser parser) throws IOException {
        String currentFieldName = null;
        XContentParser.Token token;
        QueryBuilder iqb = null;

        String func = null;
        String calc = null;
        boolean includeSpanScore = false;
        // boost and _name are written by doXContent, so they must be accepted here as well.
        float boost = AbstractQueryBuilder.DEFAULT_BOOST;
        String queryName = null;
        while ((token = parser.nextToken()) != XContentParser.Token.END_OBJECT) {
            if (token == XContentParser.Token.FIELD_NAME) {
                currentFieldName = parser.currentName();
            } else if (token == XContentParser.Token.START_OBJECT) {
                // The only object-valued field is the wrapped inner query.
                if (QUERY_FIELD.match(currentFieldName, parser.getDeprecationHandler())) {
                    iqb = parseInnerQueryBuilder(parser);
                } else {
                    throw new ParsingException(parser.getTokenLocation(),
                        "[" + NAME + "] query does not support [" + currentFieldName + "]");
                }
            } else if (token.isValue()) {
                if (FUNC_FIELD.match(currentFieldName, parser.getDeprecationHandler())) {
                    func = parser.text();
                } else if (CALC_FIELD.match(currentFieldName, parser.getDeprecationHandler())) {
                    calc = parser.text();
                } else if (INCLUDE_SPAN_SCORE_FIELD.match(currentFieldName, parser.getDeprecationHandler())) {
                    includeSpanScore = parser.booleanValue();
                } else if (AbstractQueryBuilder.BOOST_FIELD.match(currentFieldName, parser.getDeprecationHandler())) {
                    boost = parser.floatValue();
                } else if (AbstractQueryBuilder.NAME_FIELD.match(currentFieldName, parser.getDeprecationHandler())) {
                    queryName = parser.text();
                } else {
                    throw new ParsingException(parser.getTokenLocation(),
                        "[" + NAME + "] query does not support [" + currentFieldName + "]");
                }
            }
        }
        PayloadScoreQueryBuilder builder = new PayloadScoreQueryBuilder(iqb, func, calc, includeSpanScore);
        builder.boost(boost);
        builder.queryName(queryName);
        return builder;
    }

    @Override
    protected Query doToQuery(SearchExecutionContext context) throws IOException {
        // Build the inner query; PayloadScoreQuery only accepts a SpanQuery.
        Query innerQuery = query.toQuery(context);
        if (!(innerQuery instanceof SpanQuery)) {
            throw new IllegalArgumentException("[" + NAME + "] requires the inner query to be a span query");
        }
        SpanQuery spanQuery = (SpanQuery) innerQuery;

        // PayloadUtils is a helper class that is not shown in this article.
        PayloadFunction payloadFunction = PayloadUtils.getPayloadFunction(this.func);
        if (payloadFunction == null) {
            throw new IllegalArgumentException("Unknown payload function: " + func);
        }
        // The stored payload is always decoded as a float here.
        PayloadDecoder payloadDecoder = PayloadUtils.getPayloadDecoder("float");

        return new PayloadScoreQuery(spanQuery, payloadFunction, payloadDecoder, this.includeSpanScore);
    }

    @Override
    protected boolean doEquals(PayloadScoreQueryBuilder that) {
        return Objects.equals(query, that.query)
            && Objects.equals(func, that.func)
            && Objects.equals(calc, that.calc)
            && Objects.equals(includeSpanScore, that.includeSpanScore);
    }

    @Override
    protected int doHashCode() {
        return Objects.hash(query, func, calc, includeSpanScore);
    }

    @Override
    public String getWriteableName() {
        return NAME;
    }

}

Example request:

POST http://127.0.0.1:9200/position/_search
{
    "query": {
        "payload_score": {
            "func": "sum",
            "calc": "sum",
            "includeSpanScore": false,
            "query": {
                "span_or": {
                    "clauses": [
                        {
                            "span_term": {
                                "FIELD": "test"
                            }
                        }
                    ]
                }
            }
        }
    }
}