Lucene Source Code Analysis - queryparser > flexible

Flexible

This project contains the new Lucene query parser implementation, which matches the syntax of the core QueryParser but offers a more modular architecture to enable customization.

It's currently divided into 2 main packages:

{@link org.apache.lucene.queryparser.flexible.core}: it contains the query parser API classes, which should be extended by query parser implementations.

{@link org.apache.lucene.queryparser.flexible.standard}: it contains the current Lucene query parser implementation using the new query parser API.

Features

  1. Full support for boolean logic (not enabled)

  2. QueryNode Trees - support for several syntaxes, which can be converted into similar syntax QueryNode trees.

  3. QueryNode Processors - optimize, validate, and rewrite the QueryNode trees

  4. Processor Pipelines - select your favorite processors and build a processor pipeline to implement the features you need

  5. Config Interfaces - allow the consumer of the Query Parser to implement different Config Handler objects to suit their needs.

  6. Standard Builders - convert QueryNodes into several Lucene representations. The supported conversion uses 2.4-compatible logic.

  7. QueryNode trees can be converted to a Lucene 2.4 syntax string using toQueryString

Design

This new query parser was designed to have a very generic architecture, so that it can easily be used for different products with varying query syntaxes. This code is much more flexible and extensible than the Lucene query parser in 2.4.X.

The goal of the new query parser is to separate the syntax and semantics of a query. For example, 'a AND b', '+a +b', and 'AND(a,b)' could be different syntaxes for the same query. It distinguishes the semantics of the different query components, e.g. whether and how to tokenize/lemmatize/normalize the different terms, or which Query objects to create for the terms. This makes it possible to write a parser with a new syntax quickly while reusing the underlying semantics.

The query parser has three layers and its core is what we call the QueryNode tree. It is a tree that initially represents the syntax of the original query, e.g. for 'a AND b':

  AND
 /   \
A     B
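To make the separation of syntax from semantics concrete, here is a minimal sketch in which three toy parsers, one per syntax mentioned above, all produce the same tree. The types ToyNode, ToyTerm, ToyAnd and ToySyntaxes are hypothetical and written only for this illustration; they are not Lucene's QueryNode classes, and the parsers handle only these exact input shapes.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical toy types for illustration; these are NOT Lucene's QueryNode classes.
abstract class ToyNode { }

final class ToyTerm extends ToyNode {
    final String text;
    ToyTerm(String text) { this.text = text; }
    @Override public String toString() { return text; }
}

final class ToyAnd extends ToyNode {
    final List<ToyNode> children;
    ToyAnd(List<ToyNode> children) { this.children = children; }
    @Override public String toString() { return "AND" + children; }
}

final class ToySyntaxes {
    // Syntax 1: infix form, e.g. "a AND b".
    static ToyNode parseInfix(String query) {
        List<ToyNode> terms = new ArrayList<>();
        for (String part : query.trim().split("\\s+AND\\s+")) {
            terms.add(new ToyTerm(part));
        }
        return new ToyAnd(terms);
    }

    // Syntax 2: plus-prefixed form, e.g. "+a +b".
    static ToyNode parsePlus(String query) {
        List<ToyNode> terms = new ArrayList<>();
        for (String part : query.trim().split("\\s+")) {
            terms.add(new ToyTerm(part.substring(1))); // drop the leading '+'
        }
        return new ToyAnd(terms);
    }

    // Syntax 3: functional form, e.g. "AND(a,b)".
    static ToyNode parseFunctional(String query) {
        String trimmed = query.trim();
        // Strip the "AND(" prefix and the closing ")".
        String inner = trimmed.substring("AND(".length(), trimmed.length() - 1);
        List<ToyNode> terms = new ArrayList<>();
        for (String part : inner.split(",")) {
            terms.add(new ToyTerm(part.trim()));
        }
        return new ToyAnd(terms);
    }
}
```

Because all three parsers end in the same tree shape, anything that works on the tree afterwards does not need to know which syntax the user typed.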

The three layers are:

QueryParser

  • This layer is the text parsing layer, which simply transforms the query text string into a {@link org.apache.lucene.queryparser.flexible.core.nodes.QueryNode} tree. Every text parser must implement the interface {@link org.apache.lucene.queryparser.flexible.core.parser.SyntaxParser}. Lucene's default implementation implements it using JavaCC.

QueryNodeProcessor

  • The query node processors do most of the work. This layer is in fact a configurable chain of processors. Each processor can walk the tree and modify nodes or even the tree's structure. That makes it possible, for example, to optimize the query before it is executed or to tokenize terms.

QueryBuilder

  • The third layer is a configurable map of builders, which maps each {@link org.apache.lucene.queryparser.flexible.core.nodes.QueryNode} type to its specific builder. The builder transforms the QueryNode into a Lucene Query object.
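The three layers above can be sketched as a small, self-contained toy pipeline. Everything in this block is hypothetical (ThreeLayerSketch, Node, Term, And, the two processors, and the String that stands in for a Lucene Query object); it shows only the shape of the design, not Lucene's actual classes or signatures.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;
import java.util.function.UnaryOperator;

// Hypothetical toy model of the three layers; none of these types are Lucene classes.
final class ThreeLayerSketch {

    // Minimal node tree: a term leaf or an AND node over children.
    static abstract class Node { }

    static final class Term extends Node {
        final String text;
        Term(String text) { this.text = text; }
    }

    static final class And extends Node {
        final List<Node> children;
        And(List<Node> children) { this.children = children; }
    }

    // Layer 1: text parsing. Turns "a AND b" into And[Term(a), Term(b)].
    static Node parse(String query) {
        List<Node> terms = new ArrayList<>();
        for (String part : query.trim().split("\\s+AND\\s+")) {
            terms.add(new Term(part));
        }
        return terms.size() == 1 ? terms.get(0) : new And(terms);
    }

    // Layer 2, processor A: normalize terms by lowercasing them.
    static Node lowercase(Node node) {
        if (node instanceof Term) {
            return new Term(((Term) node).text.toLowerCase(Locale.ROOT));
        }
        List<Node> rewritten = new ArrayList<>();
        for (Node child : ((And) node).children) {
            rewritten.add(lowercase(child));
        }
        return new And(rewritten);
    }

    // Layer 2, processor B: a small optimization that drops repeated terms under an AND.
    static Node dedupe(Node node) {
        if (node instanceof Term) {
            return node;
        }
        List<Node> kept = new ArrayList<>();
        Set<String> seen = new HashSet<>();
        for (Node child : ((And) node).children) {
            if (child instanceof Term && !seen.add(((Term) child).text)) {
                continue; // already saw this term
            }
            kept.add(child);
        }
        return new And(kept);
    }

    // Layer 3: a map from node type to the builder for that type.
    // The "query object" is just a String standing in for a real Query.
    static final Map<Class<? extends Node>, Function<Node, String>> BUILDERS = new HashMap<>();
    static {
        BUILDERS.put(Term.class, n -> "term(" + ((Term) n).text + ")");
        BUILDERS.put(And.class, n -> {
            List<String> parts = new ArrayList<>();
            for (Node child : ((And) n).children) {
                parts.add(build(child));
            }
            return "must(" + String.join(", ", parts) + ")";
        });
    }

    static String build(Node node) {
        return BUILDERS.get(node.getClass()).apply(node);
    }

    // The configurable chain: processors run in list order.
    static final List<UnaryOperator<Node>> DEFAULT_CHAIN =
            Arrays.asList(ThreeLayerSketch::lowercase, ThreeLayerSketch::dedupe);

    static String run(String query, List<UnaryOperator<Node>> processors) {
        Node tree = parse(query);
        for (UnaryOperator<Node> processor : processors) {
            tree = processor.apply(tree);
        }
        return build(tree);
    }

    static String runDefault(String query) {
        return run(query, DEFAULT_CHAIN);
    }
}
```

Processor order matters in this sketch: lowercasing runs before deduplication, so "Apache" and "apache" collapse into one term. Swapping the parse step for another syntax, or the builder map for another target representation, leaves the processors untouched.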

Furthermore, the query parser uses flexible configuration objects. It also uses message classes that allow resource bundles to be attached. This makes it possible to translate messages, which is an important feature of a query parser.
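As a rough illustration of why resource bundles help with translated messages, the sketch below uses only the JDK's java.util.ResourceBundle and java.text.MessageFormat. The class ToyParserMessages, the key INVALID_SYNTAX, and the message texts are all made up for this example; it does not use or mirror Lucene's own message classes.

```java
import java.text.MessageFormat;
import java.util.ListResourceBundle;
import java.util.Locale;
import java.util.ResourceBundle;

// Hypothetical sketch built on the JDK only; it does not use Lucene's message classes.
final class ToyParserMessages {

    // English texts, keyed by message id.
    static final class English extends ListResourceBundle {
        @Override protected Object[][] getContents() {
            return new Object[][] { { "INVALID_SYNTAX", "Syntax error in query: {0}" } };
        }
    }

    // German texts for the same message ids.
    static final class German extends ListResourceBundle {
        @Override protected Object[][] getContents() {
            return new Object[][] { { "INVALID_SYNTAX", "Syntaxfehler in der Abfrage: {0}" } };
        }
    }

    // Pick a bundle by language; anything other than German falls back to English.
    static ResourceBundle bundleFor(Locale locale) {
        boolean german = Locale.GERMAN.getLanguage().equals(locale.getLanguage());
        return german ? new German() : new English();
    }

    // Look up the pattern for the key and fill in its arguments.
    static String format(String key, Locale locale, Object... args) {
        String pattern = bundleFor(locale).getString(key);
        return new MessageFormat(pattern, locale).format(args);
    }
}
```

The parser code only ever refers to the key, so adding a language means adding a bundle rather than touching the parsing logic.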

This design allows different query syntaxes to be developed very quickly.

StandardQueryParser and QueryParserWrapper

The classic Lucene query parser is located under {@link org.apache.lucene.queryparser.classic}.

To make the new query parser simpler to use, the class {@link org.apache.lucene.queryparser.flexible.standard.StandardQueryParser} may be helpful, especially for people who do not want to extend the Query Parser. It uses the default Lucene query processors, text parser and builders, so you don't need to worry about dealing with those. {@link org.apache.lucene.queryparser.flexible.standard.StandardQueryParser} usage:

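  import org.apache.lucene.analysis.core.WhitespaceAnalyzer;
  import org.apache.lucene.queryparser.flexible.standard.StandardQueryParser;
  import org.apache.lucene.search.Query;

  StandardQueryParser qpHelper = new StandardQueryParser();
  // These setters are defined on StandardQueryParser itself, not on the config handler.
  qpHelper.setAllowLeadingWildcard(true);
  qpHelper.setAnalyzer(new WhitespaceAnalyzer());
  // parse() throws QueryNodeException if the query text is invalid.
  Query query = qpHelper.parse("apache AND lucene", "defaultField");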