DELETE school
PUT /school
{
"mappings": {
"student": {
"properties": {
"age": {
"type": "long"
},
"course": {
"type": "keyword"
},
"name": {
"type": "text"
},
"study_date": {
"type": "date"
},
"mark": {
"type": "text"
}
}
}
}
}
POST /school/student/_bulk
{"index":{"_id":"1"}}
{"name":"zhangsan","age":25,"course":"elasticsearch","study_date":"2017-06-15T20:30:50","mark":"today is a good day"}
{"index":{"_id":"2"}}
{"name":"lisi","age":25,"course":"spring","study_date":"2017-06-16T20:30:50","mark":"very good"}
{"index":{"_id":"3"}}
{"name":"wangwu","age":27,"course":"elasticsearch","study_date":"2017-06-17T20:30:50","mark":"sorry day"}
{"index":{"_id":"4"}}
{"name":"zhaoliu","age":28,"course":"elasticsearch","study_date":"2017-06-18T20:30:50","mark":"happy day"}
{"index":{"_id":"5"}}
{"name":"sunqi","age":29,"course":"elasticsearch","study_date":"2017-06-19T20:30:50","mark":"good happy day"}
{"index":{"_id":"6"}}
{"name":"zhouba","age":30,"course":"elasticsearch","study_date":"2017-06-20T20:30:50","mark":"take care day"}
{"index":{"_id":"7"}}
{"name":"wujiu","age":31,"course":"spring elasticsearch","study_date":"2017-06-21T20:30:50","mark":"tab ok"}
{"index":{"_id":"8"}}
{"name":"zhengshi","age":32,"course":"elasticsearch","study_date":"2017-06-21T20:30:50"}
{"index":{"_id":"9"}}
{"name":"aaabbb","age":25,"course":"elasticsearch","study_date":"2017-06-15T20:30:50","mark":"zhangsan today is a good day"}
{"index":{"_id":"10"}}
{"name":"ababab","age":25,"course":"good","study_date":"2017-06-15T20:30:50","mark":"zhangsan say elasticsearch very good"}
URL query string syntax
Full-text search:
GET /school/_search?q=zhangsan
Full-text search on a single field:
GET /school/_search?q=name:zhangsan
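Combining conditions (+ means the term must match, - means it must not). When the request is sent as a raw URL, for example with curl, encode + as %2B, because a bare + is decoded as a space: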
GET /school/_search?q=+mark:zhangsan -name:aaabbb
Exact phrase search on a single field:
GET /school/_search?q=mark:"good day"
Combining multiple search conditions:
GET /school/_search?q=name:("zhangsan" OR "lisi") AND NOT course:spring
Checking whether a field exists:
GET /school/_search?q=_exists_:mark
GET /school/_search?q=NOT _exists_:mark
Wildcards:
? matches exactly one character; * matches any number of characters, including none.
GET /school/_search?q=name:zh???san
GET /school/_search?q=name:zh*san
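The two wildcard requests above can be checked offline with a small sketch. It assumes the name field is indexed as one lowercase term per document, and it uses Python's fnmatchcase as a stand-in for the Elasticsearch wildcard matcher, since ? and * behave the same way in both. The helper wildcard_hits is hypothetical.

```python
from fnmatch import fnmatchcase

# Names from the bulk request; each is assumed to be a single indexed term.
names = ["zhangsan", "lisi", "wangwu", "zhaoliu", "sunqi",
         "zhouba", "wujiu", "zhengshi", "aaabbb", "ababab"]

def wildcard_hits(pattern):
    """Return the names that match the whole wildcard pattern."""
    return [n for n in names if fnmatchcase(n, pattern)]

print(wildcard_hits("zh???san"))  # ['zhangsan']
print(wildcard_hits("zh*san"))    # ['zhangsan']
```

zhengshi starts with zh as well, but it does not end in san, so neither pattern matches it.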
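Fuzzy search:
A trailing ~ tolerates one or two wrong characters in the search term. Results are ranked by similarity, and the maximum edit distance is 2.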
GET /school/_search?q=name:zhangsnn~
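Specify the maximum edit distance explicitly. With ~1 this query finds nothing, because zhangsxx is two edits away from zhangsan: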
GET /school/_search?q=name:zhangsxx~1
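To see why a term does or does not fall within the allowed fuzziness, the edit distance can be computed by hand. The following is a minimal sketch of the Damerau-Levenshtein distance (optimal string alignment variant); the function edit_distance is a hypothetical helper, not part of any Elasticsearch API.

```python
def edit_distance(a, b):
    """Count insertions, deletions, substitutions and adjacent transpositions."""
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i          # delete every character of a
    for j in range(cols):
        d[0][j] = j          # insert every character of b
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,          # deletion
                          d[i][j - 1] + 1,          # insertion
                          d[i - 1][j - 1] + cost)   # substitution
            # adjacent transposition, e.g. "ab" -> "ba"
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[-1][-1]

print(edit_distance("zhangsnn", "zhangsan"))  # 1
print(edit_distance("zhangsxx", "zhangsan"))  # 2
```

A distance of 1 is within the default fuzziness for a term of this length, while a distance of 2 is outside an explicit ~1.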
Proximity search (the terms of the phrase may be up to 2 positions apart):
GET /school/_search?q=mark:"today good"~2
Range search: ranges work on both numeric and date fields.
[] means the endpoint is included in the range; {} means the endpoint is excluded.
For example: age:>30, date:["now-6h" TO "now"}.
GET /school/_search?q=age:>30
GET /school/_search?q=age:[28 TO 30]
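Regular expression search:
(Regular expressions perform poorly in ES; avoid them where possible.)
Reserved characters: . ? + * | { } [ ] ( ) " \ # @ & < > ~
Escape a reserved character with \, for example: \* \\
. matches any single character, like the wildcard ?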
GET /school/_search?q=name:/zh...san/
GET /school/_search?q=name:/zha..s.n/
.* matches any number of characters, like the wildcard *
GET /school/_search?q=name:/zh.*san/
* matches the preceding element zero or more times
GET /school/_search?q=name:/a*b*/
? matches the preceding element zero or one time
GET /school/_search?q=name:/aaa?bbb?/
The following does not match aaabbb, because the pattern has to match the whole term:
GET /school/_search?q=name:/aa?bb?/
{} sets the number of repetitions, in the form {min,max}
GET /school/_search?q=name:/a{3}b{3}/
GET /school/_search?q=name:/a{2,4}b{2,4}/
The following matches nothing:
GET /school/_search?q=name:/a{4}b{4}/
() groups a sub-pattern
GET /school/_search?q=name:/(ab)*/
GET /school/_search?q=name:/(ab){3}/
| means OR
GET /school/_search?q=name:/(ab){3}|aaabbb/
[] matches any one of the listed characters; a leading ^ negates the set
GET /school/_search?q=name:/[ab]*/
GET /school/_search?q=name:/[a-c]*/
GET /school/_search?q=name:/[^ab]*/
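The regular expression examples above can be tried offline with the sketch below. It rests on two assumptions: each name is indexed as a single lowercase term, and Python's re module stands in for the Lucene regular expression engine. The two syntaxes differ in general, but the constructs used in these examples behave the same. Because a pattern has to match the whole term, the hypothetical helper regexp_hits uses re.fullmatch.

```python
import re

names = ["zhangsan", "lisi", "wangwu", "zhaoliu", "sunqi",
         "zhouba", "wujiu", "zhengshi", "aaabbb", "ababab"]

def regexp_hits(pattern):
    """Return the names that the pattern matches from start to end."""
    return [n for n in names if re.fullmatch(pattern, n)]

print(regexp_hits("zh...san"))      # ['zhangsan']
print(regexp_hits("aaa?bbb?"))      # ['aaabbb']
print(regexp_hits("aa?bb?"))        # [] : at most "aabb", never "aaabbb"
print(regexp_hits("a{2,4}b{2,4}"))  # ['aaabbb']
print(regexp_hits("(ab){3}"))       # ['ababab']
print(regexp_hits("[^ab]*"))        # ['lisi', 'sunqi', 'wujiu', 'zhengshi']
```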
Query DSL full syntax
Empty query: matches all documents by default
GET school/student/_search
{
"query": {
"match_all": {}
}
}
Match no documents
GET school/student/_search
{
"query": {
"match_none": {}
}
}
match query
A match query runs in these steps:
- check the field type;
- analyze the query string;
- find the matching documents;
- score each document.
GET school/student/_search
{
"query": {
"match": {
"mark": "day"
}
}
}
A multi-term match query uses OR by default: a document matches if it contains any one of the terms
GET school/student/_search
{
"query": {
"match": {
"mark": "good day"
}
}
}
This is equivalent to
GET school/student/_search
{
"query": {
"match": {
"mark": {
"query":"good day",
"operator":"or"
}
}
}
}
Control the minimum number of terms that must match
GET school/student/_search
{
"query": {
"match": {
"mark": {
"query":"good happy day",
"minimum_should_match": "2"
}
}
}
}
AND relation: a document must contain all of the terms
Equivalent to: GET /school/_search?q=mark:good AND mark:day
GET school/student/_search
{
"query": {
"match": {
"mark": {
"query":"good day",
"operator":"and"
}
}
}
}
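The effect of operator and minimum_should_match on the sample data can be sketched without a cluster. This is a simplified model: analysis is assumed to be lowercasing plus splitting on whitespace, scoring is ignored, and the function match is a hypothetical stand-in for the match query.

```python
# mark values from the bulk request (document 8 has no mark field)
marks = {
    1: "today is a good day", 2: "very good", 3: "sorry day",
    4: "happy day", 5: "good happy day", 6: "take care day",
    7: "tab ok", 9: "zhangsan today is a good day",
    10: "zhangsan say elasticsearch very good",
}

def analyze(text):
    """Assumed analysis: lowercase and split on whitespace."""
    return text.lower().split()

def match(query, operator="or", minimum_should_match=None):
    """Return ids of documents whose mark satisfies the term requirement."""
    terms = analyze(query)
    if minimum_should_match is not None:
        required = minimum_should_match
    elif operator == "and":
        required = len(terms)
    else:
        required = 1
    hits = []
    for doc_id, mark in marks.items():
        tokens = set(analyze(mark))
        found = sum(1 for term in terms if term in tokens)
        if found >= required:
            hits.append(doc_id)
    return hits

print(match("good day"))                                   # [1, 2, 3, 4, 5, 6, 9, 10]
print(match("good day", operator="and"))                   # [1, 5, 9]
print(match("good happy day", minimum_should_match=2))     # [1, 4, 5, 9]
```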
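Phrase matching (match_phrase):
Equivalent to GET /school/_search?q=mark:"good day"
Only documents in which the terms appear next to each other, in order, are returned; slop defaults to 0.
Steps:
- analyze the query string and break it into terms;
- find the matching documents;
- keep only the documents that contain all of the terms, in the same relative positions as in the query;
- slop sets how far apart the terms are allowed to be.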
GET school/student/_search
{
"query": {
"match_phrase": {
"mark": "good day"
}
}
}
Phrase matching, with slop setting the allowed distance between terms
GET school/student/_search
{
"query": {
"match_phrase": {
"mark": {
"query":"good day",
"slop":1
}
}
}
}
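How slop relaxes a phrase match can be sketched with token positions. The model below is a simplification and an assumption, not the Lucene implementation: each matched position is shifted back by the term's offset in the query, and the phrase matches when the spread of the shifted positions is at most slop. Repeated terms are not handled specially. The helper phrase_match is hypothetical.

```python
from itertools import product

def analyze(text):
    """Assumed analysis: lowercase and split on whitespace."""
    return text.lower().split()

def phrase_match(text, phrase, slop=0):
    """Return True if the phrase occurs in text within the given slop."""
    tokens = analyze(text)
    terms = analyze(phrase)
    # positions of every query term in the document
    positions = [[i for i, tok in enumerate(tokens) if tok == term]
                 for term in terms]
    if any(not p for p in positions):
        return False  # some term is missing entirely
    for combo in product(*positions):
        # shift each position back by the term's offset in the query
        shifted = [pos - offset for offset, pos in enumerate(combo)]
        if max(shifted) - min(shifted) <= slop:
            return True
    return False

print(phrase_match("today is a good day", "good day"))     # True
print(phrase_match("good happy day", "good day"))          # False
print(phrase_match("good happy day", "good day", slop=1))  # True
```

Under the same model, "today good" needs a slop of 2 to match "today is a good day", which agrees with the mark:"today good"~2 example in the query string section.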
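Phrase prefix query (match_phrase_prefix)
slop sets the allowed distance between terms. max_expansions limits how many terms the prefix may expand to before the expansion stops; the default is 50, meaning that every shard collects the first 50 terms that match the prefix.
From 5.0 onward you can add "profile": true to see how a search or aggregation request is broken down into low-level Lucene requests.
Steps:
- analyze the query string and look up the first 50 terms that start with the prefix t;
- keep only the documents that contain all of the terms, in the same relative positions as in the query.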
GET school/student/_search
{
"query": {
"match_phrase_prefix": {
"mark": {
"query": "t",
"slop": 1,
"max_expansions": 50
}
}
}
}
multi_match runs a match query on several fields
GET school/student/_search
{
"query": {
"multi_match": {
"query": "elasticsearch",
"fields": ["mark","course","name*"]
}
}
}
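term query: exact value lookup
1. A term query matches exact values: numbers, dates, booleans, and unanalyzed strings (keyword).
2. A term query does not analyze its input, so it looks up the given value exactly as written.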
GET school/student/_search
{
"query": {
"term": {
"age": 25
}
}
}
GET school/student/_search
{
"query": {
"term": {
"course": "spring"
}
}
}
GET school/student/_search
{
"query": {
"term": {
"course": "spring elasticsearch"
}
}
}
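A term query does not analyze the query text, whereas the mark field is mapped as text and is analyzed at index time. The inverted index therefore contains no term happy day, and the following query returns no results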
GET school/student/_search
{
"query": {
"term": {
"mark": "happy day"
}
}
}
Compare with match
GET school/student/_search
{
"query": {
"match": {
"mark": "happy day"
}
}
}
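The difference between the two requests above comes down to what is stored in the inverted index. Here is a minimal sketch on three of the sample documents, assuming that analysis is lowercasing plus whitespace splitting; term_query and match_query are hypothetical helpers that imitate the lookup only, without scoring.

```python
from collections import defaultdict

marks = {3: "sorry day", 4: "happy day", 5: "good happy day"}

def analyze(text):
    """Assumed analysis: lowercase and split on whitespace."""
    return text.lower().split()

# Build the inverted index: token -> ids of the documents containing it.
index = defaultdict(set)
for doc_id, text in marks.items():
    for token in analyze(text):
        index[token].add(doc_id)

def term_query(value):
    """Look the value up as-is, with no analysis."""
    return sorted(index.get(value, set()))

def match_query(text):
    """Analyze the text first, then OR the lookups of each token."""
    hits = set()
    for token in analyze(text):
        hits |= index.get(token, set())
    return sorted(hits)

print(term_query("happy day"))   # [] : no such token in the index
print(term_query("happy"))       # [4, 5]
print(match_query("happy day"))  # [3, 4, 5]
```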
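terms query
A terms query works like a term query but lets you specify several values to match.
A document matches if the field contains any one of the given values. Like the term query, the terms query does not analyze its input.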
GET school/student/_search
{
"query": {
"terms": {
"name": ["zhangsan","lisi"]
}
}
}
This returns the same documents as the match query above, although terms does not compute relevance scores
GET school/student/_search
{
"query": {
"terms": {
"mark": ["happy","day"]
}
}
}
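range query
A range query works on numeric, date, and similar field types.
gt: greater than; gte: greater than or equal to; lt: less than; lte: less than or equal to.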
GET school/student/_search
{
"query": {
"range": {
"age": {
"gte": 20,
"lt": 30
}
}
}
}
A range query can specify the date format
GET school/student/_search
{
"query": {
"range": {
"study_date": {
"gte": "2017-01-01",
"lte": "2018",
"format": "yyyy-MM-dd||yyyy"
}
}
}
}
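Dates can use date math such as now-1d/d, which resolves to 00:00 yesterday.
gt (greater than a date) rounds up: 2014-11-18||/M -> 2014-11-30T23:59:59.999
gte (greater than or equal to a date) rounds down: 2014-11-18||/M -> 2014-11-01
lt (less than a date) rounds down: 2014-11-18||/M -> 2014-11-01
lte (less than or equal to a date) rounds up: 2014-11-18||/M -> 2014-11-30T23:59:59.999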
GET school/student/_search
{
"query": {
"range": {
"study_date": {
"gte": "now-10d/d",
"lt": "now+1M/d",
"time_zone": "+08:00"
}
}
}
}
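The /M rounding rules listed above can be reproduced with a short sketch. It mirrors the table in these notes, not the Elasticsearch source code; round_month, ROUND_UP and resolve are hypothetical names, and time zones are ignored.

```python
import calendar
from datetime import datetime

def round_month(date, up):
    """Round to the last millisecond (up) or the first instant (down) of the month."""
    if up:
        last_day = calendar.monthrange(date.year, date.month)[1]
        return date.replace(day=last_day, hour=23, minute=59,
                            second=59, microsecond=999000)
    return date.replace(day=1, hour=0, minute=0, second=0, microsecond=0)

# Direction of rounding per operator, as listed in the notes above.
ROUND_UP = {"gt": True, "gte": False, "lt": False, "lte": True}

def resolve(op, date):
    """Resolve <date>||/M for the given range operator."""
    return round_month(date, ROUND_UP[op])

day = datetime(2014, 11, 18)
for op in ("gt", "gte", "lt", "lte"):
    print(op, resolve(op, day).isoformat(timespec="milliseconds"))
```

The upshot is that gt and lt exclude the whole of November, while gte and lte include it.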
Find documents in which a field has a value (exists)
GET school/student/_search
{
"query": {
"exists" : { "field" : "mark" }
}
}
Find documents in which a field has no value
GET school/student/_search
{
"query": {
"bool": {
"must_not": {
"exists": {
"field": "mark"
}
}
}
}
}
Prefix query
GET school/student/_search
{
"query": {
"prefix": {
"name": "zhang"
}
}
}
Wildcard query (wildcard)
GET school/student/_search
{
"query": {
"wildcard" : { "name" : "zha*san" }
}
}
Regular expression query (regexp)
GET school/student/_search
{
"query": {
"regexp":{
"name": "z.*san"
}
}
}
Fuzzy query, for matching misspelled terms (fuzzy)
GET school/student/_search
{
"query": {
"fuzzy": {
"name": {
"value": "zhangsi",
"fuzziness": 2
}
}
}
}
Constant score query (constant_score)
Wraps a filter and skips relevance scoring, which is more efficient; every hit gets the same score, 1.
GET school/student/_search
{
"query": {
"constant_score": {
"filter": {
"term": {
"mark": "day"
}
}
}
}
}
A bool query containing only a filter clause: filters can be cached, and every hit gets a score of 0
GET /school/student/_search
{
"query": {
"bool": {
"filter": {
"term": {
"age": 25
}
}
}
}
}
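bool compound query
must: every clause must match; equivalent to AND.
must_not: no clause may match; equivalent to NOT.
should: at least one clause should match; equivalent to OR. When must or filter clauses are also present, should clauses are optional unless minimum_should_match is set.
filter: every clause must match, but the clause does not contribute to the score.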
POST /school/student/_search
{
"query": {
"bool": {
"must": {
"range": {
"age": {"gte": 20,"lt": 30}
}
},
"must_not": {
"match": {
"mark": "good"
}
},
"should": [
{"term": {"name": "zhangsan"}},
{"term": {"name": "lisi"}},
{"term": {"name": "zhaoliu"}}
],
"filter": {
"term": {
"course": "elasticsearch"
}
},
"minimum_should_match": 1
}
}
}
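The bool request above can be traced by hand against the ten sample documents. The sketch below applies the four clauses as plain Python predicates. It assumes whitespace tokenization for mark and ignores scoring, and bool_query is a hypothetical helper.

```python
docs = [
    {"id": 1, "name": "zhangsan", "age": 25, "course": "elasticsearch", "mark": "today is a good day"},
    {"id": 2, "name": "lisi", "age": 25, "course": "spring", "mark": "very good"},
    {"id": 3, "name": "wangwu", "age": 27, "course": "elasticsearch", "mark": "sorry day"},
    {"id": 4, "name": "zhaoliu", "age": 28, "course": "elasticsearch", "mark": "happy day"},
    {"id": 5, "name": "sunqi", "age": 29, "course": "elasticsearch", "mark": "good happy day"},
    {"id": 6, "name": "zhouba", "age": 30, "course": "elasticsearch", "mark": "take care day"},
    {"id": 7, "name": "wujiu", "age": 31, "course": "spring elasticsearch", "mark": "tab ok"},
    {"id": 8, "name": "zhengshi", "age": 32, "course": "elasticsearch"},
    {"id": 9, "name": "aaabbb", "age": 25, "course": "elasticsearch", "mark": "zhangsan today is a good day"},
    {"id": 10, "name": "ababab", "age": 25, "course": "good", "mark": "zhangsan say elasticsearch very good"},
]

def bool_query(doc):
    """Apply must, must_not, should and filter as in the request above."""
    must = 20 <= doc["age"] < 30                              # range on age
    must_not = "good" in doc.get("mark", "").split()          # match mark:good
    should = sum(doc["name"] == n for n in ("zhangsan", "lisi", "zhaoliu"))
    flt = doc["course"] == "elasticsearch"                    # term on a keyword field
    return must and not must_not and flt and should >= 1      # minimum_should_match: 1

print([d["id"] for d in docs if bool_query(d)])  # [4]
```

zhangsan is dropped by must_not and lisi by filter, so only zhaoliu remains.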
####################################
query string query
GET /school/_search
{
"query": {
"query_string" : {
"query" : "+mark:zhangsan -name:aaabbb"
}
}
}
GET /school/_search
{
"query": {
"query_string" : {
"query" : "name:(zhangsan OR lisi) AND NOT course:spring"
}
}
}
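Queries and filters:
- Prefer filter over query where possible
  - a query has to compute a relevance score and sort by it, and cannot use the cache;
  - a filter computes no relevance score and can use the cache.
- Prefer bool over and/or
  - the must, must_not, should, and filter clauses of a bool can be reused; their results are stored in bitsets, which intersect efficiently;
  - and/or check the documents one by one for a match, which is slow. If you do use them, put the condition that filters out the most documents first.
As a rule, use query clauses for full-text search and anything else that needs relevance scoring, and use filter clauses for everything else.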
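The bitset remark above can be illustrated with plain integers. This is only a sketch of the idea, not of the Elasticsearch cache itself: each filter result is stored as one bit per document id, so combining two cached filters is a single bitwise AND. to_bitset and from_bitset are hypothetical helpers.

```python
def to_bitset(doc_ids):
    """Set bit i for every document id i."""
    bits = 0
    for doc_id in doc_ids:
        bits |= 1 << doc_id
    return bits

def from_bitset(bits):
    """List the document ids whose bit is set."""
    return [i for i in range(bits.bit_length()) if (bits >> i) & 1]

# Filter results on the sample data, computed once and then reusable.
age_25 = to_bitset([1, 2, 9, 10])                       # term age: 25
course_es = to_bitset([1, 3, 4, 5, 6, 8, 9])            # term course: elasticsearch

both = age_25 & course_es                               # intersection in one operation
print(from_bitset(both))  # [1, 9]
```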