Elasticsearch case-sensitivity problem

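I have run into this problem twice.
The first time it was the case of device types: the stored values looked like OPPO|VIVI, but a term query could not find them. The second time it was the case of country codes: China is stored as CN, and a term query could not find that either.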

Problem description:

The field value is of type text and holds a country code such as 'CN'.
A term query for that value returns nothing.

(screenshot: 1.png)

A match query does find it.

(screenshot: 2.png)
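
Since the screenshots are not reproduced here, the following is only a sketch of what the two request bodies presumably looked like. The field name value comes from the mapping in this post; the helper names term_body and match_body are made up for illustration, and the exact queries in the screenshots may have differed.

```python
import json

FIELD = "value"  # field name taken from the mapping shown in this post


def term_body(field, text):
    """Build a term query body: the text is looked up as-is, without analysis."""
    return {"query": {"term": {field: text}}}


def match_body(field, text):
    """Build a match query body: the text is analyzed before the lookup."""
    return {"query": {"match": {field: text}}}


term_request = term_body(FIELD, "CN")
match_request = match_body(FIELD, "CN")

# These bodies would be sent to the index's _search endpoint.
print(json.dumps(term_request))
print(json.dumps(match_request))
```

The two bodies differ only in the query type, which is what makes the different results surprising at first.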

I needed an exact-match query here and did not want the value to be tokenized, so I looked into the cause.

Cause:

1. Why term finds nothing

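What matters for a term query is whether the field is analyzed. A text field is analyzed at index time, and the default (standard) analyzer lowercases every token, so CN is stored in the index as cn. A term query does not analyze its input: it looks up the literal term CN, which does not exist in the index, so the query returns nothing. The mapping of the index shows that value is a text field: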

[@bjzw_113_167 shell]# curl -X GET http://10.136.12.78:9200/app_hotlist_country/app_hotlist_country/_mapping?pretty
{
  "app_hotlist_country" : {
    "mappings" : {
      "app_hotlist_country" : {
        "properties" : {
          "app_id" : {
            "type" : "keyword"
          },
          "app_name" : {
            "type" : "text",
            "analyzer" : "ik_max_word"
          },
          ~
          "value" : {
            "type" : "text"    <-- mapping type of the field that stores the country code
          }
        }
      }
    }
  }
}

2. Why match finds it

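Running CN through _analyze shows why: the value was analyzed when it was indexed, and a match query analyzes the query string in the same way, so both sides become the token cn and the query matches.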

[@bjzw_113_167 shell]# curl -H "Content-Type: application/json" -XGET 'http://10.136.12.78:9200/app_hotlist_country/_analyze?pretty' -d '
> {   
> "text": "CN"
> }'
{
  "tokens" : [
    {
      "token" : "cn",
      "start_offset" : 0,
      "end_offset" : 2,
      "type" : "<ALPHANUM>",
      "position" : 0
    }
  ]
}
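
The behaviour above can be reproduced with a toy inverted index. This is only a simplified model: simple_analyze is a hypothetical stand-in that splits on non-alphanumeric characters and lowercases, which is much cruder than the real standard analyzer, and the document ids are invented.

```python
import re
from collections import defaultdict


def simple_analyze(text):
    """Rough stand-in for the standard analyzer: tokenize and lowercase."""
    return [tok.lower() for tok in re.findall(r"[0-9A-Za-z]+", text)]


def build_text_index(docs):
    """Index a text field: every stored token has gone through the analyzer."""
    index = defaultdict(set)
    for doc_id, value in docs.items():
        for token in simple_analyze(value):
            index[token].add(doc_id)
    return index


def term_query(index, term):
    """term: the query string is NOT analyzed, it must equal an indexed token."""
    return set(index.get(term, set()))


def match_query(index, text):
    """match: the query string is analyzed first, then each token is looked up."""
    hits = set()
    for token in simple_analyze(text):
        hits |= index.get(token, set())
    return hits


text_index = build_text_index({1: "CN", 2: "US"})
print(term_query(text_index, "CN"))   # no hit: only "cn" is in the index
print(match_query(text_index, "CN"))  # hit: the query is lowercased to "cn"
print(term_query(text_index, "cn"))   # hit: matches the indexed token exactly
```

In this model a term query for the lowercase form cn does match, which is consistent with the _analyze output but is not a usable fix when callers send CN.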


Solution

1. Only change the type in the mapping to keyword

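With this mapping a term query finds the document, but only when the query string is identical to the stored value, including case: CN matches, cn does not.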

"value" : {
  "type" : "keyword"
}
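
As a rough illustration of that limitation, a keyword field can be modelled as an index keyed by the untouched value. The function names and sample documents below are assumptions made for this sketch, not Elasticsearch APIs.

```python
def build_keyword_index(docs):
    """Index a keyword field: the whole value is stored verbatim as one term."""
    index = {}
    for doc_id, value in docs.items():
        index.setdefault(value, set()).add(doc_id)
    return index


def keyword_term_query(index, term):
    """term on a plain keyword field: exact, case-sensitive comparison."""
    return set(index.get(term, set()))


keyword_index = build_keyword_index({1: "CN", 2: "US"})
print(keyword_term_query(keyword_index, "CN"))  # hit: identical to the stored value
print(keyword_term_query(keyword_index, "cn"))  # no hit: case differs
```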

2. Use a normalizer

The normalizer property of keyword fields is similar to analyzer except that it guarantees that the analysis chain produces a single token.

That is, a normalizer behaves like an analyzer, except that it is guaranteed to emit exactly one token.

The normalizer is applied prior to indexing the keyword, as well as at search-time when the keyword field is searched via a query parser such as the match query or via a term level query such as the term query.

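So the same normalization runs before the keyword is indexed and again at search time, for both match and term queries. With a lowercasing normalizer, a term query for CN or cn therefore matches either way.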

Reference example:

{
  "settings": {
    "analysis": {
      "normalizer": {
        "my_normalizer": {
          "type": "custom",
          "filter": ["lowercase", "asciifolding"]
        }
      }
    }
  },
  "mappings": {
    "type": {
      "properties": {
        "foo": {
          "type": "keyword",
          "normalizer": "my_normalizer"
        }
      }
    }
  }
}
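
To see why the normalizer removes the case problem, here is a minimal sketch that applies the same normalization at index time and at query time. The normalize function only approximates the lowercase and asciifolding filters (it strips combining accents, whereas the real asciifolding filter covers more characters), and all names and sample data are hypothetical.

```python
import unicodedata


def normalize(value):
    """Approximate the lowercase + asciifolding filters; always yields one token."""
    lowered = value.lower()
    decomposed = unicodedata.normalize("NFKD", lowered)
    # Drop combining marks so that accented letters fold to their base letter.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def build_normalized_index(docs):
    """Index a keyword field with a normalizer: store the normalized value."""
    index = {}
    for doc_id, value in docs.items():
        index.setdefault(normalize(value), set()).add(doc_id)
    return index


def normalized_term_query(index, term):
    """term on a normalized keyword field: the query term is normalized too."""
    return set(index.get(normalize(term), set()))


normalized_index = build_normalized_index({1: "CN", 2: "US"})
for query in ("CN", "cn", "Cn"):
    # Each spelling is normalized to "cn" before the lookup.
    print(query, normalized_term_query(normalized_index, query))
```

The value is still treated as a single term, so queries stay exact apart from case and accents.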

Last edited: 2019.03.21 10:52:01