Elasticsearch Scripting

ES Script

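  1. Languages and ES versions:

    • Groovy: used in ES versions 1.4 to 5.0.
    • Painless: a simple language designed specifically for Elasticsearch, used for inline and stored scripts. Like Java, it has comments, keywords, types, variables, and functions. It is a secure scripting language and the default script language in ES.
    • Other script languages:
      1. expression: a very fast script language, even faster than native scripts. It supports a subset of JavaScript syntax (a single expression). Drawbacks: it can only access numeric, boolean, date, and geo_point fields, and stored fields are not available.
      2. mustache: provides templates for parameterized queries.
      3. java: Java (expert API).
  2. Painless syntax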

    • Basic syntax:

      "script": {
         "lang": "...",
         "source" | "id": "...",
         "params": { ...
         }
      }
      
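    • lang: defaults to painless. In practice it can be omitted unless another script language is used.

    • source / id: source holds an inline script; id refers to a stored script.

    • params: any named parameters, passed to the script as input variables.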
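The three keys above can be assembled programmatically. The following is a minimal Python sketch, not part of any Elasticsearch client: `build_script` is a hypothetical helper that builds the request body as a plain dictionary and enforces that exactly one of `source` or `id` is given. It assumes `lang` is only sent for inline scripts, since a stored script already carries its own language.

```python
import json


def build_script(source=None, script_id=None, params=None, lang="painless"):
    """Build a {"script": {...}} body; hypothetical helper for illustration."""
    # Exactly one of source (inline script) or script_id (stored script) is allowed.
    if (source is None) == (script_id is None):
        raise ValueError("provide exactly one of source or script_id")
    script = {}
    if source is not None:
        script["lang"] = lang
        script["source"] = source
    else:
        script["id"] = script_id
    if params:
        script["params"] = params
    return {"script": script}


inline_body = build_script(
    source="ctx._source.age = params.value", params={"value": 34}
)
stored_body = build_script(script_id="add_age", params={"age": 45})

print(json.dumps(inline_body, indent=2))
print(json.dumps(stored_body, indent=2))
```

The resulting dictionaries have the same shape as the update request bodies used in the rest of this article.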

  3. expression scripts

    GET product2/_search
    {
      "script_fields": {
        "test_field": {
          "script": {
            "lang":   "expression",
            "source": "doc['price']"
          }
        }
      }
    }
    
  1. Inline scripts

    • Create a simple document

      PUT twitter/_doc/1
      {
         "user": "双榆树-张三",
         "message": "今儿天气不错啊,出去转转去",
         "uid": 2,
         "age": 20,
         "city": "北京",
         "province": "北京",
         "country": "中国",
         "address": "中国北京市海淀区",
         "location": {
             "lat": "39.970718",
             "lon": "116.325747"
         }
      }    
      
    • Change age to 30

      POST /twitter/_update/1
      {
        "script": {
          "lang": "painless"
          , "source": """
            ctx._source.age=30  
          """
        }
      }
      # Response
      {
        "_index" : "twitter",
        "_type" : "_doc",
        "_id" : "1",
        "_version" : 2,
        "result" : "updated",
        "_shards" : {
          "total" : 2,
          "successful" : 1,
          "failed" : 0
        },
        "_seq_no" : 1,
        "_primary_term" : 1
      }    
      
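  • Concept: the triple-quoted "source" value holds Painless code. Code written directly in the request DSL like this is called an inline script.

  • Context: ctx is the context object; use ctx._source to access the document's _source.

  1. Script parameters

    • Script parameters: hard-coding the value in the inline script, as in "ctx._source.age=30", means that every new value produces a new script source that has to be compiled again. A compiled script is cached for later use, so a script whose source stays the same is compiled only once.

    • Improved script: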

      POST /twitter/_update/1
      {
        "script": {
          "lang": "painless"
          , "source": """
            ctx._source.age = params.value  
          """
          , "params": {
            "value":34
          }
        }
      }
      
      # Add a params object and reference each value in the script as params.<name>
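Why parameters help can be shown with a toy model. The sketch below is an assumption-laden illustration in Python, not Elasticsearch's real cache: `ScriptCache` is a hypothetical class that "compiles" a script the first time it sees a given (lang, source) pair and reuses it afterwards.

```python
class ScriptCache:
    """Toy model of a compiled-script cache keyed by (lang, source)."""

    def __init__(self):
        self.compiled = {}
        self.compilations = 0

    def get(self, lang, source):
        key = (lang, source)
        if key not in self.compiled:
            # Cache miss: this is where a real engine would compile the script.
            self.compilations += 1
            self.compiled[key] = source
        return self.compiled[key]


# Hard-coded values: every new age changes the source text.
hard_coded = ScriptCache()
for age in (30, 31, 32):
    hard_coded.get("painless", f"ctx._source.age = {age}")

# Parameterized: the source text is constant, only params change.
parameterized = ScriptCache()
for age in (30, 31, 32):
    parameterized.get("painless", "ctx._source.age = params.value")

print(hard_coded.compilations)     # 3
print(parameterized.compilations)  # 1
```

In this model, three hard-coded updates cost three compilations, while the parameterized form costs one.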
      
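  1. Stored scripts
    • Concept: a script can be stored in the cluster state and invoked by ID. It is available across the whole cluster and is recompiled only when it changes.

    • Settings:

      1. script.cache.expire: expiration time for cached scripts; no expiration by default.
      2. script.cache.max_size: maximum number of compiled scripts kept in the cache; default 100.
      3. script.max_size_in_bytes: maximum size of a script; default 65,535 bytes.
    • Code example:

      # Define a stored script with the ID add_age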
      POST /_scripts/add_age
      {
        "script":{
          "lang" : "painless"
          , "source": """
            ctx._source.age = params.age
          """
        }
      }
      # Invoke the stored script by ID
      POST /twitter/_update/1
      {
        "script": {
          "id": "add_age"
          , "params": {
            "age":45
          }
        }
      }
      
      
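  1. Accessing fields in the source
    • Overview: the syntax used to access field values in Painless depends on the context. ES has many different Painless contexts.
    • Painless contexts: these include ingest processor, update, update by query, sort, filter, and others.
    • Field access by context: ingest node scripts access fields with ctx.field_name; updates access fields with ctx._source.field_name.
    • updates: this covers update, reindex, and update by query.
  2. Painless script sample 1 (ctx.field_name)

    # Use a pipeline to create a field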
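The difference between the two access paths can be modeled with plain dictionaries. This is a hedged Python sketch, not Painless: `ingest_script` and `update_script` are hypothetical stand-ins for scripts running in the ingest and update contexts, and the `ctx` dictionaries only imitate the shape of the real context objects.

```python
def ingest_script(ctx, params):
    # Ingest context: document fields sit directly on ctx (ctx.field_name).
    ctx["field_c"] = (ctx["field_a"] + ctx["field_b"]) * params["value"]


def update_script(ctx, params):
    # Update context: document fields sit under ctx._source (ctx._source.field_name).
    ctx["_source"]["age"] = params["value"]


ingest_ctx = {"field_a": 10, "field_b": 24}
ingest_script(ingest_ctx, {"value": 2})

update_ctx = {"_source": {"age": 20}}
update_script(update_ctx, {"value": 34})

print(ingest_ctx["field_c"])         # 68
print(update_ctx["_source"]["age"])  # 34
```

The ingest result, (10 + 24) * 2 = 68, matches the field_c value produced by the add_field_c pipeline in sample 1.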
    PUT _ingest/pipeline/add_field_c
    {
      "processors": [
        {"script": {
          "lang": "painless"
          , "source": """
            ctx.field_c = (ctx.field_a + ctx.field_b) * params.value
          """
          , "params": {
              "value":2
          }
        }}
      ]
    }
    # Index a document through the pipeline
    PUT test_script/_doc/1?pipeline=add_field_c
    {
      "field_a":10,
      "field_b":24
    }
    # Response to the index request
    {
      "_index" : "test_script",
      "_type" : "_doc",
      "_id" : "1",
      "_version" : 1,
      "result" : "created",
      "_shards" : {
        "total" : 2,
        "successful" : 1,
        "failed" : 0
      },
      "_seq_no" : 0,
      "_primary_term" : 1
    }
    # Search for the document; the new field field_c has been added
    GET test_script/_search
    # Response
    {
      "took" : 0,
      "timed_out" : false,
      "_shards" : {
        "total" : 1,
        "successful" : 1,
        "skipped" : 0,
        "failed" : 0
      },
      "hits" : {
        "total" : {
          "value" : 1,
          "relation" : "eq"
        },
        "max_score" : 1.0,
        "hits" : [
          {
            "_index" : "test_script",
            "_type" : "_doc",
            "_id" : "1",
            "_score" : 1.0,
            "_source" : {
              "field_c" : 68,
              "field_a" : 10,
              "field_b" : 24
            }
          }
        ]
      }
    }    
        
    
  1. Painless script sample 2 (ctx.field_name)

    # Use a pipeline to check category; if it is empty, set it to None
    # Create the pipeline
    PUT _ingest/pipeline/blogs_pipeline
    {
      "processors": [
        {"script": {
          "lang": "painless"
          , "source": """
            if(ctx.category == ""){
              ctx.category = "None"
            }
          """
        }}
      ] 
    }
    # Index a document through the pipeline so that category is checked and fixed
    PUT test_script/_doc/2?pipeline=blogs_pipeline
    {
      "field_a":5,
      "field_b":10,
      "category":""
    }
    # Response
    {
      "_index" : "test_script",
      "_type" : "_doc",
      "_id" : "2",
      "_version" : 1,
      "result" : "created",
      "_shards" : {
        "total" : 2,
        "successful" : 1,
        "failed" : 0
      },
      "_seq_no" : 1,
      "_primary_term" : 1
    }
    # Get the document; the category field has been changed to "None"
    GET test_script/_doc/2
    {
      "_index" : "test_script",
      "_type" : "_doc",
      "_id" : "2",
      "_version" : 1,
      "_seq_no" : 1,
      "_primary_term" : 1,
      "found" : true,
      "_source" : {
        "field_a" : 5,
        "field_b" : 10,
        "category" : "None"
      }
    }    
    
  1. Painless script sample 3 (ctx._source.field_name)

    # Index a document
    PUT test_source/_doc/1
    {
        "counter": 1,
        "tags": ["red"]
    }
    
    # Add the color green to the tags of the document with ID 1
    POST /test_source/_update/1
    {
      "script": {
        "lang": "painless"
        , "source": """
          ctx._source.tags.add(params.color);
        """
        , "params": {
          "color":"green"
        }
      }
    }
    # Get the document
    GET test_source/_doc/1
    {
      "_index" : "test_source",
      "_type" : "_doc",
      "_id" : "1",
      "_version" : 2,
      "_seq_no" : 1,
      "_primary_term" : 1,
      "found" : true,
      "_source" : {
        "counter" : 1,
        "tags" : [
          "red",
          "green"
        ]
      }
    }
    
  1. Painless script sample 4 (ctx._source.field_name)

    # Remove an element from the list
    POST /test_source/_update/1
    {
      "script": {
        "lang": "painless"
        , "source": """
          if(ctx._source.tags.contains(params.color)){
            ctx._source.tags.remove(ctx._source.tags.indexOf(params.color))
          }
        """
        , "params": {
          "color":"red"
        }
      }
    }
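Samples 3 and 4 together add "green" and then remove "red". The Python sketch below mirrors that logic on a plain dictionary; `add_tag` and `remove_tag` are hypothetical names, and the Python list methods stand in for the Java list methods used in the Painless scripts. The membership check matters in both languages: in Java, indexOf returns -1 for a missing element and removing index -1 throws, while in Python `list.index` raises ValueError.

```python
def add_tag(source, color):
    # Counterpart of ctx._source.tags.add(params.color)
    source["tags"].append(color)


def remove_tag(source, color):
    tags = source["tags"]
    # Counterpart of the contains(...) guard in the Painless script
    if color in tags:
        # Counterpart of remove(indexOf(...)): remove by position
        tags.pop(tags.index(color))


doc = {"counter": 1, "tags": ["red"]}
add_tag(doc, "green")
remove_tag(doc, "red")
remove_tag(doc, "blue")  # not present, so nothing happens

print(doc["tags"])  # ['green']
```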
    
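  1. Reading document values in Painless scripts: doc['field'].value and params['_source']['field']

    • doc['field'] (recommended): loads the terms of the field into memory (cached), which makes execution faster but uses more memory. In addition, doc['field'] can only read simple value types, not complex types such as object and nested.

    • params['_source']['field']: the source has to be loaded and parsed on every use, so performance is lower.

  2. Simple Painless exercises: initialize the data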
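The trade-off can be pictured with a small model. The Python sketch below is an illustration under stated assumptions, not how Elasticsearch is implemented: `source_lookup` parses the whole JSON document on each access, while `doc_values_lookup` reads from a hypothetical per-field column. One real behavior it does reflect is that numeric doc values are kept sorted per document, so the order seen through doc['field'] can differ from the order in _source.

```python
import json

raw_source = '{"first": "johnny", "goals": [9, 27, 1]}'


def source_lookup(raw, field):
    # Modeled cost: the full JSON document is parsed on every access.
    return json.loads(raw)[field]


def doc_values_lookup(columns, doc_id, field):
    # Modeled cost: a direct read from a per-field column.
    return columns[field][doc_id]


# Hypothetical column store; numeric values are stored sorted per document.
columns = {"goals": {1: sorted(json.loads(raw_source)["goals"])}}

print(source_lookup(raw_source, "goals"))      # [9, 27, 1]
print(doc_values_lookup(columns, 1, "goals"))  # [1, 9, 27]
```

For order-independent work such as summing, both paths give the same answer; for position-dependent logic the difference matters.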

    PUT hockey/_bulk?refresh
    {"index":{"_id":1}}
    {"first":"johnny","last":"gaudreau","goals":[9,27,1],"assists":[17,46,0],"gp":[26,82,1],"born":"1993/08/13"}
    {"index":{"_id":2}}
    {"first":"sean","last":"monohan","goals":[7,54,26],"assists":[11,26,13],"gp":[26,82,82],"born":"1994/10/12"}
    {"index":{"_id":3}}
    {"first":"jiri","last":"hudler","goals":[5,34,36],"assists":[11,62,42],"gp":[24,80,79],"born":"1984/01/04"}
    {"index":{"_id":4}}
    {"first":"micheal","last":"frolik","goals":[4,6,15],"assists":[8,23,15],"gp":[26,82,82],"born":"1988/02/17"}
    {"index":{"_id":5}}
    {"first":"sam","last":"bennett","goals":[5,0,0],"assists":[8,1,0],"gp":[26,1,0],"born":"1996/06/20"}
    {"index":{"_id":6}}
    {"first":"dennis","last":"wideman","goals":[0,26,15],"assists":[11,30,24],"gp":[26,81,82],"born":"1983/03/20"}
    {"index":{"_id":7}}
    {"first":"david","last":"jones","goals":[7,19,5],"assists":[3,17,4],"gp":[26,45,34],"born":"1984/08/10"}
    {"index":{"_id":8}}
    {"first":"tj","last":"brodie","goals":[2,14,7],"assists":[8,42,30],"gp":[26,82,82],"born":"1990/06/07"}
    {"index":{"_id":39}}
    {"first":"mark","last":"giordano","goals":[6,30,15],"assists":[3,30,24],"gp":[26,60,63],"born":"1983/10/03"}
    {"index":{"_id":10}}
    {"first":"mikael","last":"backlund","goals":[3,15,13],"assists":[6,24,18],"gp":[26,82,82],"born":"1989/03/17"}
    {"index":{"_id":11}}
    {"first":"joe","last":"colborne","goals":[3,18,13],"assists":[6,20,24],"gp":[26,67,82],"born":"1990/01/30"}    
    
  1. Simple Painless exercise: compute each player's total goals

    GET hockey/_search
    {
      "script_fields": {
        "total_goals": {
          "script": {
            "lang": "painless"
            , "source": """
              int total = 0;
              for(int i=0; i< doc['goals'].length; i++){
                total += doc['goals'][i];
              }
              return total;
            """
          }
        }
      }
    }
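To check what the script field should return, the same loop can be run offline. This Python sketch copies the goals arrays of the first three players from the bulk data above; `total_goals` is a hypothetical name for the equivalent of the Painless loop.

```python
# goals arrays copied from the hockey bulk data (documents 1 to 3)
players = {
    "gaudreau": [9, 27, 1],
    "monohan": [7, 54, 26],
    "hudler": [5, 34, 36],
}


def total_goals(goals):
    # Same accumulation as the Painless for-loop over doc['goals']
    total = 0
    for g in goals:
        total += g
    return total


totals = {name: total_goals(goals) for name, goals in players.items()}
print(totals)  # {'gaudreau': 37, 'monohan': 87, 'hudler': 75}
```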
    
  1. Simple Painless exercise: add a new field nick with the value hockey

    POST hockey/_update/1
    {
        "script": {
            "lang": "painless",
            "source": """
            ctx._source.last = params.last;
            ctx._source.nick = params.nick 
            """,
            "params": {
                "last": "gaudreau",
                "nick": "hockey"
            }
        }
    }    
    
  2. Simple Painless exercise: working with date fields

    GET hockey/_search
    {
      "script_fields": {
        "birth_year": {
          "script": {
            "source": "doc.born.value.year"
          }
        }
        ,"birth_month": {
          "script": {
            "source": "doc.born.value.monthValue"
          }
        }
      }
    }
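The expected values of the two script fields can be derived offline from the born strings in the bulk data. This is a small Python sketch; `birth_parts` is a hypothetical helper, and the format string assumes the yyyy/MM/dd layout used in the sample documents.

```python
from datetime import datetime


def birth_parts(born):
    """Return (year, month) for a date string such as "1993/08/13"."""
    parsed = datetime.strptime(born, "%Y/%m/%d")
    return parsed.year, parsed.month


print(birth_parts("1993/08/13"))  # (1993, 8)
print(birth_parts("1984/01/04"))  # (1984, 1)
```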
    
  1. Simple Painless exercise: count the male suspects

    • Initialize the data

      PUT test_index/_bulk?refresh
      {"index":{"_id":1}}
      {"ajbh": "12345","ajmc": "立案案件","lasj": "2020/05/21 13:25:23","jsbax_sjjh2_xz_ryjbxx_cleaning": [{"XM": "张三","NL": "30","SF": "男"},{"XM": "李四","NL": "31","SF": "男"},{"XM": "王五","NL": "30","SF": "女"},{"XM": "赵六","NL": "23","SF": "男"}]}
      {"index":{"_id":2}}
      {"ajbh": "563245","ajmc": "结案案件","lasj": "2020/05/21 13:25:23","jsbax_sjjh2_xz_ryjbxx_cleaning": [{"XM": "张三2","NL": "30","SF": "男"},{"XM": "李四2","NL": "31","SF": "男"},{"XM": "王五2","NL": "30","SF": "女"},{"XM": "赵六2","NL": "23","SF": "女"}]}
      {"index":{"_id":3}}
      {"ajbh": "12345","ajmc": "立案案件","lasj": "2020/05/21 13:25:23","jsbax_sjjh2_xz_ryjbxx_cleaning": [{"XM": "张三3","NL": "30","SF": "男"},{"XM": "李四3","NL": "31","SF": "男"},{"XM": "王五3","NL": "30","SF": "女"},{"XM": "赵六3","NL": "23","SF": "男"}]}
      
    • Count the male suspects

      GET test_index/_search
      {
        "script_fields": {
          "total": {
            "script": {
              "lang": "painless"
              , "source": """
                int total = 0;
                for(int i=0; i< params['_source']['jsbax_sjjh2_xz_ryjbxx_cleaning'].length; i++){
                  if(params['_source']['jsbax_sjjh2_xz_ryjbxx_cleaning'][i]['SF'] == '男'){
                    total++;
                  }
                }
                return total;
              """
            }
          }
        }
        , "aggs": {
          "total": {
            "sum": {
              "script": {
                "lang": "painless"
                , "source": """
                    int total = 0;
                    for(int i=0; i< params['_source']['jsbax_sjjh2_xz_ryjbxx_cleaning'].length; i++){
                      if(params['_source']['jsbax_sjjh2_xz_ryjbxx_cleaning'][i]['SF'] == '男'){
                        total++;
                      }
                    }
                    return total;
                """
              }
            }
          }
        }
      }
      
      # Response
      {
        "took" : 6,
        "timed_out" : false,
        "_shards" : {
          "total" : 1,
          "successful" : 1,
          "skipped" : 0,
          "failed" : 0
        },
        "hits" : {
          "total" : {
            "value" : 3,
            "relation" : "eq"
          },
          "max_score" : 1.0,
          "hits" : [
            {
              "_index" : "test_index",
              "_type" : "_doc",
              "_id" : "1",
              "_score" : 1.0,
              "fields" : {
                "total" : [
                  3
                ]
              }
            },
            {
              "_index" : "test_index",
              "_type" : "_doc",
              "_id" : "2",
              "_score" : 1.0,
              "fields" : {
                "total" : [
                  2
                ]
              }
            },
            {
              "_index" : "test_index",
              "_type" : "_doc",
              "_id" : "3",
              "_score" : 1.0,
              "fields" : {
                "total" : [
                  3
                ]
              }
            }
          ]
        },
        "aggregations" : {
          "total" : {
            "value" : 8.0
          }
        }
      }
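The per-document counts and the aggregated sum in the response can be reproduced offline. The Python sketch below keeps only the SF (gender) values from the three bulk documents; `count_male` is a hypothetical name for the equivalent of the Painless loop.

```python
FIELD = "jsbax_sjjh2_xz_ryjbxx_cleaning"

# SF values copied from the three documents in the bulk request
docs = [
    {FIELD: [{"SF": "男"}, {"SF": "男"}, {"SF": "女"}, {"SF": "男"}]},
    {FIELD: [{"SF": "男"}, {"SF": "男"}, {"SF": "女"}, {"SF": "女"}]},
    {FIELD: [{"SF": "男"}, {"SF": "男"}, {"SF": "女"}, {"SF": "男"}]},
]


def count_male(source):
    # Same logic as the script: walk the nested array and count SF == '男'
    total = 0
    for person in source[FIELD]:
        if person["SF"] == "男":
            total += 1
    return total


per_doc = [count_male(d) for d in docs]
print(per_doc)       # [3, 2, 3]
print(sum(per_doc))  # 8
```

The list matches the three script_fields values in the response, and the sum matches the aggregation value of 8.0.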
      
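  • Shorthand: when only the source is needed, the script can be written either as an object with a "source" key or as a plain string. The two requests below are equivalent.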
POST /product2/_update/4
{
  "script": {
    "source": "ctx._source.price -=1"
  }
}

POST /product2/_update/4
{
  "script": "ctx._source.price -=1"
}

DELETE test_index



