Elasticsearch

1. How Chinese Tokenizers Work

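Single-character segmentation: splits Chinese text one character at a time. For example, "我们是中国人" becomes "我", "们", "是", "中", "国", "人". StandardAnalyzer works this way.
Bigram segmentation: splits the text into overlapping pairs of characters. For example, "我们是中国人" becomes "我们", "们是", "是中", "中国", "国人". CJKAnalyzer works this way.
Dictionary-based segmentation: builds candidate words with some algorithm and matches them against a prepared dictionary; every candidate found in the dictionary is emitted as a word. This is generally regarded as the best approach to Chinese segmentation. For example, "我们是中国人" becomes "我们", "中国人". MMAnalyzer (极易分词), Paoding (庖丁分词) and IKAnalyzer all belong to this family.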
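
To make the three strategies concrete, here is a minimal Java sketch. It is a toy illustration, not the code of StandardAnalyzer, CJKAnalyzer or IKAnalyzer: the class name SegmentationSketch, the three-word dictionary and the forward-maximum-matching rule are all assumptions chosen for the example. Because the sketch does no filtering, it keeps "是" as a single-character token in the dictionary case.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Toy versions of the three segmentation strategies (hypothetical helper, not a Lucene API). */
final class SegmentationSketch {

    /** Single-character segmentation: every character becomes a token. */
    static List<String> unigrams(String text) {
        List<String> tokens = new ArrayList<>();
        for (int i = 0; i < text.length(); i++) {
            tokens.add(text.substring(i, i + 1));
        }
        return tokens;
    }

    /** Bigram segmentation: every pair of adjacent characters becomes a token. */
    static List<String> bigrams(String text) {
        List<String> tokens = new ArrayList<>();
        for (int i = 0; i + 2 <= text.length(); i++) {
            tokens.add(text.substring(i, i + 2));
        }
        return tokens;
    }

    /** Forward maximum matching: take the longest dictionary word at each position, else one character. */
    static List<String> forwardMaxMatch(String text, Set<String> dictionary, int maxWordLength) {
        List<String> tokens = new ArrayList<>();
        int pos = 0;
        while (pos < text.length()) {
            int end = Math.min(text.length(), pos + maxWordLength);
            // Shrink the candidate until it is a dictionary word or a single character.
            while (end > pos + 1 && !dictionary.contains(text.substring(pos, end))) {
                end--;
            }
            tokens.add(text.substring(pos, end));
            pos = end;
        }
        return tokens;
    }

    public static void main(String[] args) {
        String text = "我们是中国人";
        // Assumed toy dictionary; a real tokenizer ships with a far larger one.
        Set<String> dictionary = new HashSet<>(Arrays.asList("我们", "中国", "中国人"));
        System.out.println(unigrams(text));
        System.out.println(bigrams(text));
        System.out.println(forwardMaxMatch(text, dictionary, 3));
    }
}
```

The dictionary variant produces fewer and more meaningful tokens than the other two, at the cost of depending on how complete the dictionary is.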

Commonly used Chinese tokenizer: IKAnalyzer

2. The IK Tokenizer

IK has two tokenization modes: ik_max_word and ik_smart.

What is the difference between ik_max_word and ik_smart?

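ik_max_word: splits the text at the finest granularity, exhausting the possible combinations. For example, "中华人民共和国国歌" is split into "中华人民共和国, 中华人民, 中华, 华人, 人民共和国, 人民, 人, 民, 共和国, 共和, 和国, 国歌".
ik_smart: splits the text at the coarsest granularity. For example, "中华人民共和国国歌" is split into "中华人民, 共和国, 国歌".

At index time, ik_max_word is usually used, so that the fine-grained tokens widen what the index can match. At search time, ik_smart is used, so that the coarse-grained tokens make the matches more precise.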
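
One way to compare the two modes on a running node is the _analyze endpoint of Elasticsearch. The sketch below uses only the JDK's HttpURLConnection; it assumes a node at http://localhost:9200 with the IK plugin already installed, and the class name AnalyzeSketch and its helper methods are made up for this example. The JSON escaping is deliberately minimal and only handles quotes and backslashes.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

/** Sends the same text through two analyzers via the _analyze endpoint (hypothetical helper). */
final class AnalyzeSketch {

    /** Wraps raw text in double quotes, escaping backslashes and quotes only. */
    static String jsonString(String raw) {
        return "\"" + raw.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    /** Builds the body of an _analyze request for the given analyzer and text. */
    static String analyzeBody(String analyzer, String text) {
        return "{\"analyzer\":" + jsonString(analyzer) + ",\"text\":" + jsonString(text) + "}";
    }

    /** POSTs the body to baseUrl/_analyze and returns the raw JSON response. */
    static String analyze(String baseUrl, String analyzer, String text) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) new URL(baseUrl + "/_analyze").openConnection();
        conn.setConnectTimeout(2000);
        conn.setReadTimeout(5000);
        conn.setRequestMethod("POST");
        conn.setRequestProperty("Content-Type", "application/json");
        conn.setDoOutput(true);
        try (OutputStream out = conn.getOutputStream()) {
            out.write(analyzeBody(analyzer, text).getBytes(StandardCharsets.UTF_8));
        }
        try (InputStream in = conn.getInputStream()) {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            byte[] chunk = new byte[4096];
            int read;
            while ((read = in.read(chunk)) != -1) {
                buffer.write(chunk, 0, read);
            }
            return new String(buffer.toByteArray(), StandardCharsets.UTF_8);
        } finally {
            conn.disconnect();
        }
    }

    public static void main(String[] args) {
        String baseUrl = "http://localhost:9200"; // assumed address of a node with IK installed
        String text = "中华人民共和国国歌";
        try {
            System.out.println(analyze(baseUrl, "ik_max_word", text));
            System.out.println(analyze(baseUrl, "ik_smart", text));
        } catch (IOException e) {
            System.err.println("Could not query Elasticsearch: " + e.getMessage());
        }
    }
}
```

Each response lists the tokens the analyzer produced, so the finer and coarser splits can be compared side by side.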

3. Installing Elasticsearch

For security reasons, Elasticsearch refuses to run as the root account by default.

Create a user and set its password

[root@localhost ~]# useradd es
[root@localhost ~]# passwd es
Changing password for user es.
New password: 
Retype new password:
[root@localhost ~]# chmod 777 /usr/local    # grant read, write and execute permission on /usr/local (to all users, including es)
[root@localhost ~]# su - es
[es@localhost ~]$

Check the JDK version (JDK 1.8 or later is required)

[es@localhost ~]$ java -version
openjdk version "1.8.0_222-ea"
OpenJDK Runtime Environment (build 1.8.0_222-ea-b03)
OpenJDK 64-Bit Server VM (build 25.222-b03, mixed mode)

Upload the ES archive to /usr/local and extract it

[es@localhost local]$ tar -zxvf elasticsearch-7.6.1-linux-x86_64.tar.gz

Inspect the configuration files

[es@localhost local]$ cd elasticsearch-7.6.1/config/
[es@localhost config]$ ls
elasticsearch.yml jvm.options log4j2.properties role_mapping.yml
roles.yml users users_roles

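Edit jvm.options

Elasticsearch is built on Lucene, and Lucene is implemented in Java, so the JVM parameters need to be configured. The main ones are the initial and maximum heap size, which should be set to the same value and sized to the memory available on the host.

[es@localhost config]$ vim jvm.options
# Default settings: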
# Xms represents the initial size of total heap space
# Xmx represents the maximum size of total heap space
-Xms1g
-Xmx1g

Edit elasticsearch.yml

Set the cluster and node information

# ---------------------------------- Cluster -----------------------------------
cluster.name: my-application
# ------------------------------------ Node ------------------------------------
node.name: node-1
# --------------------------------- Discovery ----------------------------------
cluster.initial_master_nodes: ["node-1"]

Set the storage paths for data files and log files (create the directories if they do not exist)

[root@localhost config]# vim elasticsearch.yml
# ---------------------------- Paths ------------------------------
path.data: /usr/local/elasticsearch-7.6.1/data
path.logs: /usr/local/elasticsearch-7.6.1/logs

Set the bind address

# ---------------------------------- Network -----------------------------------
# By default only the local machine can connect; 0.0.0.0 allows remote access
network.host: 0.0.0.0

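Notes on the configuration

This is a single-node installation. To build a cluster, add the information for the other nodes to this same configuration file.

Other configurable settings in elasticsearch.yml:

[Image: other elasticsearch.yml settings]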

Go to the elasticsearch/bin directory and start ES

[es@localhost elasticsearch-7.6.1]$ cd /usr/local/elasticsearch-7.6.1/bin
[es@localhost bin]$ ./elasticsearch

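If startup fails because the limits on open files or threads are too low, raise them as root by adding the following lines to /etc/security/limits.conf, then log in as es again:

* soft nofile 65536
* hard nofile 131072
* soft nproc 4096
* hard nproc 4096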

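4. Kibana

Kibana is a Node.js-based tool for exploring and charting the data held in Elasticsearch indices. It uses Elasticsearch aggregations to generate charts such as bar charts, line charts and pie charts. It also provides a console for working with Elasticsearch index data, with API completion hints, which is very helpful when learning Elasticsearch syntax.

4.1 Installation

The Kibana version must match the Elasticsearch version, so 7.6.1 is used here as well. Extract it to the directory of your choice: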

tar -zxvf kibana-7.6.1-linux-x86_64.tar.gz

4.2 Configuration

Go to the config directory under the installation directory and edit kibana.yml:

server.port: 5601
server.host: "0.0.0.0"

4.3 Running

Go to the bin directory under the installation directory and start Kibana, which listens on port 5601:

./kibana

Open http://IP:5601 in a browser (IP being the server's address) to reach the Kibana interface

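5. Installing the IK Tokenizer

The IK tokenizer archive is the same for Windows and Linux, so it can be unzipped locally and then uploaded into its own subdirectory under elasticsearch-7.6.1/plugins (for example plugins/ik). The IK version must match the ES version, 7.6.1 here.

Then restart the ES service.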

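6. Basic ES Operations

ES can be accessed over the web (HTTP), but requests must follow RESTful conventions.

ES logical structure

Database: data is stored in tables, and tables are created in a database.
ES: documents are stored in types, and types are created in an index.
- index --- comparable to a database (index names must be lowercase and must not contain special characters)
- type --- comparable to a table (before ES 6 an index could hold several types; ES 6 allows one type per index, and ES 7 deprecates types in favor of the fixed name _doc)
- document --- comparable to one row in a table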

RESTful API

Method   REST request                                    Description
PUT      http://eshost:9200/index1                       Create an index
POST     http://eshost:9200/{index}/{type}/{id}          Add a document
POST     http://eshost:9200/{index}/{type}/{id}/_update  Update a document
DELETE   http://eshost:9200/{index}/{type}/{id}          Delete a document by ID
GET      http://eshost:9200/{index}/{type}/{id}          Get a document by ID
POST     http://eshost:9200/{index}/{type}/_search       Search all data under the index
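
Since the installation above uses ES 7.6.1, where types are deprecated, the same operations are normally written without a custom type name. The sketch below spells out the typeless request lines as plain strings; the class EsRestPaths and its method names are hypothetical and exist only to make the paths explicit, and index1 and the ID 1 are sample values.

```java
/** Typeless ES 7 request lines for the operations in the table (hypothetical helper). */
final class EsRestPaths {

    static String createIndex(String index) {
        return "PUT /" + index;
    }

    static String addDocument(String index, String id) {
        return "POST /" + index + "/_doc/" + id;
    }

    static String updateDocument(String index, String id) {
        return "POST /" + index + "/_update/" + id;
    }

    static String deleteDocument(String index, String id) {
        return "DELETE /" + index + "/_doc/" + id;
    }

    static String getDocument(String index, String id) {
        return "GET /" + index + "/_doc/" + id;
    }

    static String search(String index) {
        return "POST /" + index + "/_search";
    }

    public static void main(String[] args) {
        // Sample values; each line can be pasted into the Kibana console.
        System.out.println(createIndex("index1"));
        System.out.println(addDocument("index1", "1"));
        System.out.println(updateDocument("index1", "1"));
        System.out.println(getDocument("index1", "1"));
        System.out.println(search("index1"));
        System.out.println(deleteDocument("index1", "1"));
    }
}
```

The add and update requests also need a JSON body, which is omitted here; an update body wraps the changed fields in a "doc" object.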

7. Integrating ES with Spring Boot

7.1 Add the ES dependency

<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-data-elasticsearch</artifactId>
</dependency>

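7.2 Configure the bean

Spring Boot can provide a RestHighLevelClient instance through auto-configuration, in which case only the ES server address needs to be configured. To define the client yourself instead, declare it as a bean in a @Configuration class: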

// Declared inside a @Configuration class; the host is the address of the ES server.
@Bean
public RestHighLevelClient getRestHighLevelClient(){
    HttpHost httpHost = new HttpHost("101.168.179", 9200, "http");
    return new RestHighLevelClient(RestClient.builder(httpHost));
}

To use the auto-configured client, set the ES server address in the Spring Boot application configuration:

spring:
    elasticsearch:
        rest:
            uris: http://101.168.179:9200

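8. Product Search with ES

In the product-adding feature of the platform management system, whenever a merchant adds a product to the product table and lists it, the same product is added to ES; whenever a merchant delists a product, it is deleted from ES.

If ES was not used early on while the data volume was small, then once the data has grown and ES is introduced, the existing database records must be imported into ES (the import must be completed before the project is deployed to production).

8.1 Create the index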

CreateIndexRequest createIndexRequest = new CreateIndexRequest("mallproductsindex");
CreateIndexResponse createIndexResponse = restHighLevelClient.indices().create(createIndexRequest, RequestOptions.DEFAULT);
System.out.println(createIndexResponse.isAcknowledged());
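
The request above creates the index without a mapping, so text fields fall back to the default analyzer and the IK modes described earlier are not used. The sketch below builds a mapping that analyzes the two searchable fields with ik_max_word at index time and ik_smart at search time. The class ProductMappingSketch is hypothetical and only assembles the JSON. In the 7.x high-level client the JSON should be attachable with createIndexRequest.mapping(json, XContentType.JSON) before calling create, but verify that call against your client version.

```java
/** Builds a mapping that applies the IK analyzers to the searchable fields (hypothetical helper). */
final class ProductMappingSketch {

    /** One text field: fine-grained analysis when indexing, coarse-grained when searching. */
    static String ikTextField(String name) {
        return "\"" + name + "\":{"
                + "\"type\":\"text\","
                + "\"analyzer\":\"ik_max_word\","
                + "\"search_analyzer\":\"ik_smart\"}";
    }

    /** Mapping for the two fields queried by the product search. */
    static String productMapping() {
        return "{\"properties\":{"
                + ikTextField("productName") + ","
                + ikTextField("productSkuName")
                + "}}";
    }

    public static void main(String[] args) {
        System.out.println(productMapping());
    }
}
```

Fields not listed in the mapping, such as productImg or soldNum, are left to dynamic mapping in this sketch.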

8.2 Import product data into ES

8.2.1 Define the object stored in ES

@Data
@AllArgsConstructor
@NoArgsConstructor
public class Product4ES {
    private String productId;
    private String productName;
    private String productImg;
    private int soldNum;
    private String productSkuName;
    private double productSkuPrice;
}

8.2.2 Query all products and import them

Inject the RestHighLevelClient instance (along with the productMapper and objectMapper used below)

public void testImportData(){
    // 1. Load the products from the database
    List<ProductVO> productVOS = productMapper.selectProductVOS();
    System.out.println(productVOS.size());
    // 2. Write each product to ES
    for(ProductVO p : productVOS){
        String productId = p.getProductId();
        String productName = p.getProductName();
        Integer soldNum = p.getSoldNum();

        // Use the first SKU, if there is one, for the SKU name, image and price
        List<ProductSku> productSku = p.getSkus();

        String skuName = productSku.isEmpty() ? "" : productSku.get(0).getSkuName();
        String skuImg = productSku.isEmpty() ? "" : productSku.get(0).getSkuImg();
        Integer sellPrice = productSku.isEmpty() ? 0 : productSku.get(0).getSellPrice();
        // Build the object stored in ES
        Product4ES product4ES = new Product4ES(productId, productName, skuImg, soldNum, skuName, sellPrice);

        // Index the document, using the product ID as the document ID
        try {
            IndexRequest indexRequest = new IndexRequest("mallproductsindex");
            indexRequest.id(productId);
            indexRequest.source(objectMapper.writeValueAsString(product4ES), XContentType.JSON);
            restHighLevelClient.index(indexRequest, RequestOptions.DEFAULT);
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}

8.3 Search for products in ES

Modify ProductServiceImpl:

public ResultVO searchProduct(String kw, int pageNum, int limit) {

    try {
        // 1. Build the search request
        int start = (pageNum-1)*limit;
        SearchRequest searchRequest = new SearchRequest("mallproductsindex");
        // Query condition: match the keyword against both name fields
        SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
        searchSourceBuilder.query(QueryBuilders.multiMatchQuery(kw,"productName","productSkuName"));
        // Paging
        searchSourceBuilder.from(start);
        searchSourceBuilder.size(limit);
        // Highlighting
        HighlightBuilder highlightBuilder = new HighlightBuilder();
        HighlightBuilder.Field field1 = new HighlightBuilder.Field("productName");
        HighlightBuilder.Field field2 = new HighlightBuilder.Field("productSkuName");
        highlightBuilder.field(field1);
        highlightBuilder.field(field2);
        highlightBuilder.preTags("<label style='color:red'>");
        highlightBuilder.postTags("</label>");
        searchSourceBuilder.highlighter(highlightBuilder);
        searchRequest.source(searchSourceBuilder);
        // Execute the search
        SearchResponse searchResponse = restHighLevelClient.search(searchRequest, RequestOptions.DEFAULT);

        // 2. Read the total number of hits
        SearchHits hits = searchResponse.getHits();
        int count = (int)(hits.getTotalHits().value);

        // 3. Compute the total number of pages
        int pageCount = count%limit==0 ? count/limit : count/limit+1;

        Iterator<SearchHit> iterator = hits.iterator();
        List<Product4ES> products = new ArrayList<>();
        while(iterator.hasNext()){
            SearchHit searchHit = iterator.next();
            Product4ES product4ES = objectMapper.readValue(searchHit.getSourceAsString(), Product4ES.class);
            // Replace the product name with its highlighted version, if there is one
            Map<String, HighlightField> highlightFields = searchHit.getHighlightFields();
            HighlightField highlightField1 = highlightFields.get("productName");
            if(highlightField1!=null){
                // Concatenate the highlighted fragments (org.elasticsearch.common.text.Text) into one string
                StringBuilder highLightProductName = new StringBuilder();
                for(Text fragment : highlightField1.fragments()){
                    highLightProductName.append(fragment.string());
                }
                product4ES.setProductName(highLightProductName.toString());
            }
            products.add(product4ES);
        }

        // 4. Wrap the page of results and return it
        PageHelper<Product4ES> productVOPageHelper = new PageHelper<>(count,pageCount,products);
        ResultVO resultVO = new ResultVO(ResStatus.OK.getCode(), "success", productVOPageHelper);

        return resultVO;

    } catch (IOException e) {
        e.printStackTrace();
    }
    return null;
}
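
The paging arithmetic in searchProduct can be checked in isolation. The sketch below restates it as two small functions; the class PaginationSketch and the input validation are additions for illustration and are not part of the service code above.

```java
/** Paging arithmetic used by the search: result offset and total page count (hypothetical helper). */
final class PaginationSketch {

    /** Offset of the first hit on a 1-based page, i.e. the value passed to from(). */
    static int offset(int pageNum, int limit) {
        if (pageNum < 1 || limit < 1) {
            throw new IllegalArgumentException("pageNum and limit must be at least 1");
        }
        return (pageNum - 1) * limit;
    }

    /** Number of pages needed for total hits, rounding up. */
    static int pageCount(int total, int limit) {
        if (total < 0 || limit < 1) {
            throw new IllegalArgumentException("total must be >= 0 and limit at least 1");
        }
        return (total + limit - 1) / limit;
    }

    public static void main(String[] args) {
        // Page 3 with 10 hits per page starts at offset 20.
        System.out.println(offset(3, 10));
        // 21 hits at 10 per page need 3 pages.
        System.out.println(pageCount(21, 10));
    }
}
```

The rounding-up form (total + limit - 1) / limit gives the same result as the conditional expression used in searchProduct.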