Elasticsearch

1、Basic retrieval

1.1>、_cat

GET /_cat/nodes    List all nodes
GET /_cat/health   Show cluster health
GET /_cat/master   Show the master node
GET /_cat/indices  List all indices, comparable to MySQL's show databases

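1.2>、Index a document (save)

To save a document, specify which index and which type it is stored under, and which unique identifier it uses.

PUT customer/external/1 saves the document with id 1 under the external type of the customer index: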

PUT customer/external/1
{
  "name":"Tom"
}

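Both PUT and POST work here.
POST creates a document. If no id is given, one is generated automatically. If an id is given, the document with that id is created or, if it already exists, overwritten, and its version number is incremented.
PUT can create or modify, but it must be given an id; omitting the id is an error. Because the id is required, PUT is generally used for modifications.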

1.3>、Retrieving a document

GET customer/external/1

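Result:
{
  "_index" : "customer",     // which index the document is in
  "_type" : "external",      // which type the document is in
  "_id" : "1",               // document id
  "_version" : 1,            // version number
  "_seq_no" : 0,             // concurrency-control field, incremented on every write; used for optimistic locking
  "_primary_term" : 1,       // also used for concurrency control; changes when the primary shard is reassigned, e.g. after a restart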
  "found" : true,
  "_source" : {
    "name" : "Tom"
  }
}

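To update with optimistic locking, append ?if_seq_no=0&if_primary_term=1 to the request; the write is applied only if the document's current _seq_no and _primary_term still match.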
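
The check behind if_seq_no and if_primary_term can be pictured with a small in-memory model. This is only a sketch: ToyDocument and VersionConflict are made-up names, nothing here talks to Elasticsearch, and the real _seq_no counts operations on the whole shard rather than on a single document.

```python
class VersionConflict(Exception):
    """Raised when the caller's view of the document is stale."""


class ToyDocument:
    """In-memory stand-in for one stored document (not Elasticsearch itself)."""

    def __init__(self, source):
        self.source = source
        self.seq_no = 0
        self.primary_term = 1

    def put(self, source, if_seq_no, if_primary_term):
        # Apply the write only if the caller saw the current state.
        if (if_seq_no, if_primary_term) != (self.seq_no, self.primary_term):
            raise VersionConflict(
                f"expected seq_no {if_seq_no}, current is {self.seq_no}"
            )
        self.source = source
        self.seq_no += 1
        return self.seq_no


doc = ToyDocument({"name": "Tom"})
# Two writers both read the document while it is at seq_no 0, primary_term 1.
seen_seq_no, seen_term = doc.seq_no, doc.primary_term

doc.put({"name": "John"}, seen_seq_no, seen_term)       # first writer succeeds
try:
    doc.put({"name": "Jerry"}, seen_seq_no, seen_term)  # second writer is stale
    outcome = "written"
except VersionConflict:
    outcome = "conflict"

print(outcome, doc.source)
```

Against a real cluster the stale write is rejected with HTTP 409 (a version conflict); the usual reaction is to read the document again and retry with the fresh values.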

1.4>、Modifying data with PUT and POST

POST customer/external/1/_update
{
  "doc":{
    "name":"John"
  }
}
or
POST customer/external/1
{
  "name":"Jerry"
}
or
PUT customer/external/1
{
  "name":"Jerry2"
}
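  • Differences: POST with _update compares the submitted fields with the stored source document. If nothing changed, the request is a no-op and the document version is not incremented. PUT, and POST without _update, always save the document again and increment the version.
    Which one to use depends on the workload:
    for heavy concurrent updates, omit _update;
    for heavy concurrent reads with only occasional updates, use _update, so that unchanged documents are detected and not rewritten.
  • Updating a document and adding a field at the same time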
POST customer/external/1/_update
{
 "doc":{
   "name":"Haha",
   "age":20
 }
}
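
The difference between the three requests above can be modelled in a few lines. A toy sketch only: ToyDoc is a made-up class, the merge is shallow (the real _update merges nested objects recursively), and no request is sent.

```python
class ToyDoc:
    """Toy model of one stored document; a simplification, not Elasticsearch."""

    def __init__(self, source):
        self.source = dict(source)
        self.version = 1

    def index(self, source):
        # PUT, or POST without _update: always replace the whole source.
        self.source = dict(source)
        self.version += 1
        return "updated"

    def update(self, doc):
        # POST .../_update: merge the partial doc, skip the write if nothing changes.
        merged = {**self.source, **doc}
        if merged == self.source:
            return "noop"
        self.source = merged
        self.version += 1
        return "updated"


d = ToyDoc({"name": "Tom"})
r1 = d.update({"name": "John"})               # changes the name
r2 = d.update({"name": "John"})               # same data again
v_after_noop = d.version
r3 = d.index({"name": "John"})                # full re-save of identical data
r4 = d.update({"name": "Haha", "age": 20})    # changes name and adds age

print(r1, r2, r3, r4, d.version, d.source)
```

The second _update leaves the version untouched, while the full re-save bumps it even though the data is identical.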

1.5>、Deleting a document and an index

DELETE customer/external/1
DELETE customer

1.6>、bulk API

POST customer/external/_bulk
{"index":{"_id":"1"}}
{"name":"John Doe"}
{"index":{"_id":"2"}}
{"name":"Tom"}
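
The bulk body is newline-delimited JSON: one action line, then one source line, per document. A minimal sketch of generating it with the standard library; bulk_body is a hypothetical helper, and sending the body is left out.

```python
import json


def bulk_body(docs):
    """Build an NDJSON _bulk body from (id, source) pairs (hypothetical helper)."""
    lines = []
    for doc_id, source in docs:
        lines.append(json.dumps({"index": {"_id": str(doc_id)}}))  # action line
        lines.append(json.dumps(source))                           # source line
    # The bulk API requires the body to end with a newline.
    return "\n".join(lines) + "\n"


body = bulk_body([(1, {"name": "John Doe"}), (2, {"name": "Tom"})])
print(body, end="")
```

Such a body is usually sent with the header Content-Type: application/x-ndjson.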

https://github.com/elastic/elasticsearch/blob/master/docs/src/test/resources/accounts.json

POST /bank/account/_bulk
{"index":{"_id":"1"}}
{"account_number":1,"balance":39225,"firstname":"Amber","lastname":"Duke","age":32,"gender":"M","address":"880 Holmes Lane","employer":"Pyrami","email":"amberduke@pyrami.com","city":"Brogan","state":"IL"}
{"index":{"_id":"6"}}
{"account_number":6,"balance":5686,"firstname":"Hattie","lastname":"Bond","age":36,"gender":"M","address":"671 Bristol Street","employer":"Netagy","email":"hattiebond@netagy.com","city":"Dante","state":"TN"}
{"index":{"_id":"13"}}
......
......
......
{"index":{"_id":"990"}}
{"account_number":990,"balance":44456,"firstname":"Kelly","lastname":"Steele","age":35,"gender":"M","address":"809 Hoyt Street","employer":"Eschoir","email":"kellysteele@eschoir.com","city":"Stewartville","state":"ID"}
{"index":{"_id":"995"}}
{"account_number":995,"balance":21153,"firstname":"Phelps","lastname":"Parrish","age":25,"gender":"M","address":"666 Miller Place","employer":"Pearlessa","email":"phelpsparrish@pearlessa.com","city":"Brecon","state":"ME"}

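2、Advanced retrieval

2.1>、Search API

2.1.1)、Retrieving information
  • Every search starts from _search
GET bank/_search                                     # retrieve everything under bank, across all types and documents
GET bank/_search?q=*&sort=account_number:asc         # search using URL request parameters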

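Explanation of the response:
took - time Elasticsearch took to execute the search, in milliseconds
timed_out - whether the search timed out
_shards - how many shards were searched, with counts of successful and failed shards
hits - the search results
hits.total - total number of documents matching the query
hits.hits - the actual array of results (the first 10 documents by default)
sort - the sort key of each result (absent when sorting by score)
_score and max_score - relevance score of each hit and the highest score (used for full-text search)
  • Search with URL plus request body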
GET /bank/_search
{
  "query": {
    "match_all": {}
  },
  "sort": [
    {
      "account_number": {
        "order": "asc"
      }
    },
    {
      "balance": "desc"
    }
  ]
}
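
Reading the response fields described above looks roughly like this on the client side. The response dict is hand-written to have the shape of a 7.x search response; its values are invented for illustration and are not real output, and summarize is a hypothetical helper.

```python
# Hand-written sample shaped like a 7.x search response; values are illustrative.
response = {
    "took": 3,
    "timed_out": False,
    "_shards": {"total": 1, "successful": 1, "skipped": 0, "failed": 0},
    "hits": {
        "total": {"value": 2, "relation": "eq"},
        "max_score": None,  # no scores are computed when sorting by a field
        "hits": [
            {"_index": "bank", "_id": "1", "_score": None,
             "_source": {"account_number": 1, "balance": 39225},
             "sort": [1, 39225]},
            {"_index": "bank", "_id": "6", "_score": None,
             "_source": {"account_number": 6, "balance": 5686},
             "sort": [6, 5686]},
        ],
    },
}


def summarize(resp):
    """Pull the fields explained above out of a search response dict."""
    hits = resp["hits"]
    return {
        "took_ms": resp["took"],
        "timed_out": resp["timed_out"],
        "failed_shards": resp["_shards"]["failed"],
        "total": hits["total"]["value"],  # in 7.x, total is an object
        "ids": [h["_id"] for h in hits["hits"]],
    }


summary = summarize(response)
print(summary)
```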

2.2>、Query DSL

2.2.1)、Basic syntax

Elasticsearch provides a JSON-style DSL (domain-specific language) for running queries, called the Query DSL. The query language is very comprehensive.

  • Typical structure
{
  QUERY_NAME:{
    ARGUMENT:VALUE,
    ARGUMENT:VALUE,
    ...
  }
}
  • For a query against a specific field, the structure is:
{
  QUERY_NAME:{
  	FIELD_NAME:{
     	ARGUMENT:VALUE,
    	ARGUMENT:VALUE,
    	...
  	}
  }
}
2.2.2)、Returning selected fields
GET /bank/_search
{
  "query": {
    "match_all": {}
  },
  "sort": [
    {
      "account_number": {
        "order": "asc"
      }
    },
    {
      "balance": "desc"
    }
  ],
  "from": 0,
  "size": 5,
  "_source": [
    "balance",
    "firstname"
  ]
}
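
from and size together implement paging. A small sketch of deriving them from a page number; page_request is a hypothetical helper and the 1-based page numbering is an assumption of this example.

```python
def page_request(page, size, fields):
    """Search body for 1-based page number `page` (hypothetical helper)."""
    if page < 1 or size < 1:
        raise ValueError("page and size must be positive")
    return {
        "query": {"match_all": {}},
        "sort": [{"account_number": {"order": "asc"}}],
        "from": (page - 1) * size,   # number of hits to skip
        "size": size,                # number of hits to return
        "_source": fields,           # only these fields come back
    }


first = page_request(1, 5, ["balance", "firstname"])
third = page_request(3, 5, ["balance", "firstname"])
print(first["from"], third["from"])
```

By default from + size may not exceed 10000 (the index.max_result_window setting), so this style suits shallow paging only.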
2.2.3)、match [match query]
  • Basic types (non-string): exact match
GET /bank/_search
{
  "query": {
    "match": {
      "account_number": "20"
    }
  }
}
  • Strings: full-text search
GET bank/_search
{
  "query": {
    "match": {
      "address": "Kings"
    }
  }
}
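2.2.4)、match_phrase [phrase match]

Treats the query value as a single phrase: the words must appear together and in the same order, instead of being matched as independent terms.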

GET bank/_search
{
  "query": {
    "match_phrase": {
      "address": "mill road"
    }
  }
}

Exact match on the keyword subfield: the whole field value must equal the query string

GET bank/_search
{
  "query": {
    "match": {
      "address.keyword": "mill road"
    }
  }
}
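
The three behaviours (match, match_phrase, and match on .keyword) can be compared with a toy model. A rough sketch only: tokens() stands in for the standard analyzer by lowercasing and splitting on whitespace, the sample addresses are made up, and scoring is ignored.

```python
def tokens(text):
    # Rough stand-in for the standard analyzer: lowercase, split on whitespace.
    return text.lower().split()


def match(field_value, query):
    # match: at least one query token occurs in the field (default operator OR).
    field_tokens = set(tokens(field_value))
    return any(t in field_tokens for t in tokens(query))


def match_phrase(field_value, query):
    # match_phrase: all query tokens occur adjacent to each other and in order.
    ft, qt = tokens(field_value), tokens(query)
    return any(ft[i:i + len(qt)] == qt for i in range(len(ft) - len(qt) + 1))


def keyword_match(field_value, query):
    # .keyword: the whole, unanalyzed field value must equal the query.
    return field_value == query


# Made-up sample values for illustration.
addresses = ["198 Mill Lane", "990 Mill Road", "715 Mill Avenue Road", "mill road"]
query = "mill road"

by_match = [a for a in addresses if match(a, query)]
by_phrase = [a for a in addresses if match_phrase(a, query)]
by_keyword = [a for a in addresses if keyword_match(a, query)]
print(by_match, by_phrase, by_keyword, sep="\n")
```

Each step narrows the result: match accepts any address containing mill or road, match_phrase needs the two words side by side, and the keyword comparison accepts only the exact string.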
2.2.5)、multi_match [multi-field match]

Matches documents whose state or address contains mill

GET bank/_search
{
  "query": {
    "multi_match": {
      "query": "mill",
      "fields": ["state","address"]
    }
  }
}
2.2.6)、bool [compound query]
GET bank/_search
{
  "query": {
    "bool": {
      "must": [
        {"match": {
          "gender": "M"
        }},
        {
          "match": {
            "address": "mill"
          }
        }
      ],
      "must_not": [
        {"match": {
          "age": "18"
        }}
      ],
      "should": [
        {"match": {
          "lastname": "Wallace"
        }}
      ]
    }
  }
}
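
How must, must_not and should combine can be sketched with a toy evaluator. Assumptions: bool_search is a made-up function, each clause is a (field, word) pair matched against whitespace-split lowercase tokens, the sample documents are invented, and the score is just a count of matching should clauses, whereas Elasticsearch computes a real relevance score.

```python
def bool_search(docs, must=(), must_not=(), should=()):
    """Toy bool query over a list of dicts (not Elasticsearch's scoring)."""

    def hit(doc, clause):
        field, word = clause
        return word.lower() in str(doc.get(field, "")).lower().split()

    results = []
    for doc in docs:
        if not all(hit(doc, c) for c in must):
            continue                      # every must clause has to match
        if any(hit(doc, c) for c in must_not):
            continue                      # any must_not match excludes the doc
        bonus = sum(hit(doc, c) for c in should)  # should only raises the rank
        results.append((bonus, doc))
    results.sort(key=lambda r: r[0], reverse=True)  # stable: ties keep order
    return [doc for _, doc in results]


# Made-up sample documents for illustration.
docs = [
    {"lastname": "Duke", "gender": "M", "age": 32, "address": "880 Mill Lane"},
    {"lastname": "Wallace", "gender": "M", "age": 34, "address": "198 Mill Lane"},
    {"lastname": "Bond", "gender": "M", "age": 18, "address": "671 Mill Street"},
    {"lastname": "Steele", "gender": "F", "age": 35, "address": "809 Mill Road"},
]

found = bool_search(
    docs,
    must=[("gender", "M"), ("address", "mill")],
    must_not=[("age", "18")],
    should=[("lastname", "Wallace")],
)
names = [d["lastname"] for d in found]
print(names)
```

When must clauses are present, a document that matches no should clause is still returned; it just ranks lower.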
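2.2.7)、filter [result filtering]

Not every query needs to produce a score, especially clauses used only for filtering documents. Elasticsearch detects these cases automatically and optimizes execution so that no scores are computed.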

GET bank/_search
{
  "query": {
    "bool": {
      "filter": {
        "range": {
          "age": {
            "gte": 10,
            "lte": 30
          }
        }
      }
    }
  }
}
2.2.8)、term

Like match, term matches on the value of a field. Use match for full-text (text) fields and term for non-text fields that hold exact values.

GET bank/_search
{
  "query": {
    "term": {
      "balance": {
        "value": "32838"
      }
    }
  }
}
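2.2.9)、aggregations (running aggregations)

Aggregations group data and extract statistics from it. The simplest aggregations are roughly equivalent to SQL GROUP BY and the SQL aggregate functions. In Elasticsearch, a search can return the hits and the aggregation results in the same response, kept separate from each other. This is powerful and efficient: a single request can run a query and several aggregations and get each of their results back, which avoids extra network round trips through one concise API.

  • Find the age distribution and average age of everyone whose address contains mill, without showing the details of those people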
GET bank/_search
{
  "query": {
    "match": {
      "address": "mill"
    }
  },
  "size": 0,
  "aggs": {
    "ageAgg": {
      "terms": {
        "field": "age",
        "size": 10
      }
    },
    "ageAvg": {
      "avg": {
        "field": "age"
      }
    }
  }
}
  • Aggregate by age and compute the average balance of the people in each age bucket
GET bank/_search
{
  "query": {
    "match_all": {}
  },
  "aggs": {
    "ageAgg": {
      "terms": {
        "field": "age",
        "size": 100
      },
      "aggs": {
        "avgAgg": {
          "avg": {
            "field": "balance"
          }
        }
      }
    }
  }
}
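  • Find the full age distribution and, within each age bucket, the average balance for M, the average balance for F, and the overall average balance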
GET bank/_search
{
  "query": {
    "match_all": {}
  },
  "aggs": {
    "ageAgg": {
      "terms": {
        "field": "age",
        "size": 100
      },
      "aggs": {
        "genderAgg": {
          "terms": {
            "field": "gender.keyword",
            "size": 10
          },
          "aggs": {
            "balanceAvg": {
              "avg": {
                "field": "balance"
              }
            }
          }
        },
        "ageBalanceAvg": {
          "avg": {
            "field": "balance"
          }
        }
      }
    }
  }
}
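
A terms aggregation with an avg sub-aggregation does the same job as a group-by followed by an average. A plain-Python sketch of that equivalence, using the four accounts visible in the bulk sample earlier; terms_with_avg is a hypothetical helper and runs locally, not on the cluster.

```python
from collections import defaultdict


def terms_with_avg(docs, group_field, value_field):
    """Group docs by group_field; return {key: (doc_count, average value)}."""
    buckets = defaultdict(list)
    for doc in docs:
        buckets[doc[group_field]].append(doc[value_field])
    return {key: (len(vals), sum(vals) / len(vals)) for key, vals in buckets.items()}


# The four accounts shown in the bulk sample above.
accounts = [
    {"age": 32, "gender": "M", "balance": 39225},
    {"age": 36, "gender": "M", "balance": 5686},
    {"age": 35, "gender": "M", "balance": 44456},
    {"age": 25, "gender": "M", "balance": 21153},
]

by_gender = terms_with_avg(accounts, "gender", "balance")
by_age = terms_with_avg(accounts, "age", "balance")
print(by_gender)
print(by_age)
```

Each dictionary key corresponds to a bucket key in the aggregation response, and the tuple holds the bucket's doc_count and the avg value.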

3、Mapping

3.1>、Field types

3.2>、Mappings

GET /bank/_mapping


PUT /my_index
{
  "mappings": {
    "properties": {
      "age": {"type": "integer"},
      "email": {"type": "keyword"},
      "name": {"type": "text"}
    }
  }
}

3.3>、Adding a new field mapping

PUT /my_index/_mapping
{
  "properties": {
    "employee_id": {
      "type": "keyword",
      "index": false
    }
  }
}

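3.4>、Updating a mapping

The mapping of a field that already exists cannot be changed. To change it, create a new index with the correct mapping and migrate the data.

3.5>、Data migration

First create new_twitter with the correct mapping, then migrate the data with _reindex (the request line is always written this way):

POST _reindex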
{
  "source":{
    "index":"twitter"
  },
  "dest":{
    "index":"new_twitter"
  }
}

Migrating the data under a type of the old index:

POST _reindex
{
  "source":{
    "index":"twitter",
    "type":"tweet"
  },
  "dest":{
    "index":"new_twitter"
  }
}

4、Analysis (tokenization)

4.1>、Installing the ik analyzer

cd /mydata/elasticsearch/plugins

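Download the release zip that matches your Elasticsearch version (7.4.2 in these notes) from https://github.com/medcl/elasticsearch-analysis-ik/releases and unzip it into an ik directory here.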

4.2>、Custom extension dictionary

  1. Install nginx with Docker (see the Docker notes)
  2. Edit the configuration file
/mydata/elasticsearch/plugins/ik/config

vim IKAnalyzer.cfg.xml

<!-- Users can configure a remote extension dictionary here -->
<entry key="remote_ext_dict">http://192.168.52.10/es/fenci.txt</entry>

docker restart elasticsearch

5、Integrating the high-level client with Spring Boot

5.1>、Add the dependency

<dependency>
    <groupId>org.elasticsearch.client</groupId>
    <artifactId>elasticsearch-rest-high-level-client</artifactId>
    <version>7.4.2</version>
</dependency>

5.2>、Override the default Elasticsearch version managed by Spring Boot

<properties>
	<elasticsearch.version>7.4.2</elasticsearch.version>
</properties>

5.3>、Add the configuration

package com.xgxz.shopmall.search.config;

import org.apache.http.HttpHost;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestClient;
import org.elasticsearch.client.RestHighLevelClient;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class ElasticsearchConfig {

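    // Shared request options passed with every client call; customize the
    // builder here (for example, to add headers) if needed.
    public static final RequestOptions COMMON_OPTIONS;

    static {
        RequestOptions.Builder builder = RequestOptions.DEFAULT.toBuilder();
        COMMON_OPTIONS = builder.build();
    }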

    @Bean
    public RestHighLevelClient esRestClient(){

        RestHighLevelClient client = new RestHighLevelClient(
                RestClient.builder(
                        new HttpHost("192.168.52.10",9200,"http")
                )
        );

        return client;

    }
}