1、Basic retrieval
1.1>、_cat
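GET /_cat/nodes # list all nodes
GET /_cat/health # show the health of the ES cluster
GET /_cat/master # show the master node
GET /_cat/indices # list all indices; roughly equivalent to MySQL's SHOW DATABASES
1.2>、Index a document (save)
To save a document, specify which index and which type it is stored under, and which unique identifier it uses.
PUT customer/external/1 saves document 1 under the external type of the customer index, with the following body: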
PUT customer/external/1
{
"name":"Tom"
}
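Both PUT and POST work.
POST creates a document. If no id is specified, an id is generated automatically. If an id is specified and the document already exists, the document is overwritten and its version number is incremented.
PUT can create or modify. PUT must specify an id, and omitting it is an error; because of that, PUT is generally used for modifications.
1.3>、Query a document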
GET customer/external/1
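Result:
{
"_index" : "customer", // which index
"_type" : "external", // which type
"_id" : "1", // document id
"_version" : 1, // version number
"_seq_no" : 0, // concurrency-control field, incremented on every update; used for optimistic locking
"_primary_term" : 1, // same purpose; changes when the primary shard is reassigned, e.g. after a restart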
"found" : true,
"_source" : {
"name" : "Tom"
}
}
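To update with optimistic locking, append ?if_seq_no=0&if_primary_term=1 to the update request; the write is rejected if the document has changed since those values were read.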
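The optimistic-locking behaviour described above can be sketched with a small in-memory model. This is a toy, not Elasticsearch code: the class `ToyDocument` and the exception `VersionConflict` are hypothetical names, and the model bumps the sequence number by one per write on a single document, which is a simplification of what the cluster does.

```python
class VersionConflict(Exception):
    """Raised when the caller's expected sequence number is stale."""


class ToyDocument:
    """Hypothetical in-memory stand-in for one stored document."""

    def __init__(self, source):
        self.source = dict(source)
        self.seq_no = 0
        self.primary_term = 1

    def update(self, source, if_seq_no, if_primary_term):
        # Reject the write unless the caller saw the latest state.
        if (if_seq_no, if_primary_term) != (self.seq_no, self.primary_term):
            raise VersionConflict(
                f"expected seq_no={if_seq_no}, current seq_no={self.seq_no}"
            )
        self.source = dict(source)
        self.seq_no += 1
        return self.seq_no


doc = ToyDocument({"name": "Tom"})

# Two clients both read the document at seq_no=0, primary_term=1.
doc.update({"name": "Jerry"}, if_seq_no=0, if_primary_term=1)  # first writer wins

try:
    doc.update({"name": "John"}, if_seq_no=0, if_primary_term=1)  # stale read
    second_write_applied = True
except VersionConflict:
    second_write_applied = False

print(doc.source, doc.seq_no, second_write_applied)
```

The second writer has to re-read the document to get the new _seq_no before retrying, which is the point of the scheme: no lock is held between the read and the write.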
1.4>、Updating data with PUT and POST
POST customer/external/1/_update
{
"doc":{
"name":"John"
}
}
or
POST customer/external/1
{
"name":"Jerry"
}
or
PUT customer/external/1
{
"name":"Jerry2"
}
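- Difference: POST with _update compares the request against the stored document; if nothing would change, it does nothing and the document version is not incremented. PUT, and POST without _update, always re-save the data and increment the version.
With _update, the fields under "doc" are compared with the stored document; if they are identical, no operation is performed.
Which to use depends on the scenario:
For heavy concurrent updates, omit _update
For heavy concurrent reads with only occasional updates, use _update; it compares before updating, so unchanged documents are not re-indexed
- An update can also add attributes at the same time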
POST customer/external/1/_update
{
"doc":{
"name":"Haha",
"age":20
}
}
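The difference between the two update styles can be modelled in a few lines. This is a simplified sketch, not the server's implementation: `index_document` and `partial_update` are hypothetical helpers, the stored document is a plain dict, and the merge is shallow (one level deep).

```python
def index_document(stored, new_source):
    """Full replace (PUT, or POST without _update): always bumps the version."""
    return {
        "_source": dict(new_source),
        "_version": stored["_version"] + 1,
        "result": "updated",
    }


def partial_update(stored, doc):
    """POST .../_update: merge `doc` into the source; no-op if nothing changes."""
    merged = {**stored["_source"], **doc}
    if merged == stored["_source"]:
        return {**stored, "result": "noop"}
    return {
        "_source": merged,
        "_version": stored["_version"] + 1,
        "result": "updated",
    }


stored = {"_source": {"name": "John"}, "_version": 1, "result": "created"}

same = partial_update(stored, {"name": "John"})               # identical data
grown = partial_update(stored, {"name": "Haha", "age": 20})   # adds a field
replaced = index_document(stored, {"name": "John"})           # identical data

print(same["result"], same["_version"])
print(grown["_source"], grown["_version"])
print(replaced["result"], replaced["_version"])
```

Sending the same data through `partial_update` leaves the version alone, while `index_document` increments it even though nothing changed.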
1.5>、Delete a document or an index
DELETE customer/external/1
DELETE customer
1.6>、bulk API
POST customer/external/_bulk
{"index":{"_id":"1"}}
{"name":"John Doe"}
{"index":{"_id":"2"}}
{"name":"Tom"}
https://github.com/elastic/elasticsearch/blob/master/docs/src/test/resources/accounts.json
POST /bank/account/_bulk
{"index":{"_id":"1"}}
{"account_number":1,"balance":39225,"firstname":"Amber","lastname":"Duke","age":32,"gender":"M","address":"880 Holmes Lane","employer":"Pyrami","email":"amberduke@pyrami.com","city":"Brogan","state":"IL"}
{"index":{"_id":"6"}}
{"account_number":6,"balance":5686,"firstname":"Hattie","lastname":"Bond","age":36,"gender":"M","address":"671 Bristol Street","employer":"Netagy","email":"hattiebond@netagy.com","city":"Dante","state":"TN"}
{"index":{"_id":"13"}}
......
......
......
{"index":{"_id":"990"}}
{"account_number":990,"balance":44456,"firstname":"Kelly","lastname":"Steele","age":35,"gender":"M","address":"809 Hoyt Street","employer":"Eschoir","email":"kellysteele@eschoir.com","city":"Stewartville","state":"ID"}
{"index":{"_id":"995"}}
{"account_number":995,"balance":21153,"firstname":"Phelps","lastname":"Parrish","age":25,"gender":"M","address":"666 Miller Place","employer":"Pearlessa","email":"phelpsparrish@pearlessa.com","city":"Brecon","state":"ME"}
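The bulk request body is newline-delimited JSON: one action line followed by one source line per document, and the body must end with a newline. A minimal sketch of building such a body follows; `build_bulk_body` is a hypothetical helper, and sending the body over HTTP is left out.

```python
import json


def build_bulk_body(docs):
    """Turn (doc_id, source) pairs into an NDJSON body for POST <index>/<type>/_bulk."""
    lines = []
    for doc_id, source in docs:
        # Action line: what to do and which id to use.
        lines.append(json.dumps({"index": {"_id": str(doc_id)}}, separators=(",", ":")))
        # Source line: the document itself, on a single line.
        lines.append(json.dumps(source, separators=(",", ":")))
    # The bulk API expects the final line to be terminated by a newline too.
    return "\n".join(lines) + "\n"


body = build_bulk_body([(1, {"name": "John Doe"}), (2, {"name": "Tom"})])
print(body, end="")
```

Each document is serialized onto a single line because a pretty-printed JSON object would be split across several lines and break the format.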
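2、Advanced retrieval
2.1>、Search API
2.1.1)、Retrieving information
- Every search starts from _search
GET bank/_search # retrieve everything under bank, including types and docs
GET bank/_search?q=*&sort=account_number:asc # search via request parameters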
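Explanation of the response:
took - time Elasticsearch took to execute the search (milliseconds)
timed_out - whether the search timed out
_shards - how many shards were searched, with counts of successful/failed shards
hits - the search results
hits.total - total number of documents matching the search
hits.hits - the actual array of search results (the top 10 documents by default)
sort - sort key of each result (absent when sorting by score)
_score and max_score - relevance score and highest score (used for full-text search)
- Search with a URL plus a request body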
GET /bank/_search
{
"query": {
"match_all": {}
},
"sort": [
{
"account_number": {
"order": "asc"
}
},
{
"balance": "desc"
}
]
}
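Reading the response fields listed above can be sketched as follows. The `sample` dict is a hand-written stand-in with the same shape as a search response, not real output: the `took` value is made up, and the two hits reuse account numbers and balances from the bulk data. `summarize` is a hypothetical helper.

```python
def summarize(response):
    """Pull the commonly inspected fields out of a _search response dict."""
    hits = response["hits"]
    total = hits["total"]
    # Elasticsearch 7.x reports the total as {"value": N, "relation": ...};
    # older versions return a bare integer, so accept both.
    total_count = total["value"] if isinstance(total, dict) else total
    return {
        "took_ms": response["took"],
        "timed_out": response["timed_out"],
        "total": total_count,
        "ids": [hit["_id"] for hit in hits["hits"]],
    }


# Hand-written sample in the shape of a sorted search response (illustrative only).
sample = {
    "took": 3,
    "timed_out": False,
    "_shards": {"total": 1, "successful": 1, "skipped": 0, "failed": 0},
    "hits": {
        "total": {"value": 2, "relation": "eq"},
        "max_score": None,
        "hits": [
            {"_id": "1", "_score": None, "sort": [1, 39225]},
            {"_id": "6", "_score": None, "sort": [6, 5686]},
        ],
    },
}

summary = summarize(sample)
print(summary)
```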
2.2>、Query DSL
2.2.1)、Basic syntax
Elasticsearch provides a JSON-style DSL (domain-specific language) for executing queries, called the Query DSL. The query language is very comprehensive.
- Typical structure
{
QUERY_NAME:{
ARGUMENT:VALUE,
ARGUMENT:VALUE,
...
}
}
- When the query targets a specific field, the structure is:
{
QUERY_NAME:{
FIELD_NAME:{
ARGUMENT:VALUE,
ARGUMENT:VALUE,
...
}
}
}
2.2.2)、Returning only some fields
GET /bank/_search
{
"query": {
"match_all": {}
},
"sort": [
{
"account_number": {
"order": "asc"
}
},
{
"balance": "desc"
}
],
"from": 0,
"size": 5,
"_source": [
"balance",
"firstname"
]
}
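The from/size pair in the request above is what implements paging, and _source limits which fields come back. A minimal sketch of turning a 1-based page number into such a request body, assuming a hypothetical helper named `page_request`:

```python
def page_request(page, page_size, fields):
    """Build a match_all search body for a 1-based page number."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be positive")
    return {
        "query": {"match_all": {}},
        "from": (page - 1) * page_size,  # number of documents to skip
        "size": page_size,               # number of documents to return
        "_source": list(fields),         # only these fields are returned
    }


first_page = page_request(1, 5, ["balance", "firstname"])
third_page = page_request(3, 5, ["balance", "firstname"])
print(first_page["from"], third_page["from"])
```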
2.2.3)、match [match query]
- Basic types (non-string): exact match
GET /bank/_search
{
"query": {
"match": {
"account_number": "20"
}
}
}
- Strings: full-text search
GET bank/_search
{
"query": {
"match": {
"address": "Kings"
}
}
}
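2.2.4)、match_phrase [phrase match]
The value is searched as one whole phrase: its terms must appear together and in the same order, instead of being matched independently.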
GET bank/_search
{
"query": {
"match_phrase": {
"address": "mill road"
}
}
}
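Exact match with keyword: the whole field value must equal the query text exactly, whereas match_phrase only requires the field to contain the phrase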
GET bank/_search
{
"query": {
"match": {
"address.keyword": "mill road"
}
}
}
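The three behaviours (match, match_phrase, and match on the .keyword sub-field) can be contrasted with a toy model. This is a deliberately simplified sketch: the tokenizer just lowercases and splits on whitespace, there is no scoring, and the two addresses are hypothetical sample values. All function names are illustrative.

```python
def tokenize(text):
    """Very rough stand-in for an analyzer: lowercase, split on whitespace."""
    return text.lower().split()


def match(field_value, query):
    """match: true if the field shares at least one term with the query."""
    field_terms = set(tokenize(field_value))
    return any(term in field_terms for term in tokenize(query))


def match_phrase(field_value, query):
    """match_phrase: the query terms must appear adjacent and in order."""
    field_terms, query_terms = tokenize(field_value), tokenize(query)
    n = len(query_terms)
    return any(
        field_terms[i:i + n] == query_terms
        for i in range(len(field_terms) - n + 1)
    )


def keyword_equals(field_value, query):
    """.keyword: the unanalyzed field value must equal the query exactly."""
    return field_value == query


road, lane = "990 Mill Road", "198 Mill Lane"  # hypothetical addresses

results = {
    "match": [match(road, "mill road"), match(lane, "mill road")],
    "match_phrase": [match_phrase(road, "mill road"), match_phrase(lane, "mill road")],
    "keyword": [keyword_equals(road, "mill road"), keyword_equals(road, "990 Mill Road")],
}
print(results)
```

In this model match accepts both addresses because each contains "mill", match_phrase accepts only the one containing "mill road" as a phrase, and the keyword comparison accepts neither unless the query is the complete, identically cased value.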
2.2.5)、multi_match [multi-field match]
state or address contains mill
GET bank/_search
{
"query": {
"multi_match": {
"query": "mill",
"fields": ["state","address"]
}
}
}
2.2.6)、bool [compound query]
GET bank/_search
{
"query": {
"bool": {
"must": [
{"match": {
"gender": "M"
}},
{
"match": {
"address": "mill"
}
}
],
"must_not": [
{"match": {
"age": "18"
}}
],
"should": [
{"match": {
"lastname": "Wallace"
}}
]
}
}
}
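2.2.7)、filter [result filtering]
Not every query needs to produce a score, particularly queries used only for filtering documents. To avoid computing scores, Elasticsearch automatically detects these situations and optimizes query execution.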
GET bank/_search
{
"query": {
"bool": {
"filter": {
"range": {
"age": {
"gte": 10,
"lte": 30
}
}
}
}
}
}
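A bool query can mix scored and unscored clauses: must and should contribute to the relevance score, while filter and must_not only include or exclude documents. A minimal sketch of assembling such a body, assuming a hypothetical helper `bool_query` and reusing the address and age conditions from the examples in these notes:

```python
def bool_query(must=(), must_not=(), should=(), filter_=()):
    """Assemble a bool query body, omitting clause types that are empty."""
    clauses = {
        "must": list(must),          # must match, contributes to the score
        "must_not": list(must_not),  # must not match, not scored
        "should": list(should),      # optional, raises the score when it matches
        "filter": list(filter_),     # must match, not scored
    }
    return {"query": {"bool": {name: c for name, c in clauses.items() if c}}}


body = bool_query(
    must=[{"match": {"address": "mill"}}],
    filter_=[{"range": {"age": {"gte": 10, "lte": 30}}}],
)
print(body)
```

Moving the age range from must to filter keeps the same set of matching documents but leaves the score determined by the address match alone.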
2.2.8)、term
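Like match, term matches the value of a field, but the search value is not analyzed. Use match for full-text (text) fields and term for exact values in other, non-text fields.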
GET bank/_search
{
"query": {
"term": {
"balance": {
"value": "32838"
}
}
}
}
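2.2.9)、aggregations (running aggregations)
Aggregations provide the ability to group data and extract statistics from it. The simplest aggregations are roughly equivalent to SQL GROUP BY and the SQL aggregate functions. In Elasticsearch, a search can return hits and, in the same response, aggregation results that are kept separate from the hits. This is powerful and efficient: a query and several aggregations can run together and return their results in a single call, which avoids extra network round trips while keeping the API concise.
- Find the age distribution and the average age of everyone whose address contains mill, without showing the details of those people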
GET bank/_search
{
"query": {
"match": {
"address": "mill"
}
},
"aggs": {
"ageAgg": {
"terms": {
"field": "age",
"size": 10
}
},
"ageAvg": {
"avg": {
"field": "age"
}
}
},
"size": 0
}
- Aggregate by age, and compute the average salary (balance) of the people in each age group
GET bank/_search
{
"query": {
"match_all": {}
},
"aggs": {
"ageAgg": {
"terms": {
"field": "age",
"size": 100
},
"aggs": {
"avgAgg": {
"avg": {
"field": "balance"
}
}
}
}
}
}
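- Get the full age distribution and, for each age group, the average salary (balance) of M, the average salary of F, and the overall average salary of the group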
GET bank/_search
{
"query": {
"match_all": {}
},
"aggs": {
"ageAgg": {
"terms": {
"field": "age",
"size": 100
},
"aggs": {
"genderAgg": {
"terms": {
"field": "gender.keyword",
"size": 10
},
"aggs": {
"balanceAvg": {
"avg": {
"field": "balance"
}
}
}
},
"ageBalanceAvg": {
"avg": {
"field": "balance"
}
}
}
}
}
}
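What the nested aggregation computes can be mirrored in plain Python, in the spirit of the GROUP BY comparison above. This is only an analogy run on an in-memory list, not how Elasticsearch executes aggregations. The first four records reuse ages and balances from the bulk data; the fifth record is hypothetical and exists only so that one age group has two members. The function and key names are illustrative.

```python
from collections import defaultdict

accounts = [
    {"age": 32, "gender": "M", "balance": 39225},
    {"age": 36, "gender": "M", "balance": 5686},
    {"age": 35, "gender": "M", "balance": 44456},
    {"age": 25, "gender": "M", "balance": 21153},
    {"age": 32, "gender": "F", "balance": 10775},  # hypothetical record
]


def average(values):
    return sum(values) / len(values)


def age_gender_balance(docs):
    """Group by age, then by gender inside each age, averaging the balance."""
    by_age = defaultdict(list)
    for doc in docs:
        by_age[doc["age"]].append(doc)

    result = {}
    for age, group in by_age.items():
        by_gender = defaultdict(list)
        for doc in group:
            by_gender[doc["gender"]].append(doc["balance"])
        result[age] = {
            "doc_count": len(group),                                  # bucket size
            "avg_balance": average([d["balance"] for d in group]),    # whole bucket
            "by_gender": {g: average(b) for g, b in by_gender.items()},
        }
    return result


buckets = age_gender_balance(accounts)
print(buckets[32])
```

The outer dictionary plays the role of the terms aggregation on age, and the two values computed inside each bucket correspond to the sub-aggregations nested under it.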
3、Mapping
3.1>、Field types
3.2>、Mappings
GET /bank/_mapping
PUT /my_index
{
"mappings": {
"properties": {
"age": {"type": "integer"},
"email": {"type": "keyword"},
"name": {"type": "text"}
}
}
}
3.3>、Adding a new field mapping
PUT /my_index/_mapping
{
"properties": {
"employee_id": {
"type": "keyword",
"index": false
}
}
}
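3.4>、Updating a mapping
The mapping of a field that already exists cannot be changed. To change it, create a new index with the correct mapping and migrate the data into it.
3.5>、Data migration
First create new_twitter with the correct mapping, then migrate the data as follows
POST _reindex [fixed form]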
{
"source":{
"index":"twitter"
},
"dest":{
"index":"new_twitter"
}
}
Migrating the data under a type of the old index
POST _reindex
{
"source":{
"index":"twitter",
"type":"tweet"
},
"dest":{
"index":"new_twitter"
}
}
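Both _reindex bodies above differ only in the optional type on the source side, so they can be produced by one small function. A minimal sketch, assuming a hypothetical helper `reindex_body`; it only builds the request body and does not contact a cluster.

```python
import json


def reindex_body(source_index, dest_index, source_type=None):
    """Build the body for POST _reindex, optionally restricted to one source type."""
    source = {"index": source_index}
    if source_type is not None:
        source["type"] = source_type
    return {"source": source, "dest": {"index": dest_index}}


whole_index = reindex_body("twitter", "new_twitter")
one_type = reindex_body("twitter", "new_twitter", source_type="tweet")
print(json.dumps(one_type))
```

Serializing through `json.dumps` also guards against quoting mistakes that are easy to make when the body is typed by hand.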
4、Tokenization
4.1>、Installing the ik analyzer
cd /mydata/elasticsearch/plugins
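# wget on the .git URL fetches a web page, not the plugin; download the release zip that matches the Elasticsearch version (7.4.2 in these notes) and unzip it into an ik directory
wget https://github.com/medcl/elasticsearch-analysis-ik/releases/download/v7.4.2/elasticsearch-analysis-ik-7.4.2.zip
unzip elasticsearch-analysis-ik-7.4.2.zip -d ik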
4.2>、Custom extension dictionary
- Install nginx with docker; see the docker hands-on notes
- Edit the configuration file
/mydata/elasticsearch/plugins/ik/config
vim IKAnalyzer.cfg.xml
<!-- Users can configure a remote extension dictionary here -->
<entry key="remote_ext_dict">http://192.168.52.10/es/fenci.txt</entry>
docker restart elasticsearch
5、Integrating high-level-client with SpringBoot
5.1>、Add the dependency
<dependency>
<groupId>org.elasticsearch.client</groupId>
<artifactId>elasticsearch-rest-high-level-client</artifactId>
<version>7.4.2</version>
</dependency>
5.2>、Override the default Elasticsearch version managed by SpringBoot
<properties>
<elasticsearch.version>7.4.2</elasticsearch.version>
</properties>
5.3>、Add the configuration
package com.xgxz.shopmall.search.config;

import org.apache.http.HttpHost;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestClient;
import org.elasticsearch.client.RestHighLevelClient;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class ElasticsearchConfig {

    // Request options shared by all calls; customize the builder here if needed
    public static final RequestOptions COMMON_OPTIONS;

    static {
        RequestOptions.Builder builder = RequestOptions.DEFAULT.toBuilder();
        COMMON_OPTIONS = builder.build();
    }

    // Registers the high-level REST client that talks to the Elasticsearch node
    @Bean
    public RestHighLevelClient esRestClient() {
        return new RestHighLevelClient(
                RestClient.builder(
                        new HttpHost("192.168.52.10", 9200, "http")
                )
        );
    }
}