02 - Elasticsearch (ES): Bulk Document Operations

2024-01-04 11:49:28

Single-document operations (create, update, delete)

# ===================
# 1. Create the index
PUT es_document_db

# 2. Create a document with PUT (the id is given in the URL)
PUT /es_document_db/_doc/1
{
  "name":"张三1",
  "age":1,
  "birthday":"2021-10-11",
  "address":"中国上海长宁1"
}


# 3. Create a document with POST (no id given, so ES generates one)
POST /es_document_db/_doc
{
  "name":"张三a",
  "age":1,
  "birthday":"2021-10-11",
  "address":"中国上海长宁a"
}


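# 4. Fetch the documents in the index with _search (returns at most 10 hits by default)
# The typed path below is deprecated in 7.x; GET /es_document_db/_search is the typeless equivalent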
GET /es_document_db/_doc/_search

# Response:
{
  "took" : 1,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 2,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "es_document_db",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 1.0,
        "_source" : {
          "name" : "张三1",
          "age" : 1,
          "birthday" : "2021-10-11",
          "address" : "中国上海长宁1"
        }
      },
      {
        "_index" : "es_document_db",
        "_type" : "_doc",
        "_id" : "2uRil3wBDCx46Ou3gAno",
        "_score" : 1.0,
        "_source" : {
          "name" : "张三a",
          "age" : 1,
          "birthday" : "2021-10-11",
          "address" : "中国上海长宁a"
        }
      }
    ]
  }
}





# 5. Update the document with id=1 via POST (the whole document is replaced)
POST /es_document_db/_doc/1
{
  "name":"张三a",
  "age":1
}

# 6. Delete the document with id=1
DELETE /es_document_db/_doc/1

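Note: both POST and PUT can create or update a document.

1. PUT operates on one specific resource, so the id must be given in the URL before it can create or update. POST can target the whole collection: without an id, ES generates a unique id and creates a new document; with an id, it creates or updates the document with that id.

2. PUT/POST to _doc replace the entire JSON document; fields missing from the request body are dropped rather than kept.

3. PUT and DELETE are idempotent: however many times the same request is repeated, the resulting document state is the same (the response and metadata such as _version may differ between calls).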
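Point 2 is easy to miss: indexing to an existing id swaps the whole document, it does not merge fields. The sketch below models that behaviour with a plain Python dict standing in for the index. `index_doc` is a hypothetical helper written for this illustration, not an Elasticsearch client call.

```python
# A toy in-memory model of PUT/POST /<index>/_doc/<id>; not a real ES client.
def index_doc(store, doc_id, body):
    """Store `body` under `doc_id`, replacing any previous document entirely."""
    result = "updated" if doc_id in store else "created"
    store[doc_id] = dict(body)  # full replacement, no field-level merge
    return result


store = {}
first = index_doc(store, "1", {"name": "张三1", "age": 1,
                               "birthday": "2021-10-11", "address": "中国上海长宁1"})
second = index_doc(store, "1", {"name": "张三a", "age": 1})

# birthday and address are gone, because the second body did not include them
remaining_fields = sorted(store["1"])

# Repeating the same request leaves the same document (idempotent end state)
index_doc(store, "1", {"name": "张三a", "age": 1})
same_after_repeat = store["1"] == {"name": "张三a", "age": 1}

print(first, second, remaining_fields, same_after_repeat)
```

To change some fields while keeping the rest, the update action (used with _bulk in 2.3) merges into the existing document instead of replacing it.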

Bulk document operations (create, update, delete)

# Create two indices (es_index_one / es_index_two), used by the examples that follow
PUT es_index_one

PUT es_index_two

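Bulk writes go through the _bulk API.

Request method: POST
Endpoint: _bulk
Request body: newline-delimited JSON. Each operation normally takes two lines, so the body usually has an even number of lines; delete is the exception and takes only the action line. The last line must end with a newline.

The first line names the action and its target (index, type and id). In 7.x, _type is deprecated and can be omitted.

The second line carries the data for that action.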

{"actionName":{"_index":"indexName", "_type":"typeName","_id":"id"}}

{"field1":"value1", "field2":"value2"}

actionName: the action type, one of create, index, delete, or update
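The two-line format above can be generated rather than typed by hand. The following is a minimal sketch using only the Python standard library; `build_bulk_body` and the tuple layout are assumptions made for this example, and the metadata omits `_type` (the typeless form).

```python
import json


def build_bulk_body(operations):
    """Serialize (action, metadata, source) triples into an NDJSON _bulk body.

    `source` is None for delete, which has no data line.
    """
    lines = []
    for action, metadata, source in operations:
        lines.append(json.dumps({action: metadata}, ensure_ascii=False))
        if source is not None:
            lines.append(json.dumps(source, ensure_ascii=False))
    # Every line, including the last one, is terminated by a newline
    return "\n".join(lines) + "\n"


body = build_bulk_body([
    ("create", {"_index": "es_index_one", "_id": "1"}, {"name": "张三one-1", "age": 13}),
    ("update", {"_index": "es_index_one", "_id": "1"}, {"doc": {"age": 14}}),
    ("delete", {"_index": "es_index_one", "_id": "1"}, None),
])
print(body, end="")
```

The resulting string is what would be sent as the body of `POST _bulk`, typically with the `Content-Type: application/x-ndjson` header.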

# 2. Run the following operations against the two indices

# 2.1. Bulk-create documents (create)
POST _bulk
{"create":{"_index":"es_index_one","_type":"_doc","_id":1}}
{"name":"张三one-1","age":13,"birthday":"2021-10-13","address":"中国上海长宁one-1"}
{"create":{"_index":"es_index_one","_type":"_doc","_id":2}}
{"name":"张三one-2","age":23,"birthday":"2021-10-23","address":"中国上海长宁one-2"}
{"create":{"_index":"es_index_one","_type":"_doc","_id":3}}
{"name":"张三one-3","age":33,"birthday":"2021-10-30","address":"中国上海长宁one-3"}


POST _bulk
{"create":{"_index":"es_index_two","_type":"_doc","_id":1}}
{"name":"张三two-1","age":13,"birthday":"2021-10-13","address":"中国上海长宁two-1"}
{"create":{"_index":"es_index_two","_type":"_doc","_id":2}}
{"name":"张三two-2","age":23,"birthday":"2021-10-23","address":"中国上海长宁two-2"}
{"create":{"_index":"es_index_two","_type":"_doc","_id":3}}
{"name":"张三two-3","age":33,"birthday":"2021-10-30","address":"中国上海长宁two-3"}

# 2.2. Bulk create-or-update documents (index)
POST _bulk
{"index":{"_index":"es_index_one","_type":"_doc","_id":1}}
{"name":"张三one-11","age":13,"birthday":"2021-10-13","address":"中国上海长宁one-1"}
{"index":{"_index":"es_index_one","_type":"_doc","_id":2}}
{"name":"张三one-21","age":23,"birthday":"2021-10-23","address":"中国上海长宁one-2"}
{"index":{"_index":"es_index_one","_type":"_doc","_id":3}}
{"name":"张三one-31","age":33,"birthday":"2021-10-30","address":"中国上海长宁one-3"}

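Difference between create and index: if the document already exists, create fails for that item with a "document already exists" error, whereas index succeeds and overwrites the existing document.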

# 2.3. Bulk-update documents (update)
POST _bulk
{"update":{"_index":"es_index_one","_type":"_doc","_id":1}}
{"doc":{"name":"张三one-11q","age":131,"address":"中国重庆one-001"}}
{"update":{"_index":"es_index_one","_type":"_doc","_id":2}}
{"doc":{"name":"张三one-21q","age":231,"sex":"男"}}
{"update":{"_index":"es_index_one","_type":"_doc","_id":3}}
{"doc":{"name":"张三one-31q","age":331}}

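update: if the given _id does not exist, that one item fails while the other items in the request still succeed. Fields that already exist in the document are overwritten, new fields are added, and fields not mentioned in the update are left untouched.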
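Because one failing item does not fail the whole request, the _bulk response has to be checked item by item. The sketch below walks a response dict and collects the failed items. `failed_items` is a hypothetical helper, and `sample` is a hand-written, abbreviated response used only for illustration, not output captured from a cluster.

```python
def failed_items(bulk_response):
    """Return (position, action, id, status, error_type) for each failed item."""
    failures = []
    # The top-level "errors" flag is true when at least one item failed
    if not bulk_response.get("errors"):
        return failures
    for position, item in enumerate(bulk_response["items"]):
        # Each item is a single-key dict such as {"update": {...}}
        action, result = next(iter(item.items()))
        if "error" in result:
            failures.append((position, action, result["_id"],
                             result["status"], result["error"]["type"]))
    return failures


# Hand-written sample: the second update targets an id that does not exist
sample = {
    "took": 5,
    "errors": True,
    "items": [
        {"update": {"_index": "es_index_one", "_id": "1",
                    "status": 200, "result": "updated"}},
        {"update": {"_index": "es_index_one", "_id": "9", "status": 404,
                    "error": {"type": "document_missing_exception"}}},
    ],
}
failures = failed_items(sample)
print(failures)
```

Keeping the position of each failure makes it possible to map it back to the corresponding operation in the request and retry only that one.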

# 2.4. Bulk-delete documents (delete)
POST _bulk
{"delete":{"_index":"es_index_one","_type":"_doc","_id":1}}
{"delete":{"_index":"es_index_one","_type":"_doc","_id":2}}
{"delete":{"_index":"es_index_one","_type":"_doc","_id":3}}

How much data can one _bulk request handle?

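_bulk loads the whole request into memory, so the amount of data per request is limited. The best size is not a fixed number; it depends on your hardware, on document size and complexity, and on the indexing and search load.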

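A common recommendation is 1,000 to 5,000 documents per request, with fewer when the documents are large, and a payload of about 5 to 15 MB. By default a request cannot exceed 100 MB; this limit is the http.max_content_length setting, which can be changed in the ES configuration file ($ES_HOME/config/elasticsearch.yml).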
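One way to stay within these limits is to split a long list of operations into several _bulk bodies, capped by both document count and payload size. This is a sketch under assumptions: `chunk_operations`, its tuple layout, and the default caps (1,000 operations, 5 MB) are choices made for this example and should be tuned against the actual cluster.

```python
import json


def chunk_operations(operations, max_docs=1000, max_bytes=5 * 1024 * 1024):
    """Yield NDJSON bodies with at most max_docs operations and about max_bytes bytes."""
    lines, docs, size = [], 0, 0
    for action, metadata, source in operations:
        op_lines = [json.dumps({action: metadata}, ensure_ascii=False)]
        if source is not None:
            op_lines.append(json.dumps(source, ensure_ascii=False))
        # +1 per line for the trailing newline
        op_size = sum(len(line.encode("utf-8")) + 1 for line in op_lines)
        # Flush the current batch if adding this operation would exceed a cap
        if lines and (docs + 1 > max_docs or size + op_size > max_bytes):
            yield "\n".join(lines) + "\n"
            lines, docs, size = [], 0, 0
        lines.extend(op_lines)
        docs += 1
        size += op_size
    if lines:
        yield "\n".join(lines) + "\n"


ops = [("index", {"_index": "es_index_one", "_id": str(i)}, {"age": i})
       for i in range(2500)]
batches = list(chunk_operations(ops, max_docs=1000))
batch_sizes = [b.count("\n") // 2 for b in batches]  # two lines per index operation
print(batch_sizes)
```

A single operation larger than `max_bytes` is still emitted, alone in its own batch, so the byte cap is a target rather than a hard guarantee.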

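Bulk document retrieval

Bulk retrieval goes through the _mget API.

Request method: GET
Endpoint: _mget
Purpose: fetch documents by id across different indices and types in a single request
Request parameters:

docs : array of documents to fetch

_index : the index

_source : the fields to return

_id : the document id

_type : the type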
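The parameters above combine into a single JSON body. A minimal sketch of assembling it follows; `build_mget_body` and its tuple layout are assumptions made for this example, and `_type` is left out (the typeless form).

```python
import json


def build_mget_body(requests):
    """Build the JSON body for GET _mget from (index, id, source_fields) tuples."""
    docs = []
    for index, doc_id, source_fields in requests:
        entry = {"_index": index, "_id": str(doc_id)}
        # Only add _source when the caller wants a subset of fields
        if source_fields is not None:
            entry["_source"] = list(source_fields)
        docs.append(entry)
    return {"docs": docs}


body = build_mget_body([
    ("es_index_one", 1, None),
    ("es_index_one", 2, ["name", "age"]),
    ("es_index_two", 5, None),
])
print(json.dumps(body, indent=2))
```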

# First, re-add the documents removed by the bulk delete above (re-run the create requests from 2.1)

# 3.1. Fetch the documents in the given indices
GET /es_index_one/_doc/_search

GET /es_index_two/_doc/_search





# 3.2. Bulk-fetch with _mget
GET _mget
{
  "docs": [
    {
      "_index": "es_index_one",
      "_type": "_doc",
      "_id": 1
    },
    {
      "_index": "es_index_one",
      "_type": "_doc",
      "_id": 2,
      "_source":[
        "name","age"
      ]
    },
    {
      "_index": "es_index_two",
      "_type": "_doc",
      "_id": 5
    }
  ]
}

# Response:
{
  "docs" : [
    {
      "_index" : "es_index_one",
      "_type" : "_doc",
      "_id" : "1",
      "_version" : 1,
      "_seq_no" : 14,
      "_primary_term" : 1,
      "found" : true,
      "_source" : {
        "name" : "张三one-1",
        "age" : 13,
        "birthday" : "2021-10-13",
        "address" : "中国上海长宁one-1"
      }
    },
    {
      "_index" : "es_index_one",
      "_type" : "_doc",
      "_id" : "2",
      "_version" : 1,
      "_seq_no" : 15,
      "_primary_term" : 1,
      "found" : true,
      "_source" : {
        "name" : "张三one-2",
        "age" : 23
      }
    },
    {
      "_index" : "es_index_two",
      "_type" : "_doc",
      "_id" : "5",
      "found" : false
    }
  ]
}
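As the response shows, _mget returns an entry for every requested id and marks the missing ones with "found" : false instead of failing the request. The sketch below separates the two cases. `split_mget_response` is a hypothetical helper, and `sample` is an abbreviated copy of the response above.

```python
def split_mget_response(response):
    """Split an _mget response into found documents and missing (index, id) pairs."""
    found, missing = {}, []
    for doc in response["docs"]:
        key = (doc["_index"], doc["_id"])
        # Entries without "found": true are treated as missing
        if doc.get("found"):
            found[key] = doc["_source"]
        else:
            missing.append(key)
    return found, missing


sample = {
    "docs": [
        {"_index": "es_index_one", "_id": "1", "found": True,
         "_source": {"name": "张三one-1", "age": 13}},
        {"_index": "es_index_one", "_id": "2", "found": True,
         "_source": {"name": "张三one-2", "age": 23}},
        {"_index": "es_index_two", "_id": "5", "found": False},
    ]
}
found, missing = split_mget_response(sample)
print(sorted(found), missing)
```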




# 3.3. Specify the index in the URL
GET /es_index_one/_mget
{
  "docs": [
    {
      "_type": "_doc",
      "_id": 1
    },
    {
      "_type": "_doc",
      "_id": 5
    }
  ]
}

# 3.4. Specify the index and type in the URL
GET /es_index_one/_doc/_mget
{
  "docs": [
    {
      "_id": 1
    },
    {
      "_id": 5
    }
  ]
}
