@[toc]
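1. Elasticsearch Backup
1.1 Basic concepts
Elasticsearch replicas provide high availability: the cluster can tolerate the loss of some nodes without interrupting service. Replicas do not protect against catastrophic failure, however. For that you need a backup.
To back up an Elasticsearch cluster, use the snapshot API. It takes the current state and data of the cluster and saves them to a shared repository. The process is incremental: the first snapshot is a full copy of the data, and each later snapshot stores only the difference between the snapshots already in the repository and the new data. As you take snapshots over time, data is added to and removed from the repository incrementally, so later backups are usually fast because they transfer only a small amount of data.
A snapshot can contain indices created in different versions of Elasticsearch. To restore a snapshot, it must be possible to restore every index in it to the target cluster. If any index in the snapshot was created in an incompatible version, the snapshot cannot be restored.
- A snapshot of an index created in 6.x can be restored to 7.x.
- A snapshot of an index created in 5.x can be restored to 6.x.
- A snapshot of an index created in 2.x can be restored to 5.x.
- A snapshot of an index created in 1.x can be restored to 2.x.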
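The compatibility list above can be expressed as a small lookup, which is handy for checking a snapshot before an upgrade. This is only a sketch: the function names are made up for illustration, and it assumes that restoring into the same major version is always allowed, which the list does not state explicitly.

```python
# Next major version that can still restore an index, per the list above.
NEXT_MAJOR = {1: 2, 2: 5, 5: 6, 6: 7}


def can_restore(created_major, target_major):
    """Return True if an index created in created_major can be restored to target_major."""
    # Assumption: restoring into the same major version is always allowed.
    if created_major == target_major:
        return True
    return NEXT_MAJOR.get(created_major) == target_major


def snapshot_restorable(index_majors, target_major):
    """A snapshot is restorable only if every index in it is restorable."""
    return all(can_restore(m, target_major) for m in index_majors)


print(can_restore(6, 7))                # True
print(can_restore(5, 7))                # False
print(snapshot_restorable([5, 6], 7))   # False: the 5.x index blocks the restore
```

The last call shows why one old index is enough to make a whole snapshot unrestorable on the target cluster.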
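To use this feature, you must first create a repository to hold the data. Several repository types are available:
- GCS
- Azure
- S3
- NAS
- HDFS
- OSS
- Shared block storage
When you back up data before an upgrade, keep in mind that a snapshot containing indices created in a version that is incompatible with the upgrade target cannot be restored after the upgrade. If you do end up needing to restore a snapshot of an index that is incompatible with the version the cluster is currently running, restore it on the latest compatible version and then use reindex-from-remote to rebuild the index on the current version.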
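1.2 Creating a repository
mkdir -p /backups/my_backup
chown -R es:es /backups/my_backup
Register a shared file system repository:
# Repository name: my_backup20200603
# "type": "fs" makes this a shared file system repository
# "location" is the path on the mounted device
PUT _snapshot/my_backup20200603/
{
"type": "fs",
"settings": {
"location": "/backups/my_backup/20200603"
}
}
Note: the shared file system path must be accessible from every node in the cluster. The path (or one of its parent directories) must also be listed under path.repo in elasticsearch.yml on all master and data nodes, otherwise the repository cannot be registered.
This step creates the repository and the metadata it needs at the mount point. A few other settings may be worth configuring, depending on the performance profile of your nodes, network, and repository location.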
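// location: location of the snapshots. Required.
// compress: turns on compression of the snapshot files. Compression is applied only to metadata files (index mappings and settings); data files are not compressed. Defaults to true.
// chunk_size: breaks large files into chunks during snapshotting if needed. Specify the chunk size as a value and a unit, for example 1GB, 10MB, 5KB, 500B. Defaults to null (unlimited chunk size).
// max_restore_bytes_per_sec: throttles the per-node restore rate. Defaults to 40mb per second.
// max_snapshot_bytes_per_sec: throttles the per-node snapshot rate. Defaults to 40mb per second.
# POST updates the settings of an existing repository; the underlying data is not modified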
POST _snapshot/my_backup20200603/
{
"type": "fs",
"settings": {
"location": "/backups/my_backup/20200603",
"max_snapshot_bytes_per_sec" : "50mb",
"max_restore_bytes_per_sec" : "50mb"
}
}
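If you register repositories from a script rather than from Kibana, it can help to build the request body in one place and reject obviously malformed rate limits before sending anything. The sketch below only builds the JSON body. The helper name and the set of accepted size units are assumptions made for illustration, and sending the request is left out.

```python
import json
import re

# Size strings such as "50mb" or "1GB". The accepted unit list is an assumption.
SIZE_RE = re.compile(r"^\d+(b|kb|mb|gb|tb)$", re.IGNORECASE)


def fs_repository_body(location, snapshot_rate=None, restore_rate=None, compress=None):
    """Build the JSON body for PUT/POST _snapshot/<repository> with type "fs"."""
    if not location:
        raise ValueError("location is required")
    settings = {"location": location}
    for key, value in (("max_snapshot_bytes_per_sec", snapshot_rate),
                       ("max_restore_bytes_per_sec", restore_rate)):
        if value is not None:
            if not SIZE_RE.match(value):
                raise ValueError(f"{key}: not a size string: {value!r}")
            settings[key] = value
    if compress is not None:
        settings["compress"] = bool(compress)
    return {"type": "fs", "settings": settings}


body = fs_repository_body("/backups/my_backup/20200603", "50mb", "50mb")
print(json.dumps(body, indent=2))
```

The printed body matches the POST request shown above, so it can be pasted into the console or sent with any HTTP client.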
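1.3 Snapshotting all open indices
A repository can contain multiple snapshots. Each snapshot is associated with a set of indices (all indices, a subset, or a single index) and must have a unique name.
PUT _snapshot/my_backup20200603/snapshot_1
# Backs up all open indices into a snapshot named snapshot_1 in the my_backup20200603 repository. The call returns immediately and the snapshot runs in the background.
Snapshots normally run as a background process. If you need the request to return only after the snapshot has completed, add the wait_for_completion flag, but keep Kibana's request timeout in mind. The wait_for_completion parameter determines whether the request returns as soon as the snapshot has been initialized (the default) or waits until the snapshot completes. During initialization, information about all previous snapshots is loaded into memory, so in a large repository the request can take several seconds (or even minutes) to return even when wait_for_completion is set to false.
PUT _snapshot/my_backup20200603/snapshot_1?wait_for_completion=true
# Blocks the call until the snapshot completes. Large snapshots take a long time to return.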
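When a blocking call would run into a client timeout, an alternative is to start the snapshot without wait_for_completion and poll its state. The following is a minimal sketch of such a loop. fetch_state is a hypothetical callable that you would implement to issue GET _snapshot/<repository>/<snapshot> and return the snapshot's state field; here it is replaced by canned values so the example runs without a cluster.

```python
import time


def wait_for_snapshot(fetch_state, interval=5.0, timeout=3600.0, sleep=time.sleep):
    """Poll fetch_state() until the snapshot is no longer IN_PROGRESS."""
    waited = 0.0
    while True:
        state = fetch_state()
        if state != "IN_PROGRESS":
            return state  # e.g. SUCCESS, PARTIAL or FAILED
        if waited >= timeout:
            raise TimeoutError(f"snapshot still running after {waited:.0f}s")
        sleep(interval)
        waited += interval


# Demo with canned states instead of a live cluster.
canned = iter(["IN_PROGRESS", "IN_PROGRESS", "SUCCESS"])
final_state = wait_for_snapshot(lambda: next(canned), interval=1.0, sleep=lambda s: None)
print(final_state)  # SUCCESS
```

Passing sleep as a parameter keeps the loop testable; in real use the default time.sleep applies.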
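1.4 Snapshotting specific indices
The default behavior is to back up all open indices. To back up only some of them, specify which indices to include when you create the snapshot. This example uses a second repository, my_backup20200604, registered the same way as in 1.2 (see the appendix):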
PUT _snapshot/my_backup20200604/snapshot_1
{
"indices": "name"
}
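1.5 Listing snapshot information
To get information about a single snapshot, send a GET request with the repository and snapshot name:
GET _snapshot/my_backup20200603/snapshot_1
# Returns detailed information about the snapshot
To get a complete list of all snapshots in a repository, replace the snapshot name with the _all placeholder: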
GET _snapshot/my_backup20200603/_all
{
"snapshots" : [
{
"snapshot" : "snapshot_1",
"uuid" : "z0lgNdUaQyy7_p2SPgZ2AQ",
"version_id" : 7040099,
"version" : "7.4.0",
"indices" : [
".kibana_task_manager_1",
".security-7",
".apm-agent-configuration",
".monitoring-es-7-2020.06.03",
".monitoring-kibana-7-2020.06.03",
"kibana_sample_data_logs",
".kibana_1",
"name",
"age"
],
"include_global_state" : true,
"state" : "SUCCESS",
"start_time" : "2020-06-03T14:52:13.247Z",
"start_time_in_millis" : 1591195933247,
"end_time" : "2020-06-03T14:52:14.256Z",
"end_time_in_millis" : 1591195934256,
"duration_in_millis" : 1009,
"failures" : [ ],
"shards" : {
"total" : 9,
"failed" : 0,
"successful" : 9
}
},
{
"snapshot" : "snapshot_2",
"uuid" : "54KqGBlPQdqKn6UNzvJjcg",
"version_id" : 7040099,
"version" : "7.4.0",
"indices" : [
".kibana_task_manager_1",
".security-7",
".apm-agent-configuration",
".monitoring-es-7-2020.06.03",
".monitoring-kibana-7-2020.06.03",
"kibana_sample_data_logs",
".kibana_1",
"name",
"age"
],
"include_global_state" : true,
"state" : "SUCCESS",
"start_time" : "2020-06-03T14:52:23.285Z",
"start_time_in_millis" : 1591195943285,
"end_time" : "2020-06-03T14:52:23.688Z",
"end_time_in_millis" : 1591195943688,
"duration_in_millis" : 403,
"failures" : [ ],
"shards" : {
"total" : 9,
"failed" : 0,
"successful" : 9
}
},
{
"snapshot" : "snapshot_3",
"uuid" : "wdCKB4WiRPqwY9BO154vHA",
"version_id" : 7040099,
"version" : "7.4.0",
"indices" : [
".kibana_task_manager_1",
".security-7",
".apm-agent-configuration",
".monitoring-es-7-2020.06.03",
".monitoring-kibana-7-2020.06.03",
"kibana_sample_data_logs",
".kibana_1",
"name",
"age"
],
"include_global_state" : true,
"state" : "SUCCESS",
"start_time" : "2020-06-03T14:52:28.704Z",
"start_time_in_millis" : 1591195948704,
"end_time" : "2020-06-03T14:52:28.907Z",
"end_time_in_millis" : 1591195948907,
"duration_in_millis" : 203,
"failures" : [ ],
"shards" : {
"total" : 9,
"failed" : 0,
"successful" : 9
}
}
]
}
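A response like this gets long quickly, so a small script can reduce it to one line per snapshot. The sketch below works on the parsed JSON; the sample dictionary is trimmed from the response above, and the function name is made up for illustration.

```python
def summarize_snapshots(response):
    """Reduce a GET _snapshot/<repository>/_all response to one row per snapshot."""
    rows = []
    for snap in response.get("snapshots", []):
        shards = snap.get("shards", {})
        rows.append({
            "snapshot": snap["snapshot"],
            "state": snap["state"],
            "duration_ms": snap["duration_in_millis"],
            "indices": len(snap.get("indices", [])),
            "failed_shards": shards.get("failed", 0),
        })
    return rows


# Trimmed copy of the response shown above.
sample = {
    "snapshots": [
        {"snapshot": "snapshot_1", "state": "SUCCESS", "duration_in_millis": 1009,
         "indices": ["name", "age"], "shards": {"total": 9, "failed": 0, "successful": 9}},
        {"snapshot": "snapshot_2", "state": "SUCCESS", "duration_in_millis": 403,
         "indices": ["name", "age"], "shards": {"total": 9, "failed": 0, "successful": 9}},
    ]
}

rows = summarize_snapshots(sample)
for row in rows:
    print(row["snapshot"], row["state"], f'{row["duration_ms"]} ms')
```

Note how the later snapshots in the real output finish faster than the first one, which is consistent with the incremental behavior described in 1.1.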
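1.6 Monitoring snapshot progress
The wait_for_completion flag provides a rudimentary form of monitoring, but it is not enough even when snapshotting or restoring a moderately sized cluster.
Two other APIs give more detailed information about the status of a snapshot. First, you can run a GET against the snapshot ID:
GET _snapshot/my_backup20200603/snapshot_3
If the snapshot is still in progress when you call this, you will see when it started, how long it has been running, and so on. Note, however, that this API uses the same thread pool as the snapshot mechanism itself. If you are snapshotting very large shards, the time between status updates can be long, because the API is competing for the same thread pool resources.
A better option is to poll the _status API:
GET _snapshot/my_backup20200603/snapshot_3/_status
The _status API returns immediately and gives much more detailed statistics.
The response includes the overall status of the snapshot and also drills down into per-index and per-shard statistics, which gives a very detailed view of how the snapshot is progressing. A shard can be in one of the following states (in the 7.x _status response, the per-shard stage field reports them as INIT, STARTED, FINALIZE, DONE, and FAILURE):
INITIALIZING
The shard is checking the cluster state to see whether it can be snapshotted. This is usually very fast.
STARTED
Data is being transferred to the repository.
FINALIZING
Data transfer is complete; the shard is now sending snapshot metadata.
DONE
The snapshot is complete.
FAILED
An error occurred while processing the snapshot, and this shard/index/snapshot could not be completed. Check the logs for more information.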
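To follow a large snapshot, it is often enough to count how many shards are in each stage. The sketch below does that for a parsed _status response. It assumes the response nests shards as snapshots[].indices.<index>.shards.<shard id>.stage; the sample is hand-written for illustration and is not real cluster output.

```python
from collections import Counter


def shard_stage_counts(status_response):
    """Count shards per stage in a parsed _snapshot/<repo>/<snapshot>/_status response."""
    counts = Counter()
    for snap in status_response.get("snapshots", []):
        for index in snap.get("indices", {}).values():
            for shard in index.get("shards", {}).values():
                counts[shard.get("stage", "UNKNOWN")] += 1
    return counts


# Hand-written sample with the assumed structure (not real output).
sample_status = {
    "snapshots": [{
        "snapshot": "snapshot_3",
        "state": "STARTED",
        "indices": {
            "name": {"shards": {"0": {"stage": "DONE"}}},
            "kibana_sample_data_logs": {"shards": {"0": {"stage": "STARTED"},
                                                   "1": {"stage": "DONE"}}},
        },
    }]
}

counts = shard_stage_counts(sample_status)
print(dict(counts))
```

The function counts whatever stage strings appear, so it does not depend on the exact spelling your Elasticsearch version uses.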
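1.7 Deleting snapshots
Finally, we need a way to delete old snapshots that are no longer useful. This is a simple DELETE HTTP call to the repository/snapshot name:
DELETE _snapshot/my_backup20200603/snapshot_2
It is important to delete snapshots through the API rather than by some other mechanism (such as deleting files by hand or using an automatic cleanup tool on AWS S3). Because snapshots are incremental, many snapshots may depend on segments written by earlier ones. The delete API knows which data is still in use by more recent snapshots and deletes only the segments that are no longer used.
If you delete files manually, however, you risk seriously corrupting your backups, because you may be deleting data that is still in use.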
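The reasoning above can be illustrated with a toy model. It is not how Elasticsearch lays out a repository; it only assumes that each snapshot references a set of segment files and that files are shared between snapshots. All names are invented for the example.

```python
class ToyRepository:
    """Toy model of an incremental snapshot repository (not the real on-disk format)."""

    def __init__(self):
        self.files = set()     # segment files physically stored in the repository
        self.snapshots = {}    # snapshot name -> set of segment files it references

    def create(self, name, segments):
        segments = set(segments)
        uploaded = segments - self.files   # only segments not yet stored are copied
        self.files |= segments
        self.snapshots[name] = segments
        return uploaded

    def delete(self, name):
        segments = self.snapshots.pop(name)
        still_used = set().union(*self.snapshots.values())
        removable = segments - still_used  # keep anything another snapshot needs
        self.files -= removable
        return removable


repo = ToyRepository()
first = repo.create("snapshot_1", {"seg_a", "seg_b"})
second = repo.create("snapshot_2", {"seg_b", "seg_c"})
removed = repo.delete("snapshot_1")

print(sorted(second))      # ['seg_c']: only the new segment was uploaded
print(sorted(removed))     # ['seg_a']: seg_b is still needed by snapshot_2
print(sorted(repo.files))  # ['seg_b', 'seg_c']
```

Deleting the files of snapshot_1 by hand would also have removed seg_b and silently broken snapshot_2, which is exactly the corruption risk described above.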
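1.8 Cancelling a snapshot
To cancel a snapshot, delete it while it is in progress:
DELETE _snapshot/my_backup20200603/snapshot_3
This interrupts the snapshot process and then removes the half-finished snapshot from the repository.
1.9 Cleaning up a repository
The cleanup endpoint removes data in the repository that is no longer referenced by any existing snapshot:
POST _snapshot/my_backup20200603/_cleanup?pretty
Most of the cleanup performed by this endpoint also happens automatically whenever a snapshot is deleted from the repository. If you delete snapshots regularly, calling it will in most cases save little or no space, so you can call it correspondingly less often.
1.10 Deleting a repository
When a repository is deleted, Elasticsearch only removes its reference to the location where the repository stores its snapshots. The snapshots themselves are left untouched.
DELETE _snapshot/my_backup20200604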
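2. Elasticsearch Restore
2.1 Restoring a snapshot
Restoring is simple: append _restore to the ID of the snapshot you want to restore into the cluster:
POST _snapshot/my_backup20200603/snapshot_1/_restore
The default behavior is to restore every index in the snapshot. If snapshot_1 contains five indices, all five are restored into the cluster. An index that already exists in the cluster can only be restored if it is closed. As with the snapshot API, you can also choose which specific indices to restore.
There are also options for renaming indices. They let you match index names against a pattern and supply a new name as part of the restore. This is useful when you want to restore old data to verify its contents or do other processing without replacing the existing data. Let's restore a single index from the snapshot and give it a new name:
# "indices": restore only the name index and ignore the other indices in the snapshot
# "rename_pattern": regular expression matched against the names of the indices being restored
# "rename_replacement": the new name; $1 refers to the first capture group
POST /_snapshot/my_backup20200603/snapshot_1/_restore
{
"indices": "name",
"rename_pattern": "(.+)",
"rename_replacement": "restored_index_$1"
}
This restores the name index into the cluster under the new name restored_index_name. Note that a pattern such as name(.+) would not rename the name index, because (.+) needs at least one character after the literal prefix.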
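Before running a restore with rename options, it can be worth checking locally what name a pattern will produce. The sketch below approximates the renaming with Python's re module. Elasticsearch applies Java regular expressions, so this is only an approximation for simple patterns, and the helper name is made up for illustration.

```python
import re


def restored_name(index, rename_pattern, rename_replacement):
    """Approximate the index name produced by rename_pattern/rename_replacement."""
    # Translate Java-style group references ($1, $2, ...) to Python's \g<1> form.
    py_replacement = re.sub(r"\$(\d+)", r"\\g<\1>", rename_replacement)
    return re.sub(rename_pattern, py_replacement, index)


print(restored_name("name", "(.+)", "restored_index_$1"))           # restored_index_name
print(restored_name("name", "name(.+)", "restored_index_$1"))       # name (pattern does not match)
print(restored_name("name_2020", "name(.+)", "restored_index_$1"))  # restored_index__2020
```

The second call shows a common pitfall: when the pattern does not match, the index keeps its original name and the restore targets the existing index.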
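As with snapshots, the restore command returns immediately and the restore runs in the background. If you would rather have the HTTP call block until the restore has finished, add the wait_for_completion flag:
POST _snapshot/my_backup20200603/snapshot_1/_restore?wait_for_completion=true
2.2 Monitoring restore operations
Restoring data from a repository piggybacks on the existing recovery mechanisms in Elasticsearch. Internally, recovering a shard from a repository is equivalent to recovering it from another node.
To monitor the progress of a restore, use the recovery API. This is a general-purpose API that shows the state of shards moving around the cluster.
The API can be called for the specific indices being restored:
GET restored_index_name/_recovery
Or you can look at all indices in the cluster, which may include shard movements unrelated to the restore:
GET /_recovery/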
{
".monitoring-kibana-7-2020.06.03" : {
"shards" : [
{
"id" : 0,
"type" : "PEER",
"stage" : "DONE",
"primary" : false,
"start_time_in_millis" : 1591184352400,
"stop_time_in_millis" : 1591184354191,
"total_time_in_millis" : 1790,
"source" : {
"id" : "YeJHST86S6ei3vN2Y6snfQ",
"host" : "192.168.137.11",
"transport_address" : "192.168.137.11:9301",
"ip" : "192.168.137.11",
"name" : "node1"
},
"target" : {
"id" : "o1cz718RT96ahpjXOZz5kg",
"host" : "192.168.137.11",
"transport_address" : "192.168.137.11:9303",
"ip" : "192.168.137.11",
"name" : "node3"
},
"index" : {
"size" : {
"total_in_bytes" : 0,
"reused_in_bytes" : 0,
"recovered_in_bytes" : 0,
"percent" : "0.0%"
},
"files" : {
"total" : 0,
"reused" : 0,
"recovered" : 0,
"percent" : "0.0%"
},
"total_time_in_millis" : 6,
"source_throttle_time_in_millis" : 0,
"target_throttle_time_in_millis" : 0
},
"translog" : {
"recovered" : 57,
"total" : 57,
"percent" : "100.0%",
"total_on_start" : -1,
"total_time_in_millis" : 1692
},
"verify_index" : {
"check_index_time_in_millis" : 0,
"total_time_in_millis" : 0
}
},
{
"id" : 0,
"type" : "EXISTING_STORE",
"stage" : "DONE",
"primary" : true,
"start_time_in_millis" : 1591184349472,
"stop_time_in_millis" : 1591184351099,
"total_time_in_millis" : 1626,
"source" : {
"bootstrap_new_history_uuid" : false
},
"target" : {
"id" : "YeJHST86S6ei3vN2Y6snfQ",
"host" : "192.168.137.11",
"transport_address" : "192.168.137.11:9301",
"ip" : "192.168.137.11",
"name" : "node1"
},
"index" : {
"size" : {
"total_in_bytes" : 359017,
"reused_in_bytes" : 359017,
"recovered_in_bytes" : 0,
"percent" : "100.0%"
},
"files" : {
"total" : 21,
"reused" : 21,
"recovered" : 0,
"percent" : "100.0%"
},
"total_time_in_millis" : 12,
"source_throttle_time_in_millis" : 0,
"target_throttle_time_in_millis" : 0
},
"translog" : {
"recovered" : 57,
"total" : 57,
"percent" : "100.0%",
"total_on_start" : 57,
"total_time_in_millis" : 1577
},
"verify_index" : {
"check_index_time_in_millis" : 0,
"total_time_in_millis" : 0
}
}
]
},
".security-7" : {
"shards" : [
{
"id" : 0,
"type" : "PEER",
"stage" : "DONE",
"primary" : false,
"start_time_in_millis" : 1591184353847,
"stop_time_in_millis" : 1591184354295,
"total_time_in_millis" : 447,
"source" : {
"id" : "fLaYEiq_TrCKNDWoDHs4uw",
"host" : "192.168.137.11",
"transport_address" : "192.168.137.11:9302",
"ip" : "192.168.137.11",
"name" : "node2"
},
"target" : {
"id" : "o1cz718RT96ahpjXOZz5kg",
"host" : "192.168.137.11",
"transport_address" : "192.168.137.11:9303",
"ip" : "192.168.137.11",
"name" : "node3"
},
"index" : {
"size" : {
"total_in_bytes" : 0,
"reused_in_bytes" : 0,
"recovered_in_bytes" : 0,
"percent" : "0.0%"
},
"files" : {
"total" : 0,
"reused" : 0,
"recovered" : 0,
"percent" : "0.0%"
},
"total_time_in_millis" : 2,
"source_throttle_time_in_millis" : 0,
"target_throttle_time_in_millis" : 0
},
"translog" : {
"recovered" : 0,
"total" : 0,
"percent" : "100.0%",
"total_on_start" : -1,
"total_time_in_millis" : 415
},
"verify_index" : {
"check_index_time_in_millis" : 0,
"total_time_in_millis" : 0
}
},
{
"id" : 0,
"type" : "EXISTING_STORE",
"stage" : "DONE",
"primary" : true,
"start_time_in_millis" : 1591184352286,
"stop_time_in_millis" : 1591184352942,
"total_time_in_millis" : 655,
"source" : {
"bootstrap_new_history_uuid" : false
},
"target" : {
"id" : "fLaYEiq_TrCKNDWoDHs4uw",
"host" : "192.168.137.11",
"transport_address" : "192.168.137.11:9302",
"ip" : "192.168.137.11",
"name" : "node2"
},
"index" : {
"size" : {
"total_in_bytes" : 83272,
"reused_in_bytes" : 83272,
"recovered_in_bytes" : 0,
"percent" : "100.0%"
},
"files" : {
"total" : 38,
"reused" : 38,
"recovered" : 0,
"percent" : "100.0%"
},
"total_time_in_millis" : 5,
"source_throttle_time_in_millis" : 0,
"target_throttle_time_in_millis" : 0
},
"translog" : {
"recovered" : 0,
"total" : 0,
"percent" : "100.0%",
"total_on_start" : 0,
"total_time_in_millis" : 588
},
"verify_index" : {
"check_index_time_in_millis" : 0,
"total_time_in_millis" : 0
}
}
]
},
".kibana_task_manager_1" : {
"shards" : [
{
"id" : 0,
"type" : "PEER",
"stage" : "DONE",
"primary" : false,
"start_time_in_millis" : 1591184354328,
"stop_time_in_millis" : 1591184355333,
"total_time_in_millis" : 1004,
"source" : {
"id" : "YeJHST86S6ei3vN2Y6snfQ",
"host" : "192.168.137.11",
"transport_address" : "192.168.137.11:9301",
"ip" : "192.168.137.11",
"name" : "node1"
},
"target" : {
"id" : "o1cz718RT96ahpjXOZz5kg",
"host" : "192.168.137.11",
"transport_address" : "192.168.137.11:9303",
"ip" : "192.168.137.11",
"name" : "node3"
},
"index" : {
"size" : {
"total_in_bytes" : 0,
"reused_in_bytes" : 0,
"recovered_in_bytes" : 0,
"percent" : "0.0%"
},
"files" : {
"total" : 0,
"reused" : 0,
"recovered" : 0,
"percent" : "0.0%"
},
"total_time_in_millis" : 1,
"source_throttle_time_in_millis" : 0,
"target_throttle_time_in_millis" : 0
},
"translog" : {
"recovered" : 0,
"total" : 0,
"percent" : "100.0%",
"total_on_start" : -1,
"total_time_in_millis" : 294
},
"verify_index" : {
"check_index_time_in_millis" : 0,
"total_time_in_millis" : 0
}
},
{
"id" : 0,
"type" : "EXISTING_STORE",
"stage" : "DONE",
"primary" : true,
"start_time_in_millis" : 1591184349520,
"stop_time_in_millis" : 1591184350042,
"total_time_in_millis" : 521,
"source" : {
"bootstrap_new_history_uuid" : false
},
"target" : {
"id" : "YeJHST86S6ei3vN2Y6snfQ",
"host" : "192.168.137.11",
"transport_address" : "192.168.137.11:9301",
"ip" : "192.168.137.11",
"name" : "node1"
},
"index" : {
"size" : {
"total_in_bytes" : 12872,
"reused_in_bytes" : 12872,
"recovered_in_bytes" : 0,
"percent" : "100.0%"
},
"files" : {
"total" : 7,
"reused" : 7,
"recovered" : 0,
"percent" : "100.0%"
},
"total_time_in_millis" : 0,
"source_throttle_time_in_millis" : 0,
"target_throttle_time_in_millis" : 0
},
"translog" : {
"recovered" : 0,
"total" : 0,
"percent" : "100.0%",
"total_on_start" : 0,
"total_time_in_millis" : 465
},
"verify_index" : {
"check_index_time_in_millis" : 0,
"total_time_in_millis" : 0
}
}
]
},
".monitoring-es-7-2020.06.03" : {
"shards" : [
{
"id" : 0,
"type" : "PEER",
"stage" : "DONE",
"primary" : false,
"start_time_in_millis" : 1591184356727,
"stop_time_in_millis" : 1591184361202,
"total_time_in_millis" : 4474,
"source" : {
"id" : "fLaYEiq_TrCKNDWoDHs4uw",
"host" : "192.168.137.11",
"transport_address" : "192.168.137.11:9302",
"ip" : "192.168.137.11",
"name" : "node2"
},
"target" : {
"id" : "o1cz718RT96ahpjXOZz5kg",
"host" : "192.168.137.11",
"transport_address" : "192.168.137.11:9303",
"ip" : "192.168.137.11",
"name" : "node3"
},
"index" : {
"size" : {
"total_in_bytes" : 0,
"reused_in_bytes" : 0,
"recovered_in_bytes" : 0,
"percent" : "0.0%"
},
"files" : {
"total" : 0,
"reused" : 0,
"recovered" : 0,
"percent" : "0.0%"
},
"total_time_in_millis" : 2,
"source_throttle_time_in_millis" : 0,
"target_throttle_time_in_millis" : 0
},
"translog" : {
"recovered" : 1697,
"total" : 1697,
"percent" : "100.0%",
"total_on_start" : -1,
"total_time_in_millis" : 4108
},
"verify_index" : {
"check_index_time_in_millis" : 0,
"total_time_in_millis" : 0
}
},
{
"id" : 0,
"type" : "EXISTING_STORE",
"stage" : "DONE",
"primary" : true,
"start_time_in_millis" : 1591184352320,
"stop_time_in_millis" : 1591184356308,
"total_time_in_millis" : 3987,
"source" : {
"bootstrap_new_history_uuid" : false
},
"target" : {
"id" : "fLaYEiq_TrCKNDWoDHs4uw",
"host" : "192.168.137.11",
"transport_address" : "192.168.137.11:9302",
"ip" : "192.168.137.11",
"name" : "node2"
},
"index" : {
"size" : {
"total_in_bytes" : 7584798,
"reused_in_bytes" : 7584798,
"recovered_in_bytes" : 0,
"percent" : "100.0%"
},
"files" : {
"total" : 51,
"reused" : 51,
"recovered" : 0,
"percent" : "100.0%"
},
"total_time_in_millis" : 9,
"source_throttle_time_in_millis" : 0,
"target_throttle_time_in_millis" : 0
},
"translog" : {
"recovered" : 1696,
"total" : 1696,
"percent" : "100.0%",
"total_on_start" : 1696,
"total_time_in_millis" : 3858
},
"verify_index" : {
"check_index_time_in_millis" : 0,
"total_time_in_millis" : 0
}
}
]
},
".apm-agent-configuration" : {
"shards" : [
{
"id" : 0,
"type" : "PEER",
"stage" : "DONE",
"primary" : false,
"start_time_in_millis" : 1591184353017,
"stop_time_in_millis" : 1591184353862,
"total_time_in_millis" : 845,
"source" : {
"id" : "fLaYEiq_TrCKNDWoDHs4uw",
"host" : "192.168.137.11",
"transport_address" : "192.168.137.11:9302",
"ip" : "192.168.137.11",
"name" : "node2"
},
"target" : {
"id" : "YeJHST86S6ei3vN2Y6snfQ",
"host" : "192.168.137.11",
"transport_address" : "192.168.137.11:9301",
"ip" : "192.168.137.11",
"name" : "node1"
},
"index" : {
"size" : {
"total_in_bytes" : 0,
"reused_in_bytes" : 0,
"recovered_in_bytes" : 0,
"percent" : "0.0%"
},
"files" : {
"total" : 0,
"reused" : 0,
"recovered" : 0,
"percent" : "0.0%"
},
"total_time_in_millis" : 1,
"source_throttle_time_in_millis" : 0,
"target_throttle_time_in_millis" : 0
},
"translog" : {
"recovered" : 0,
"total" : 0,
"percent" : "100.0%",
"total_on_start" : -1,
"total_time_in_millis" : 674
},
"verify_index" : {
"check_index_time_in_millis" : 0,
"total_time_in_millis" : 0
}
},
{
"id" : 0,
"type" : "EXISTING_STORE",
"stage" : "DONE",
"primary" : true,
"start_time_in_millis" : 1591184352314,
"stop_time_in_millis" : 1591184352691,
"total_time_in_millis" : 377,
"source" : {
"bootstrap_new_history_uuid" : false
},
"target" : {
"id" : "fLaYEiq_TrCKNDWoDHs4uw",
"host" : "192.168.137.11",
"transport_address" : "192.168.137.11:9302",
"ip" : "192.168.137.11",
"name" : "node2"
},
"index" : {
"size" : {
"total_in_bytes" : 283,
"reused_in_bytes" : 283,
"recovered_in_bytes" : 0,
"percent" : "100.0%"
},
"files" : {
"total" : 1,
"reused" : 1,
"recovered" : 0,
"percent" : "100.0%"
},
"total_time_in_millis" : 0,
"source_throttle_time_in_millis" : 0,
"target_throttle_time_in_millis" : 0
},
"translog" : {
"recovered" : 0,
"total" : 0,
"percent" : "100.0%",
"total_on_start" : 0,
"total_time_in_millis" : 337
},
"verify_index" : {
"check_index_time_in_millis" : 0,
"total_time_in_millis" : 0
}
}
]
},
"name" : {
"shards" : [
{
"id" : 0,
"type" : "EMPTY_STORE",
"stage" : "DONE",
"primary" : true,
"start_time_in_millis" : 1591195879522,
"stop_time_in_millis" : 1591195879541,
"total_time_in_millis" : 18,
"source" : { },
"target" : {
"id" : "YeJHST86S6ei3vN2Y6snfQ",
"host" : "192.168.137.11",
"transport_address" : "192.168.137.11:9301",
"ip" : "192.168.137.11",
"name" : "node1"
},
"index" : {
"size" : {
"total_in_bytes" : 0,
"reused_in_bytes" : 0,
"recovered_in_bytes" : 0,
"percent" : "0.0%"
},
"files" : {
"total" : 0,
"reused" : 0,
"recovered" : 0,
"percent" : "0.0%"
},
"total_time_in_millis" : 6,
"source_throttle_time_in_millis" : 0,
"target_throttle_time_in_millis" : 0
},
"translog" : {
"recovered" : 0,
"total" : 0,
"percent" : "100.0%",
"total_on_start" : 0,
"total_time_in_millis" : 5
},
"verify_index" : {
"check_index_time_in_millis" : 0,
"total_time_in_millis" : 0
}
},
{
"id" : 0,
"type" : "PEER",
"stage" : "DONE",
"primary" : false,
"start_time_in_millis" : 1591195879565,
"stop_time_in_millis" : 1591195879725,
"total_time_in_millis" : 159,
"source" : {
"id" : "YeJHST86S6ei3vN2Y6snfQ",
"host" : "192.168.137.11",
"transport_address" : "192.168.137.11:9301",
"ip" : "192.168.137.11",
"name" : "node1"
},
"target" : {
"id" : "fLaYEiq_TrCKNDWoDHs4uw",
"host" : "192.168.137.11",
"transport_address" : "192.168.137.11:9302",
"ip" : "192.168.137.11",
"name" : "node2"
},
"index" : {
"size" : {
"total_in_bytes" : 230,
"reused_in_bytes" : 0,
"recovered_in_bytes" : 230,
"percent" : "100.0%"
},
"files" : {
"total" : 1,
"reused" : 0,
"recovered" : 1,
"percent" : "100.0%"
},
"total_time_in_millis" : 43,
"source_throttle_time_in_millis" : 0,
"target_throttle_time_in_millis" : 0
},
"translog" : {
"recovered" : 1,
"total" : 1,
"percent" : "100.0%",
"total_on_start" : 0,
"total_time_in_millis" : 98
},
"verify_index" : {
"check_index_time_in_millis" : 0,
"total_time_in_millis" : 0
}
}
]
},
"kibana_sample_data_logs" : {
"shards" : [
{
"id" : 0,
"type" : "SNAPSHOT",
"stage" : "DONE",
"primary" : true,
"start_time_in_millis" : 1591186687934,
"stop_time_in_millis" : 1591186688200,
"total_time_in_millis" : 265,
"source" : {
"repository" : "my_backup",
"snapshot" : "snapshot_1",
"version" : "7.4.0",
"index" : "kibana_sample_data_logs",
"restoreUUID" : "25k0feqzTzWa9m6qTcGFTw"
},
"target" : {
"id" : "o1cz718RT96ahpjXOZz5kg",
"host" : "192.168.137.11",
"transport_address" : "192.168.137.11:9303",
"ip" : "192.168.137.11",
"name" : "node3"
},
"index" : {
"size" : {
"total_in_bytes" : 11820808,
"reused_in_bytes" : 0,
"recovered_in_bytes" : 11820808,
"percent" : "100.0%"
},
"files" : {
"total" : 27,
"reused" : 0,
"recovered" : 27,
"percent" : "100.0%"
},
"total_time_in_millis" : 234,
"source_throttle_time_in_millis" : 0,
"target_throttle_time_in_millis" : 0
},
"translog" : {
"recovered" : 0,
"total" : 0,
"percent" : "100.0%",
"total_on_start" : 0,
"total_time_in_millis" : 13
},
"verify_index" : {
"check_index_time_in_millis" : 0,
"total_time_in_millis" : 0
}
},
{
"id" : 0,
"type" : "PEER",
"stage" : "DONE",
"primary" : false,
"start_time_in_millis" : 1591186688243,
"stop_time_in_millis" : 1591186688639,
"total_time_in_millis" : 396,
"source" : {
"id" : "o1cz718RT96ahpjXOZz5kg",
"host" : "192.168.137.11",
"transport_address" : "192.168.137.11:9303",
"ip" : "192.168.137.11",
"name" : "node3"
},
"target" : {
"id" : "YeJHST86S6ei3vN2Y6snfQ",
"host" : "192.168.137.11",
"transport_address" : "192.168.137.11:9301",
"ip" : "192.168.137.11",
"name" : "node1"
},
"index" : {
"size" : {
"total_in_bytes" : 11820808,
"reused_in_bytes" : 0,
"recovered_in_bytes" : 11820808,
"percent" : "100.0%"
},
"files" : {
"total" : 27,
"reused" : 0,
"recovered" : 27,
"percent" : "100.0%"
},
"total_time_in_millis" : 348,
"source_throttle_time_in_millis" : 190,
"target_throttle_time_in_millis" : 102
},
"translog" : {
"recovered" : 0,
"total" : 0,
"percent" : "100.0%",
"total_on_start" : 0,
"total_time_in_millis" : 35
},
"verify_index" : {
"check_index_time_in_millis" : 0,
"total_time_in_millis" : 0
}
}
]
},
".kibana_1" : {
"shards" : [
{
"id" : 0,
"type" : "EXISTING_STORE",
"stage" : "DONE",
"primary" : true,
"start_time_in_millis" : 1591184349501,
"stop_time_in_millis" : 1591184350141,
"total_time_in_millis" : 640,
"source" : {
"bootstrap_new_history_uuid" : false
},
"target" : {
"id" : "YeJHST86S6ei3vN2Y6snfQ",
"host" : "192.168.137.11",
"transport_address" : "192.168.137.11:9301",
"ip" : "192.168.137.11",
"name" : "node1"
},
"index" : {
"size" : {
"total_in_bytes" : 22482,
"reused_in_bytes" : 22482,
"recovered_in_bytes" : 0,
"percent" : "100.0%"
},
"files" : {
"total" : 16,
"reused" : 16,
"recovered" : 0,
"percent" : "100.0%"
},
"total_time_in_millis" : 2,
"source_throttle_time_in_millis" : 0,
"target_throttle_time_in_millis" : 0
},
"translog" : {
"recovered" : 0,
"total" : 0,
"percent" : "100.0%",
"total_on_start" : 0,
"total_time_in_millis" : 579
},
"verify_index" : {
"check_index_time_in_millis" : 0,
"total_time_in_millis" : 0
}
},
{
"id" : 0,
"type" : "PEER",
"stage" : "DONE",
"primary" : false,
"start_time_in_millis" : 1591184353648,
"stop_time_in_millis" : 1591184354331,
"total_time_in_millis" : 682,
"source" : {
"id" : "YeJHST86S6ei3vN2Y6snfQ",
"host" : "192.168.137.11",
"transport_address" : "192.168.137.11:9301",
"ip" : "192.168.137.11",
"name" : "node1"
},
"target" : {
"id" : "fLaYEiq_TrCKNDWoDHs4uw",
"host" : "192.168.137.11",
"transport_address" : "192.168.137.11:9302",
"ip" : "192.168.137.11",
"name" : "node2"
},
"index" : {
"size" : {
"total_in_bytes" : 0,
"reused_in_bytes" : 0,
"recovered_in_bytes" : 0,
"percent" : "0.0%"
},
"files" : {
"total" : 0,
"reused" : 0,
"recovered" : 0,
"percent" : "0.0%"
},
"total_time_in_millis" : 3,
"source_throttle_time_in_millis" : 0,
"target_throttle_time_in_millis" : 0
},
"translog" : {
"recovered" : 0,
"total" : 0,
"percent" : "100.0%",
"total_on_start" : -1,
"total_time_in_millis" : 605
},
"verify_index" : {
"check_index_time_in_millis" : 0,
"total_time_in_millis" : 0
}
}
]
},
"age" : {
"shards" : [
{
"id" : 0,
"type" : "PEER",
"stage" : "DONE",
"primary" : false,
"start_time_in_millis" : 1591195876756,
"stop_time_in_millis" : 1591195876875,
"total_time_in_millis" : 119,
"source" : {
"id" : "fLaYEiq_TrCKNDWoDHs4uw",
"host" : "192.168.137.11",
"transport_address" : "192.168.137.11:9302",
"ip" : "192.168.137.11",
"name" : "node2"
},
"target" : {
"id" : "o1cz718RT96ahpjXOZz5kg",
"host" : "192.168.137.11",
"transport_address" : "192.168.137.11:9303",
"ip" : "192.168.137.11",
"name" : "node3"
},
"index" : {
"size" : {
"total_in_bytes" : 230,
"reused_in_bytes" : 0,
"recovered_in_bytes" : 230,
"percent" : "100.0%"
},
"files" : {
"total" : 1,
"reused" : 0,
"recovered" : 1,
"percent" : "100.0%"
},
"total_time_in_millis" : 25,
"source_throttle_time_in_millis" : 0,
"target_throttle_time_in_millis" : 0
},
"translog" : {
"recovered" : 1,
"total" : 1,
"percent" : "100.0%",
"total_on_start" : 0,
"total_time_in_millis" : 80
},
"verify_index" : {
"check_index_time_in_millis" : 0,
"total_time_in_millis" : 0
}
},
{
"id" : 0,
"type" : "EMPTY_STORE",
"stage" : "DONE",
"primary" : true,
"start_time_in_millis" : 1591195876703,
"stop_time_in_millis" : 1591195876724,
"total_time_in_millis" : 21,
"source" : { },
"target" : {
"id" : "fLaYEiq_TrCKNDWoDHs4uw",
"host" : "192.168.137.11",
"transport_address" : "192.168.137.11:9302",
"ip" : "192.168.137.11",
"name" : "node2"
},
"index" : {
"size" : {
"total_in_bytes" : 0,
"reused_in_bytes" : 0,
"recovered_in_bytes" : 0,
"percent" : "0.0%"
},
"files" : {
"total" : 0,
"reused" : 0,
"recovered" : 0,
"percent" : "0.0%"
},
"total_time_in_millis" : 8,
"source_throttle_time_in_millis" : 0,
"target_throttle_time_in_millis" : 0
},
"translog" : {
"recovered" : 0,
"total" : 0,
"percent" : "100.0%",
"total_on_start" : 0,
"total_time_in_millis" : 8
},
"verify_index" : {
"check_index_time_in_millis" : 0,
"total_time_in_millis" : 0
}
}
]
}
}
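The output looks similar to the listing above (note that it can be very long, depending on how active the cluster is):
- The type field tells you the nature of the recovery; a value of SNAPSHOT means the shard was restored from a snapshot.
- The source object describes the specific snapshot and repository that the shard was restored from.
- The percent field gives you an idea of how far the recovery has progressed.
The output lists every index that has recovery information, whether the recovery is ongoing or already complete, and then all the shards in each of those indices. Each shard has statistics such as start/stop time, duration, recovery percentage, and bytes transferred.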
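Because the recovery API mixes snapshot restores with ordinary shard movements, filtering on the type field is a quick way to see only the restore. The sketch below does this on the parsed JSON; the sample is trimmed from the kibana_sample_data_logs entry in the output above, and the function name is made up for illustration.

```python
def snapshot_restores(recovery_response):
    """Return one row per shard whose recovery type is SNAPSHOT."""
    rows = []
    for index_name, info in recovery_response.items():
        for shard in info.get("shards", []):
            if shard.get("type") != "SNAPSHOT":
                continue  # skip PEER, EXISTING_STORE, EMPTY_STORE recoveries
            size = shard["index"]["size"]
            rows.append({
                "index": index_name,
                "shard": shard["id"],
                "stage": shard["stage"],
                "snapshot": shard["source"].get("snapshot"),
                "recovered_bytes": size["recovered_in_bytes"],
                "total_bytes": size["total_in_bytes"],
                "percent": size["percent"],
            })
    return rows


# Trimmed from the recovery output shown above.
sample_recovery = {
    "kibana_sample_data_logs": {"shards": [
        {"id": 0, "type": "SNAPSHOT", "stage": "DONE",
         "source": {"repository": "my_backup", "snapshot": "snapshot_1"},
         "index": {"size": {"total_in_bytes": 11820808, "recovered_in_bytes": 11820808,
                            "percent": "100.0%"}}},
        {"id": 0, "type": "PEER", "stage": "DONE", "source": {"name": "node3"},
         "index": {"size": {"total_in_bytes": 11820808, "recovered_in_bytes": 11820808,
                            "percent": "100.0%"}}},
    ]},
    "age": {"shards": [
        {"id": 0, "type": "EMPTY_STORE", "stage": "DONE", "source": {},
         "index": {"size": {"total_in_bytes": 0, "recovered_in_bytes": 0, "percent": "0.0%"}}},
    ]},
}

restores = snapshot_restores(sample_recovery)
for row in restores:
    print(row["index"], row["snapshot"], row["stage"], row["percent"])
```

Only the primary shard appears here, since replicas of a restored index are then built by ordinary PEER recovery.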
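2.3 Cancelling a restore
To cancel a restore, delete the indices that are being restored. Because a restore is really just shard recovery, issuing a delete index API call changes the cluster state and stops the restore. For example:
DELETE /restored_index_name
If restored_index_name is being restored, this delete command stops the restore and also deletes any data that has already been restored into the cluster.
3. How cluster blocks affect backup and restore operations
Many backup and restore operations are affected by cluster and index blocks. For example, registering and unregistering a repository requires write access to the global metadata. A snapshot requires that all indices and their metadata, as well as the global metadata, be readable. A restore requires the global metadata to be writable, but index-level blocks are ignored during the restore because the indices are essentially recreated. The contents of a repository are not part of the cluster, so cluster blocks do not affect operations internal to the repository.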
4. Appendix
////////////////////// Backup /////////////////////////
// Create/update the configuration of repository my_backup20200603
POST _snapshot/my_backup20200603/
{
"type": "fs",
"settings": {
"location": "/backups/my_backup/20200603",
"max_snapshot_bytes_per_sec" : "50mb",
"max_restore_bytes_per_sec" : "50mb"
}
}
GET _cat/tasks?v
// Generate test data
GET /_cat/indices?v
GET /_cat/indices/name,age?v
POST age/_doc
{
"age" : 1
}
POST name/_doc
{
"user" : "Mike"
}
GET name/_search
GET age/_search
GET movies/_search
GET movies/_mappings
GET movies/_settings
// Snapshot all open indices
PUT _snapshot/my_backup20200603/snapshot_1
PUT _snapshot/my_backup20200603/snapshot_2
PUT _snapshot/my_backup20200603/snapshot_3
// Create/update the configuration of repository my_backup20200604
POST _snapshot/my_backup20200604/
{
"type": "fs",
"settings": {
"location": "/backups/my_backup/20200604",
"max_snapshot_bytes_per_sec" : "50mb",
"max_restore_bytes_per_sec" : "50mb"
}
}
// Snapshot specific indices
PUT _snapshot/my_backup20200604/snapshot_1
{
"indices": "name"
}
// List snapshot information
GET _snapshot/my_backup20200603/snapshot_1
GET _snapshot/my_backup20200603/_all
// Check snapshot progress
GET _snapshot/my_backup20200604/snapshot_1
GET _snapshot/my_backup20200604/snapshot_1/_status
// Delete a snapshot / cancel a snapshot
DELETE _snapshot/my_backup20200603/snapshot_1
DELETE name
DELETE age
// Delete the repository
DELETE _snapshot/my_backup20200603
// Clean up the repository
POST _snapshot/my_backup20200603/_cleanup?pretty
////////////////////// Restore /////////////////////////
// Restore a snapshot
POST _snapshot/my_backup20200603/snapshot_1/_restore
POST /_snapshot/my_backup20200603/snapshot_3/_restore
{
"indices": "name,age",
"rename_pattern": "(.+)",
"rename_replacement": "restored_index_$1"
}
GET restored_index_name/_search
// Check restore status
GET restored_index_name/_recovery
GET /_recovery/
// Cancel the restore
DELETE /restored_index_name