https://www.elastic.co/guide/en/elasticsearch/reference/current/getting-started-modify-data.html
Modifying Your Data
Elasticsearch provides data manipulation and search capabilities in near real time.
By default, you can expect a one second delay (refresh interval) from the time you index/update/delete your data until the time that it appears in your search results.
This is an important distinction from other platforms like SQL wherein data is immediately available after a transaction is completed.
Indexing/Replacing Documents
We’ve previously seen how we can index a single document. Let’s recall that command again:
PUT /customer/_doc/1?pretty { "name": "John Doe" }
Again, the above will index the specified document into the customer index, with the ID of 1.
If we then executed the above command again with a different (or same) document, Elasticsearch will replace (i.e. reindex) a new document on top of the existing one with the ID of 1:
Updating Documents
In addition to being able to index and replace documents, we can also update documents.
Note though that Elasticsearch does not actually do in-place updates under the hood.
Whenever we do an update, Elasticsearch deletes the old document and then indexes a new document with the update applied to it in one shot.
This example shows how to update our previous document (ID of 1) by changing the name field to "Jane Doe":
POST /customer/_doc/1/_update?pretty { "doc": { "name": "Jane Doe" } }
This example shows how to update our previous document (ID of 1) by changing the name field to "Jane Doe" and at the same time add an age field to it:
POST /customer/_doc/1/_update?pretty { "doc": { "name": "Jane Doe", "age": 20 } }
Updates can also be performed by using simple scripts. This example uses a script to increment the age by 5:
POST /customer/_doc/1/_update?pretty { "script" : "ctx._source.age += 5" }
In the above example, ctx._source
refers to the current source document that is about to be updated.
Elasticsearch provides the ability to update multiple documents given a query condition (like an SQL UPDATE-WHERE
statement). See docs-update-by-query
API
Deleting Documents
Deleting a document is fairly straightforward. This example shows how to delete our previous customer with the ID of 2:
DELETE /customer/_doc/2?pretty
See the _delete_by_query
API to delete all documents matching a specific query.
It is worth noting that it is much more efficient to delete a whole index instead of deleting all documents with the Delete By Query API.
Batch Processing
In addition to being able to index, update, and delete individual documents, Elasticsearch also provides the ability to perform any of the above operations in batches using the _bulk
API.
This functionality is important in that it provides a very efficient mechanism to do multiple operations as fast as possible with as few network roundtrips as possible.
As a quick example, the following call indexes two documents (ID 1 - John Doe and ID 2 - Jane Doe) in one bulk operation:
POST /customer/_doc/_bulk?pretty {"index":{"_id":"1"}} {"name": "John Doe" } {"index":{"_id":"2"}} {"name": "Jane Doe" }
This example updates the first document (ID of 1) and then deletes the second document (ID of 2) in one bulk operation:
POST /customer/_doc/_bulk?pretty {"update":{"_id":"1"}} {"doc": { "name": "John Doe becomes Jane Doe" } } {"delete":{"_id":"2"}}
Note above that for the delete action, there is no corresponding source document after it since deletes only require the ID of the document to be deleted.
The Bulk API does not fail due to failures in one of the actions.
If a single action fails for whatever reason, it will continue to process the remainder of the actions after it.
When the bulk API returns, it will provide a status for each action (in the same order it was sent in) so that you can check if a specific action failed or not.