Elasticsearch使用系列-ES增删查改基本操作+ik分词
Elasticsearch使用系列-基本查询和聚合查询+sql插件
一、基本查询
1.And查询must
GET user2/_search { "query": { "bool":{ "must": [ { "match": { "name": "张三" } }, { "match": { "hobby": "钓鱼" } } ] } }, "_source": ["name","age","hobby"], "sort": [ { "age": {"order": "desc"}}, { "name2": {"order": "asc"}} ], "from": 0, "size": 20 }
- bool :And查询属于bool查询,must里面带And的查询条件。
- _source:要查询的字段
- sort:对查询结果排序
- from:分页查询,跳过多少条
- size:分页查询,一页查多少条
2.or查询should
GET user2/_search { "query": { "bool":{ "should": [ { "match": { "name": "张三" } }, { "match": { "hobby": "钓鱼" } } ] } } }
- bool:or查询属于bool查询
- should:里面放or的查询条件
3.排除查询 must_not
#查询名字不等于张三 GET user2/_search { "query": { "bool":{ "must_not": [ { "match": { "name": "张三" } } ] } } }
4.过滤查询filter
#查询名字等于张三,年龄大于等于10小于等于20 GET user2/_search { "query": { "bool":{ "must": [ { "match": { "name": "张三" } } ], "filter": [ {"range": { "age": { "gte": 10, "lte": 20 } }} ] } } }
- filter:过滤条件,先过滤数据再查询结果
- range:范围查询,和term,match是同类的查询。
- gte:大于等于
- gt:大于
- lte:小于等于
- lt:小于
5.同字段多值查询
GET user2/_search { "query": { "terms": { "name2": ["张三","李四"] } } }
#text类型的多值查询,空格隔开 GET user2/_search { "query": { "match": { "name": "张三 李四" } } }
6.高亮查询highlight
高亮查询,就是平时搜索东西时,搜索结果会把你的关键词匹配到的显示颜色,像下图一样。
高亮展示的数据,本身就是文档中的一个field,单独将field以highlight的形式返回给你。
ES提供了一个highlight属性,和query同级别的。
- pre_tag:指定前缀标签,如
<font color="red">
- post_tags:指定后缀标签,如
</font>
- fields:指定那个字段为高亮字段
查出来后,显示hobby字段的地方,就直接用高亮的hobby展示就行了。
二、聚合查询
bucket:分组后统计,类似于Mysql中的group by
metric:对分组统计的结果,计算最大值,最小值,平均值等,类似于Mysql中的max(),min(),avg()函数的值。
1.准备数据
创建索引
PUT employee { "mappings": { "properties": { "id": { "type": "integer" }, "name": { "type": "keyword" }, "job": { "type": "keyword" }, "age": { "type": "integer" }, "gender": { "type": "keyword" } } } }
批量插入数据
PUT employee/_bulk {"index": {"_id": 1}} {"id": 1, "name": "Bob", "job": "java", "age": 21, "sal": 8000, "gender": "male"} {"index": {"_id": 2}} {"id": 2, "name": "Rod", "job": "html", "age": 31, "sal": 18000, "gender": "female"} {"index": {"_id": 3}} {"id": 3, "name": "Gaving", "job": "java", "age": 24, "sal": 12000, "gender": "male"} {"index": {"_id": 4}} {"id": 4, "name": "King", "job": "dba", "age": 26, "sal": 15000, "gender": "female"} {"index": {"_id": 5}} {"id": 5, "name": "Jonhson", "job": "dba", "age": 29, "sal": 16000, "gender": "male"} {"index": {"_id": 6}} {"id": 6, "name": "Douge", "job": "java", "age": 41, "sal": 20000, "gender": "female"} {"index": {"_id": 7}} {"id": 7, "name": "cutting", "job": "dba", "age": 27, "sal": 7000, "gender": "male"} {"index": {"_id": 8}} {"id": 8, "name": "Bona", "job": "html", "age": 22, "sal": 14000, "gender": "female"} {"index": {"_id": 9}} {"id": 9, "name": "Shyon", "job": "dba", "age": 20, "sal": 19000, "gender": "female"} {"index": {"_id": 10}} {"id": 10, "name": "James", "job": "html", "age": 18, "sal": 22000, "gender": "male"} {"index": {"_id": 11}} {"id": 11, "name": "Golsling", "job": "java", "age": 32, "sal": 23000, "gender": "female"} {"index": {"_id": 12}} {"id": 12, "name": "Lily", "job": "java", "age": 24, "sal": 2000, "gender": "male"} {"index": {"_id": 13}} {"id": 13, "name": "Jack", "job": "html", "age": 23, "sal": 3000, "gender": "female"} {"index": {"_id": 14}} {"id": 14, "name": "Rose", "job": "java", "age": 36, "sal": 6000, "gender": "female"} {"index": {"_id": 15}} {"id": 15, "name": "Will", "job": "dba", "age": 38, "sal": 4500, "gender": "male"} {"index": {"_id": 16}} {"id": 16, "name": "smith", "job": "java", "age": 32, "sal": 23000, "gender": "male"}
2.分组统计
查询员工各种语言数量,相当于group by
#查询员工各种语言数量 GET employee/_search { "size": 0, "aggs": { "languge_count": { "terms": { "field": "job" } } } }
- size:0表示只要统计后的结果,原始数据不展现,如果是大于0,则会返回多少条原始数据
- aggs:固定语法
- languge_count:自定义的分组名称,可以随便写
- terms:按什么字段进行分组
- field:具体的字段名称
3.平均值,最大值,最小值,求和统计
GET employee/_search { "size": 0, "aggs": { "language_count": { "terms": { "field": "job" }, "aggs":{ "age_avg":{ "avg":{ "field": "age" } } } } } }
- aggs:固定写法
- age_avg:自定义统计名称,随便写
- avg:平均值,其他有 max:最大值,min:最小值,sum:求和
- fileld:要计算的字段
4.分段统计
#按年龄区间分段统计 GET employee/_search { "size": 0, "aggs": { "language_count": { "histogram": { "field": "age", "interval": 10 }, "aggs":{ "age_avg":{ "sum":{ "field": "age" } } } } } }
- histogram:分段统计
- interval:分段间隔
5.日期分段统计
#按月份统计生日人数 GET employee/_search { "size": 0, "aggs": { "language_count": { "date_histogram": { "field": "borthday", "interval": "month", "format": "yyyy-MM-dd", "min_doc_count": 0, "extended_bounds": { "min": "1970-10-01", "max": "2022-12-31" } } } } }
- date_histogram:日期分段统计函数
- field:聚合分组的字段,类型需要为date
- interval:按什么时间聚合,interval字段支持多种关键字:year, quarter(季度), month, week, day, hour, minute, second,
- format:返回值格式化
- min_doc_count:0分组后没数据的也显示,最小有多少条才显示
- extended_bounds:强制规定最小值和最大值界限,ES默认把有数据的最小值开始做开始界限
6.同时统计多个集合
#分别统计年龄和性别 GET employee/_search { "size": 0, "aggs": { "language_count": { "histogram": { "field": "age", "interval": 10 }, "aggs":{ "age_avg":{ "sum":{ "field": "age" } } } }, "gender_count":{ "terms": { "field": "gender" } } } }
三、sql插件
1.插件安装
上面的查询语句为DSL查询,sql插件可以编写sql语句,然后自动解析为DSL语句查询
sql插件github地址:https://github.com/NLPchina/elasticsearch-sql
下载的对应es的版本。
解压后放到 plugins 文件夹并改名为sql,然后重启es
2.sql语句查询
2.1普通查询
GET /_sql?format=txt { "query": "select * from employee where job='java'" }
2.2其他查询写法
#普通查询 SELECT * FROM bank WHERE age >30 AND gender = 'm'
#聚合查询(分组统计) select COUNT(*),SUM(age),MIN(age) as m, MAX(age),AVG(age) FROM bank GROUP BY gender ORDER BY SUM(age), m DESC
#删除 DELETE FROM bank WHERE age >30 AND gender = 'm'
更多的查询看sql插件的github地址最下面的说明