[Elasticsearch] Aggregation Queries

2022-06-17 04:08:34

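As both a search engine and a data store, ES also provides powerful aggregation analysis. Aggregations bucket and compute over the documents matched by the query, somewhat like GROUP BY combined with aggregate functions in SQL. Aggregations can be nested to build complex operations (a bucketing aggregation can contain sub-aggregations).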

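The values an aggregation computes over can come from a field or from the result of a script. Aggregations are defined in the request body under the aggregations node, with the following syntax:

"aggregations" : {                                      // may be abbreviated to "aggs"
    "<aggregation_name>" : {                            // name of the aggregation
        "<aggregation_type>" : {                        // type of the aggregation
            <aggregation_body>                          // aggregation body: which fields to aggregate on
        }
        [,"meta" : { [<meta_data_body>] } ]?            // metadata
        [,"aggregations" : { [<sub_aggregation>]+ } ]?  // sub-aggregations defined inside this aggregation
    }
    [,"<aggregation_name_2>" : { ... } ]*               // further named aggregations
}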
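To make the nesting rule concrete, here is a minimal Python sketch that assembles such a request body as a plain dictionary. The helper `agg` and the names `by_job` and `avg_sal` are hypothetical, chosen only for illustration; no Elasticsearch client is involved.

```python
import json

def agg(agg_type, body, sub_aggs=None, meta=None):
    """Build the value stored under one aggregation name."""
    node = {agg_type: body}
    if meta:
        node["meta"] = meta
    if sub_aggs:
        # Sub-aggregations sit next to the type key, not inside its body.
        node["aggs"] = sub_aggs
    return node

# Hypothetical example: group by job, then average the salary per group.
request_body = {
    "size": 0,
    "aggs": {
        "by_job": agg(
            "terms",
            {"field": "job"},
            sub_aggs={"avg_sal": agg("avg", {"field": "sal"})},
        )
    },
}

print(json.dumps(request_body, indent=2))
```

The printed JSON can be pasted after a `GET <index>/_search` line in the Kibana console.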

Single-value output

Most numeric metric aggregations in ES output a single value, for example max, min, avg, sum, and cardinality.

GET indexName/_search
{
  "aggs": {
    "aggs_name": {    # name of the aggregation, chosen by the user
      "aggs_type": {
      }
    }
  }
}
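As a rough illustration, a single-value aggregation comes back as one `value` entry under the aggregation's name. The sketch below pairs a request body with a hand-written response fragment; the aggregation names `max_sal` and `job_count`, the numbers, and the helper `single_values` are all assumptions made up for this example.

```python
# Request body with two single-value aggregations (names are hypothetical).
request_body = {
    "size": 0,
    "aggs": {
        "max_sal": {"max": {"field": "sal"}},
        "job_count": {"cardinality": {"field": "job"}},
    },
}

# Hand-written response fragment with invented numbers, shaped like
# the "aggregations" section of a search response.
response = {
    "aggregations": {
        "max_sal": {"value": 30000.0},
        "job_count": {"value": 3},
    }
}

def single_values(resp):
    """Map each aggregation name to its single output value."""
    return {name: result["value"] for name, result in resp["aggregations"].items()}

print(single_values(response))
```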

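Multi-value output

stats

Query employee salary information, including the count, maximum, minimum, average, and sum.

The field must be of a numeric type.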

GET employee/_search
{
  "size": 0,
  "aggs": {
    "sal_info": {
      "stats": {
        "field": "sal"
      }
    }
  }
}
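The five numbers reported by stats can be reproduced locally, which may help when checking a result by hand. This is only a sketch over an invented list of salaries, not how ES computes it internally.

```python
salaries = [8000, 12000, 15000, 30000]  # hypothetical values of the "sal" field

def stats_of(values):
    """Return the same five figures that a stats aggregation reports."""
    return {
        "count": len(values),
        "min": min(values),
        "max": max(values),
        "avg": sum(values) / len(values),
        "sum": sum(values),
    }

print(stats_of(salaries))
```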

terms

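Count the flights arriving in each country: group the documents and report the total for each group.

The following request groups by destination country and counts the flights arriving in each one.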

GET kibana_sample_data_flights/_search
{
  "size": 0,
  "aggs": {
    "aaaa": {
      "terms": {
        "field": "DestCountry",
        "size": 10
      }
    }
  }
}
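Conceptually, terms is a group-and-count that keeps the largest groups. The sketch below mimics that on an invented list of destination codes; the function `terms_buckets` is hypothetical, and it breaks ties by key as an assumption for determinism.

```python
from collections import Counter

# Hypothetical DestCountry values, one per flight document.
destinations = ["CN", "US", "CN", "IT", "US", "CN"]

def terms_buckets(values, size=10):
    """Group equal values, count them, and keep the `size` largest groups."""
    counts = Counter(values)
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [{"key": key, "doc_count": n} for key, n in ordered[:size]]

print(terms_buckets(destinations, size=2))
```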

top_hits

Find the two oldest employees:

GET employee/_search
{
  "size": 0,
  "aggs": {
    "max_two_age": {
      "top_hits": {
        "size": 2,
        "sort": [
          {
            "age": {"order": "desc"}
          }
        ]
      }
    }
  }
}
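In plain terms, this top_hits request sorts the documents by age in descending order and keeps the first two. A minimal local equivalent, with invented employee records and a hypothetical `top_hits` helper, might look like this:

```python
# Hypothetical employee documents.
employees = [
    {"name": "Zhang", "age": 28},
    {"name": "Li", "age": 45},
    {"name": "Wang", "age": 39},
]

def top_hits(docs, field, size):
    """Sort by `field` in descending order and keep the first `size` documents."""
    return sorted(docs, key=lambda d: d[field], reverse=True)[:size]

oldest_two = top_hits(employees, "age", 2)
print([d["name"] for d in oldest_two])
```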

range

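Note that this aggregation includes the from value of each range and excludes the to value.

For example, query employee salary statistics for different salary ranges: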

GET employee/_search
{
  "size": 0,
  "aggs": {
    "range_sal_info": {
      "range": {
        "field": "sal",
        "ranges": [
          {
            "key": "0 <= sal < 10001",
            "to": 10001
          },
          {
            "key": "10001 <= sal < 20001",
            "from": 10001,
            "to": 20001
          },
          {
            "key": "20001 <= sal < 30001",
            "from": 20001,
            "to": 30001
          }
        ]
      }
    }
  }
}
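The inclusive from and exclusive to boundaries are easy to get wrong, so here is a small local sketch of the same bucketing rule. The salary values and the `range_buckets` helper are invented for illustration; a range without "from" is treated as unbounded below, which means the "0 <=" in the first key is only a label.

```python
ranges = [
    {"key": "0 <= sal < 10001", "to": 10001},
    {"key": "10001 <= sal < 20001", "from": 10001, "to": 20001},
    {"key": "20001 <= sal < 30001", "from": 20001, "to": 30001},
]

def range_buckets(values, ranges):
    """Count values per range: `from` is inclusive, `to` is exclusive."""
    buckets = []
    for r in ranges:
        low = r.get("from", float("-inf"))   # missing "from": unbounded below
        high = r.get("to", float("inf"))     # missing "to": unbounded above
        count = sum(1 for v in values if low <= v < high)
        buckets.append({"key": r["key"], "doc_count": count})
    return buckets

salaries = [5000, 10000, 10001, 20000, 20001, 30001]  # hypothetical data
for bucket in range_buckets(salaries, ranges):
    print(bucket)
```

A salary of exactly 10001 lands in the second bucket, and 30001 falls outside every range.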

histogram

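Groups values into buckets of a fixed interval width. extended_bounds forces the buckets between min and max to be returned even when they are empty; it does not filter out documents that fall outside that range.

For example, with a fixed interval of 5000, query information about salary ranges: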

GET employee/_search
{
  "size": 0,
  "aggs": {
    "range_sal_info": {
      "histogram": {
        "field": "sal",
        "interval": 5000,
        "extended_bounds": {
          "min": 5000,
          "max": 20000
        }
      }
    }
  }
}
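A local sketch can show how bucket keys are derived and what extended_bounds changes. It assumes an offset of 0 and that empty buckets are kept (min_doc_count of 0); the function `histogram` and the salary list are hypothetical.

```python
import math

def histogram(values, interval, bounds=None):
    """Count values per fixed-width bucket; each key is the bucket's lower edge."""
    def key_of(v):
        return math.floor(v / interval) * interval

    counts = {}
    for v in values:
        counts[key_of(v)] = counts.get(key_of(v), 0) + 1

    edges = list(counts)
    if bounds is not None:
        # Bounds only widen the set of emitted buckets; they never drop data.
        edges += [key_of(bounds["min"]), key_of(bounds["max"])]
    if not edges:
        return []

    buckets = []
    key = min(edges)
    while key <= max(edges):
        buckets.append({"key": key, "doc_count": counts.get(key, 0)})
        key += interval
    return buckets

salaries = [3000, 7000, 12000]  # hypothetical data
for bucket in histogram(salaries, 5000, bounds={"min": 5000, "max": 20000}):
    print(bucket)
```

The value 3000 is still counted even though it lies below the lower bound, and empty buckets appear up to 20000.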

min_bucket

A sibling pipeline aggregation which identifies the bucket(s) with the minimum value of a specified metric in a sibling aggregation and outputs both the value and the key(s) of the bucket(s). The specified metric must be numeric and the sibling aggregation must be a multi-bucket aggregation.

In other words, it picks the smallest bucket after grouping.

Find the job type with the lowest average salary:

GET employee/_search
{
  "size": 0,
  "aggs": {
    "job_info": {
      "terms": {
        "field": "job"
      },
      "aggs": {
        "diff_job_avg_sal": {
          "avg": {
            "field": "sal"
          }
        }
      }
    },
    "min_avg_job_info": {
      "min_bucket": {
        "buckets_path": "job_info>diff_job_avg_sal"
      }
    }
  }
}
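The two stages of this request (average per job, then the minimum across those averages) can be sketched locally as follows. The employee records and the helper `min_avg_bucket` are invented; the sketch returns every job that ties for the minimum, mirroring the "key(s)" wording in the description above.

```python
from collections import defaultdict

# Hypothetical employee documents.
employees = [
    {"job": "java", "sal": 20000},
    {"job": "java", "sal": 10000},
    {"job": "test", "sal": 8000},
    {"job": "test", "sal": 10000},
    {"job": "ops", "sal": 9000},
]

def min_avg_bucket(docs):
    """Average sal per job, then report the lowest average and its job(s)."""
    by_job = defaultdict(list)
    for doc in docs:
        by_job[doc["job"]].append(doc["sal"])
    averages = {job: sum(s) / len(s) for job, s in by_job.items()}
    lowest = min(averages.values())
    keys = sorted(job for job, avg in averages.items() if avg == lowest)
    return {"value": lowest, "keys": keys}

result = min_avg_bucket(employees)
print(result)
```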

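Global filtering

A query restricts the documents seen by every aggregation in the request. Query the average salary of employees older than 30: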

GET employee/_search
{
  "size": 0,
  "query": {
    "range": {
      "age": {
        "gt": 30
      }
    }
  },
  "aggs": {
    "gt_30_avg_info": {
      "avg": {
        "field": "sal"
      }
    }
  }
}

Query the average salary for the Java developer position, using constant_score so that no relevance score is computed:

GET employee/_search
{
  "size": 0,
  "query": {
    "constant_score": {
      "filter": {
        "term": {
          "job": "java"
        }
      }
    }
  },
  "aggs": {
    "gt_30_avg_info": {
      "avg": {
        "field": "sal"
      }
    }
  }
}

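Local filtering

A filter aggregation restricts only its own sub-aggregations, so it can sit next to aggregations that run over all documents. Query the average salary of all employees and the average salary of employees older than 30: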

GET employee/_search
{
  "size": 0,
  "aggs": {
    "all_emp_agv_sal": {
      "avg": {
        "field": "sal"
      }
    },
    "gt_30_emp_avg_info": {
      "filter": {
        "range": {
          "age": {
            "gt": 30
          }
        }
      },
      "aggs": {
        "NAME": {
          "avg": {
            "field": "sal"
          }
        }
      }
    }
  }
}
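The difference between the two filtering scopes can be sketched locally. In the global case the filter runs first and every aggregation sees only the matching documents; in the local case only one aggregation is narrowed. The employee records and both helper functions are hypothetical.

```python
# Hypothetical employee documents.
employees = [
    {"age": 25, "sal": 8000},
    {"age": 35, "sal": 12000},
    {"age": 40, "sal": 20000},
]

def avg_sal(docs):
    return sum(d["sal"] for d in docs) / len(docs)

def global_filter(docs):
    """Query-level filter: all aggregations run on the matched subset."""
    matched = [d for d in docs if d["age"] > 30]
    return {"gt_30_avg_info": avg_sal(matched)}

def local_filter(docs):
    """Filter aggregation: only the nested average is restricted."""
    older = [d for d in docs if d["age"] > 30]
    return {
        "all_emp_agv_sal": avg_sal(docs),
        "gt_30_emp_avg_info": avg_sal(older),
    }

print(global_filter(employees))
print(local_filter(employees))
```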

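Autocomplete

completion

Autocomplete is one of the most common search patterns in everyday development, for example Baidu search or Taobao product search.

ES implements it with a dedicated data structure rather than the traditional inverted index.

Prefix search API: size limits how many results are returned, and skip_duplicates removes duplicates automatically.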

GET movies/_search
{
  "suggest": {
    "YOUR_SUGGESTION": {
      "prefix": "beauty",
      "completion": {
        "field": "title",
        "size": 20,
        "skip_duplicates": true
      }
    }
  }
}
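To illustrate what prefix completion with size and skip_duplicates means, here is a small local sketch. It uses a sorted list with binary search purely as a stand-in and does not represent the structure ES uses internally. The titles, the `suggest` helper, and the case-insensitive matching are assumptions for this example.

```python
from bisect import bisect_left

# Hypothetical movie titles, including one duplicate.
titles = [
    "Beauty and the Beast",
    "Beautiful Mind, A",
    "American Beauty",
    "Beauty and the Beast",
]

def suggest(prefix, titles, size=20, skip_duplicates=True):
    """Return up to `size` titles that start with `prefix` (case-insensitive)."""
    entries = sorted((t.lower(), t) for t in titles)
    wanted = prefix.lower()
    i = bisect_left(entries, (wanted,))  # first entry >= the prefix
    results, seen = [], set()
    while i < len(entries) and entries[i][0].startswith(wanted) and len(results) < size:
        title = entries[i][1]
        if not (skip_duplicates and title in seen):
            results.append(title)
            seen.add(title)
        i += 1
    return results

print(suggest("beaut", titles))
```

"American Beauty" is not suggested because the match must start at the beginning of the title.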

Highlighting

Highlight matches in the search results.

# Highlight every occurrence of romance in title and genre; pre_tags and post_tags customize the opening and closing tags.

GET movies/_search
{
  "query": {
    "multi_match": {
      "query": "romance",
      "fields": ["title", "genre"]
    }
  },
  "highlight": {
    "pre_tags": "",
    "post_tags": "",
    "fields": {
      "title": {},
      "genre": {}
    }
  }
}
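The effect of pre_tags and post_tags can be pictured as wrapping each matched term in the field text. The sketch below does this with a regular expression; it is a simplification that ignores analyzers, and the `highlight` helper and sample strings are hypothetical. The `<em>` defaults mirror the tags Elasticsearch uses when none are configured.

```python
import re

def highlight(text, term, pre_tag="<em>", post_tag="</em>"):
    """Wrap each whole-word, case-insensitive match of `term` in the given tags."""
    pattern = re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)
    return pattern.sub(lambda m: pre_tag + m.group(0) + post_tag, text)

print(highlight("Comedy|Romance", "romance"))
print(highlight("Romance on the High Seas", "romance", "<b>", "</b>"))
```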

GET movies/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "year": 2012
          }
        },
        {
          "match": {
            "title": "Avengers, The"
          }
        }
      ]
    }
  },
  "highlight": {
    "fields": {
      "year": {},
      "genre": {
        "pre_tags": "",
        "post_tags": "",
        "highlight_query": {
          "match": {
            "genre": "Action"
          }
        }
      }
    }
  }
}
