ElasticSearch 的Bucket Aggregation 桶聚合(包含javaApi)
Global Aggregation
全局聚合,最*的聚合,无法嵌入到其他bucket聚合+
example:
POST /sales/_search?size=0
{
"query" : {
"match" : { "type" : "t-shirt" }
},
"aggs" : {
"all_products" : {
"global" : {},
"aggs" : {
"avg_price" : { "avg" : { "field" : "price" } }
}
},
"t_shirts": { "avg" : { "field" : "price" } }
}
}
result:
{
...
"aggregations" : {
"all_products" : {
"doc_count" : 7,
"avg_price" : {
"value" : 140.71428571428572
}
},
"t_shirts": {
"value" : 128.33333333333334
}
}
}
分析:global 查询了上下文 所有文档的 平均价格。avg_price 是为global 注册的聚合。
javaApi:
\1. 创建聚合请求
AggregationBuilders
.global("agg")
.subAggregation(AggregationBuilders.terms("genders").field("gender"));
2.分析结果
// sr is here your SearchResponse object
Global agg = sr.getAggregations().get("agg");
agg.getDocCount(); // Doc count
Filter Aggregation
定义当前文档集上下文中与指定筛选器匹配的所有文档的单个桶。通常,这将用于将当前聚合上下文缩小到特定的一组文档。
简而言之 ,就是先查,再根据查到的再聚合。
example:
POST /sales/_search?size=0
{
"aggs" : {
"t_shirts" : {
"filter" : { "term": { "type": "t-shirt" } },
"aggs" : {
"avg_price" : { "avg" : { "field" : "price" } }
}
}
}
}
{
...
"aggregations" : {
"t_shirts" : {
"doc_count" : 3,
"avg_price" : { "value" : 128.33333333333334 }
}
}
}
分析:我们计算了所有t恤类型产品的平均价格。
javaapi:
AggregationBuilders
.filter("agg", QueryBuilders.termQuery("gender", "male"));
Filters Aggregation
定义一个多桶聚合,其中每个桶与一个过滤器关联。每个bucket将收集与其关联的过滤器匹配的所有文档。
简而言之,先构建多个过滤器,再根据这些过滤器聚合。应用场景,如果你group by 后面的条件不仅仅是字段,而需要将字段中的各个值,重新分类,定义成不同的类型再group by,可以用这个聚合。
example
PUT /logs/message/_bulk?refresh
{ "index" : { "_id" : 1 } }
{ "body" : "warning: page could not be rendered" }
{ "index" : { "_id" : 2 } }
{ "body" : "authentication error" }
{ "index" : { "_id" : 3 } }
{ "body" : "warning: connection timed out" }
GET logs/_search
{
"size": 0,
"aggs" : {
"messages" : {
"filters" : {
"filters" : {
"errors" : { "match" : { "body" : "error" }},
"warnings" : { "match" : { "body" : "warning" }}
}
}
}
}
}
{
"took": 9,
"timed_out": false,
"_shards": ...,
"hits": ...,
"aggregations": {
"messages": {
"buckets": {
"errors": {
"doc_count": 1
},
"warnings": {
"doc_count": 2
}
}
}
}
}
javaapi:
request:
AggregationBuilder aggregation =
AggregationBuilders
.filters("agg",
new FiltersAggregator.KeyedFilter("men", QueryBuilders.termQuery("gender", "male")),
new FiltersAggregator.KeyedFilter("women", QueryBuilders.termQuery("gender", "female")));
response:
// sr is here your SearchResponse object
Filters agg = sr.getAggregations().get("agg");
// For each entry
for (Filters.Bucket entry : agg.getBuckets()) {
String key = entry.getKeyAsString(); // bucket key
long docCount = entry.getDocCount(); // Doc count
logger.info("key [{}], doc_count [{}]", key, docCount);
}
anonymous filters 匿名
详情见官方文档,就是filter字段可以用过滤器数组,不需要对过滤器命名
other bucket
可以将other_bucket参数设置为向响应添加一个bucket,该响应将包含与任何给定过滤器不匹配的所有文档。
PUT logs/message/4?refresh
{
"body": "info: user Bob logged out"
}
GET logs/_search
{
"size": 0,
"aggs" : {
"messages" : {
"filters" : {
"other_bucket_key": "other_messages",
"filters" : {
"errors" : { "match" : { "body" : "error" }},
"warnings" : { "match" : { "body" : "warning" }}
}
}
}
}
}
{
"took": 3,
"timed_out": false,
"_shards": ...,
"hits": ...,
"aggregations": {
"messages": {
"buckets": {
"errors": {
"doc_count": 1
},
"warnings": {
"doc_count": 2
},
"other_messages": {
"doc_count": 1
}
}
}
}
}
Missing Aggregation 缺省聚合
一个基于字段数据的单桶聚合,它为当前文档集 中缺少字段值 的所有文档创建一个桶(实际上,缺少字段 或具有已配置的 空值集)。此聚合器通常 与其他字段 数据桶聚合器(例如范围)一起使用,以返回由于缺少字段数据值 而无法放在任何其他桶中 的所有文档的信息。简而言之,就是可以将 为空值的数据 放到一个桶中。
POST /sales/_search?size=0
{
"aggs" : {
"products_without_a_price" : {
"missing" : { "field" : "price" }
}
}
}
{
...
"aggregations" : {
"products_without_a_price" : {
"doc_count" : 00
}
}
}
获得没有价格的产品总数,为0很正常,产品怎么会没价格。
javaapi:
AggregationBuilders.missing("agg").field("gender");
分析结果,同上。
Nested Aggregation 嵌套聚合
一种特殊的单桶聚合,支持聚合嵌套文档。
例如,假设我们有一个产品索引,每个产品都包含分销商列表——每个分销商都有自己的产品价格。映射可以是这样的:
PUT /index
{
"mappings": {
"product" : {
"properties" : {
"resellers" : {
"type" : "nested",
"properties" : {
"name" : { "type" : "text" },
"price" : { "type" : "double" }
}
}
}
}
}
}
GET /_search
{
"query" : {
"match" : { "name" : "led tv" }
},
"aggs" : {
"resellers" : {
"nested" : {
"path" : "resellers"
},
"aggs" : {
"min_price" : { "min" : { "field" : "resellers.price" } }
}
}
}
}
正如您在上面看到的,嵌套聚合需要顶层文档中的嵌套文档的路径。然后可以在这些嵌套文档上定义任何类型的聚合。
聚合将返回可以购买产品的最低价格
javaapi:
AggregationBuilders
.nested("agg", "resellers");
分析结果同上。
Reverse Nested Aggregation 反转嵌套聚合
一种特殊的单桶聚合,支持在嵌套文档中 聚合父文档。这种聚合可以有效地 跳出 嵌套块结构,并链接到 其他嵌套结构 或根文档,从而允许 将不属于嵌套对象 的其他聚合 嵌套在 嵌套聚合中。
The reverse_nested
aggregation 必须被定义在 inside a nested
aggregation.
AggregationBuilder aggregation =
AggregationBuilders
.nested("agg", "resellers")
.subAggregation(
AggregationBuilders
.terms("name").field("resellers.name")
.subAggregation(
AggregationBuilders
.reverseNested("reseller_to_product")
)
);
// sr is here your SearchResponse object
Nested agg = sr.getAggregations().get("agg");
Terms name = agg.getAggregations().get("name");
for (Terms.Bucket bucket : name.getBuckets()) {
ReverseNested resellerToProduct = bucket.getAggregations().get("reseller_to_product");
resellerToProduct.getDocCount(); // Doc count
}
Children Aggregation 孩子子聚合
AggregationBuilder aggregation =
AggregationBuilders
.children("agg", "reseller");
Terms Aggregation 术语聚合
AggregationBuilders
.terms("genders")
.field("gender");
ORDER 排序
1.按桶的doc_count升序排序
AggregationBuilders
.terms("genders")
.field("gender")
.order(BucketOrder.count(true))
2.按桶的英文字母顺序,按升序排列:
AggregationBuilders
.terms("genders")
.field("gender")
.order(BucketOrder.key(true))
3.按单值 度量子聚合 (由聚合名称标识)对桶进行排序:
AggregationBuilders
.terms("genders")
.field("gender")
.order(BucketOrder.aggregation("avg_height", false))
.subAggregation(
AggregationBuilders.avg("avg_height").field("height")
)
4.按多个标准排序桶:
AggregationBuilders
.terms("genders")
.field("gender")
.order(BucketOrder.compound( // in order of priority:
BucketOrder.aggregation("avg_height", false), // sort by sub-aggregation first
BucketOrder.count(true))) // then bucket count as a tie-breaker
.subAggregation(
AggregationBuilders.avg("avg_height").field("height")
)
Significant Terms Aggregation 重要术语聚合
AggregationBuilder aggregation =
AggregationBuilders
.significantTerms("significant_countries")
.field("address.country");
// Let say you search for men only
SearchResponse sr = client.prepareSearch()
.setQuery(QueryBuilders.termQuery("gender", "male"))
.addAggregation(aggregation)
.get();
Range Aggregation 范围聚合
AggregationBuilder aggregation =
AggregationBuilders
.range("agg")
.field("height")
.addUnboundedTo(1.0f) // from -infinity to 1.0 (excluded)
.addRange(1.0f, 1.5f) // from 1.0 to 1.5 (excluded)
.addUnboundedFrom(1.5f); // from 1.5 to +infinity
Date Range Aggregation 日期范围聚合
AggregationBuilder aggregation =
AggregationBuilders
.dateRange("agg")
.field("dateOfBirth")
.format("yyyy")
.addUnboundedTo("1950") // from -infinity to 1950 (excluded)
.addRange("1950", "1960") // from 1950 to 1960 (excluded)
.addUnboundedFrom("1960"); // from 1960 to +infinity
Ip Range Aggregation IP范围聚合
AggregatorBuilder<?> aggregation =
AggregationBuilders
.ipRange("agg")
.field("ip")
.addUnboundedTo("192.168.1.0") // from -infinity to 192.168.1.0 (excluded)
.addRange("192.168.1.0", "192.168.2.0") // from 192.168.1.0 to 192.168.2.0 (excluded)
.addUnboundedFrom("192.168.2.0"); // from 192.168.2.0 to +infinity
Histogram Aggregation 直方图聚合
AggregationBuilder aggregation =
AggregationBuilders
.histogram("agg")
.field("height")
.interval(1);
Date Histogram Aggregation 日期直方图句
AggregationBuilder aggregation =
AggregationBuilders
.dateHistogram("agg")
.field("dateOfBirth")
.dateHistogramInterval(DateHistogramInterval.YEAR);
或者你想要设置一个十天的间隔
AggregationBuilder aggregation =
AggregationBuilders
.dateHistogram("agg")
.field("dateOfBirth")
.dateHistogramInterval(DateHistogramInterval.days(10));
Geo Distance Aggregation 地理距离聚合
AggregationBuilder aggregation =
AggregationBuilders
.geoDistance("agg", new GeoPoint(48.84237171118314,2.33320027692004))
.field("address.location")
.unit(DistanceUnit.KILOMETERS)
.addUnboundedTo(3.0)
.addRange(3.0, 10.0)
.addRange(10.0, 500.0);
Geo Hash Grid Aggregation 地理哈希网格聚合
AggregationBuilder aggregation =
AggregationBuilders
.geohashGrid("agg")
.field("address.location")
.precision(4);
####################################
各聚合解释可看官方文档:文档文档
####################################
####################################
文章转载自:https://blog.csdn.net/yusen0/article/details/89277768
####################################