Author: Liu Xiaoguo (刘晓国)
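When searching multiple indices, you can use the indices_boost parameter to boost results from one or more specified indices. This is useful when hits from some indices matter more than hits from others.
Note: you cannot use indices_boost with data streams.
Below, I walk through an example that uses indices_boost to boost selected indices.
Example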
In today's example we use an index named twitter. Because this index contains location information, we first define a mapping for the twitter index so that, when the data is imported, location gets the geo_point type we want:
PUT twitter
{
  "mappings": {
    "properties": {
      "location": {
        "type": "geo_point"
      }
    }
  }
}
The command above creates an index named twitter. Next, we use the bulk API to import our data:
POST _bulk
{ "index" : { "_index" : "twitter", "_id": 1} }
{"user":"双榆树-张三","message":"今儿天气不错啊,出去转转去","uid":2,"age":20,"city":"北京","province":"北京","country":"中国","address":"中国北京市海淀区","location":{"lat":"39.970718","lon":"116.325747"}}
{ "index" : { "_index" : "twitter", "_id": 2} }
{"user":"虹桥-老吴","message":"好友来了都今天我生日,好友来了,什么 birthday happy 就成!","uid":2,"age":90,"city":"上海","province":"上海","country":"中国","address":"中国上海市闵行区","location":{"lat":"31.175927","lon":"121.383328"}}
{ "index" : { "_index" : "twitter", "_id": 3} }
{"user":"东城区-李四","message":"happy birthday!","uid":4,"age":30,"city":"北京","province":"北京","country":"中国","address":"中国北京市东城区","location":{"lat":"39.893801","lon":"116.408986"}}
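The bulk API expects newline-delimited JSON: an action line followed by a source line for each document, with a final newline at the end of the body. As a minimal sketch, the body can be assembled in Python with only the standard library. The helper name build_bulk_body and the trimmed-down documents are assumptions made for illustration, not part of the original example.

```python
import json

# Two of the documents from the bulk request above, keyed by _id
# (trimmed to a few fields to keep the sketch short).
docs = {
    1: {"user": "双榆树-张三", "city": "北京", "country": "中国"},
    2: {"user": "虹桥-老吴", "city": "上海", "country": "中国"},
}


def build_bulk_body(index, docs_by_id):
    """Return an NDJSON bulk body: one action line, then one source line, per document."""
    lines = []
    for doc_id, source in docs_by_id.items():
        lines.append(json.dumps({"index": {"_index": index, "_id": doc_id}}))
        lines.append(json.dumps(source, ensure_ascii=False))
    # The bulk API requires the body to end with a newline.
    return "\n".join(lines) + "\n"


bulk_body = build_bulk_body("twitter", docs)
print(bulk_body)
```

Each JSON object stays on its own line, which is why the request cannot be pretty-printed the way an ordinary request body can.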
Above, I indexed three documents. For convenience, we use the reindex API to copy the twitter index into another index named twitter1.
PUT twitter1
{
  "mappings": {
    "properties": {
      "location": {
        "type": "geo_point"
      }
    }
  }
}

POST _reindex
{
  "source": {
    "index": "twitter"
  },
  "dest": {
    "index": "twitter1"
  }
}
twitter1 now holds exactly the same three documents as twitter.
Next, we run the following search:
GET twitter*/_search
{
  "indices_boost": [
    { "twitter": 10.0 },
    { "twitter1": 2.0 }
  ]
}
Above, we boost the twitter index by 10.0 and the twitter1 index by 2.0. Because the request has no query, every document starts from a score of 1.0, which is then multiplied by its index's boost. The search returns:
"hits" : [
  {
    "_index" : "twitter", "_type" : "_doc", "_id" : "1", "_score" : 10.0,
    "_source" : {
      "user" : "双榆树-张三", "message" : "今儿天气不错啊,出去转转去",
      "uid" : 2, "age" : 20, "city" : "北京", "province" : "北京", "country" : "中国",
      "address" : "中国北京市海淀区",
      "location" : { "lat" : "39.970718", "lon" : "116.325747" }
    }
  },
  {
    "_index" : "twitter", "_type" : "_doc", "_id" : "2", "_score" : 10.0,
    "_source" : {
      "user" : "虹桥-老吴", "message" : "好友来了都今天我生日,好友来了,什么 birthday happy 就成!",
      "uid" : 2, "age" : 90, "city" : "上海", "province" : "上海", "country" : "中国",
      "address" : "中国上海市闵行区",
      "location" : { "lat" : "31.175927", "lon" : "121.383328" }
    }
  },
  {
    "_index" : "twitter", "_type" : "_doc", "_id" : "3", "_score" : 10.0,
    "_source" : {
      "user" : "东城区-李四", "message" : "happy birthday!",
      "uid" : 4, "age" : 30, "city" : "北京", "province" : "北京", "country" : "中国",
      "address" : "中国北京市东城区",
      "location" : { "lat" : "39.893801", "lon" : "116.408986" }
    }
  },
  {
    "_index" : "twitter1", "_type" : "_doc", "_id" : "1", "_score" : 2.0,
    "_source" : {
      "user" : "双榆树-张三", "message" : "今儿天气不错啊,出去转转去",
      "uid" : 2, "age" : 20, "city" : "北京", "province" : "北京", "country" : "中国",
      "address" : "中国北京市海淀区",
      "location" : { "lat" : "39.970718", "lon" : "116.325747" }
    }
  },
  {
    "_index" : "twitter1", "_type" : "_doc", "_id" : "2", "_score" : 2.0,
    "_source" : {
      "user" : "虹桥-老吴", "message" : "好友来了都今天我生日,好友来了,什么 birthday happy 就成!",
      "uid" : 2, "age" : 90, "city" : "上海", "province" : "上海", "country" : "中国",
      "address" : "中国上海市闵行区",
      "location" : { "lat" : "31.175927", "lon" : "121.383328" }
    }
  },
  {
    "_index" : "twitter1", "_type" : "_doc", "_id" : "3", "_score" : 2.0,
    "_source" : {
      "user" : "东城区-李四", "message" : "happy birthday!",
      "uid" : 4, "age" : 30, "city" : "北京", "province" : "北京", "country" : "中国",
      "address" : "中国北京市东城区",
      "location" : { "lat" : "39.893801", "lon" : "116.408986" }
    }
  }
]
From these results we can see that all documents from twitter rank first, and the documents from twitter1 rank after them.
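The same search can also be issued outside Kibana. The sketch below builds (but does not send) the request using only Python's standard library. The URL http://localhost:9200 assumes a local cluster without authentication, and build_boosted_search is a hypothetical helper name, so adjust both to your environment.

```python
import json
import urllib.request

ES_URL = "http://localhost:9200"  # assumption: local cluster, no authentication


def build_boosted_search(index_pattern, boosts, query=None):
    """Build (but do not send) a _search request that uses indices_boost.

    boosts is a list of (index_name, boost) pairs. Order is preserved because
    indices_boost is an array of single-key objects.
    """
    body = {"indices_boost": [{name: boost} for name, boost in boosts]}
    if query is not None:
        body["query"] = query
    return urllib.request.Request(
        url=f"{ES_URL}/{index_pattern}/_search",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_boosted_search("twitter*", [("twitter", 10.0), ("twitter1", 2.0)])
print(request.full_url)
print(request.data.decode("utf-8"))

# To run it against a live cluster (not executed here):
# with urllib.request.urlopen(request) as response:
#     hits = json.load(response)["hits"]["hits"]
```

Passing the boosts as an ordered list of pairs rather than a dict mirrors the array form of indices_boost, where the order of entries matters.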
Aliases and index patterns can also be used in indices_boost. Let's create the following alias:
PUT twitter/_alias/city_shanghai
{
  "filter": {
    "term": {
      "city.keyword": "上海"
    }
  }
}
This defines an alias named city_shanghai on the twitter index. Next, we run the following search:
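A filtered alias behaves like a filtered view of its index: a search addressed to city_shanghai itself only sees documents whose city is exactly 上海. The toy model below illustrates that idea in plain Python. It is not how Elasticsearch implements aliases, and search_through_alias is a hypothetical name used only for this sketch.

```python
# The three documents from the bulk request, trimmed to the fields used here.
docs = [
    {"_id": "1", "user": "双榆树-张三", "city": "北京"},
    {"_id": "2", "user": "虹桥-老吴", "city": "上海"},
    {"_id": "3", "user": "东城区-李四", "city": "北京"},
]


def search_through_alias(documents, field, value):
    """Toy stand-in for a filtered alias with a term filter on an exact value."""
    return [doc for doc in documents if doc.get(field) == value]


visible = search_through_alias(docs, "city", "上海")
print([doc["user"] for doc in visible])  # ['虹桥-老吴']
```

A search against the twitter* pattern does not go through the alias, so the filter only affects requests addressed to city_shanghai directly.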
GET twitter*/_search
{
  "indices_boost": [
    { "city_shanghai": 10.0 },
    { "twitter1": 2.0 }
  ],
  "query": {
    "match": {
      "country": "中国"
    }
  }
}
The search above returns:
"hits" : [
  {
    "_index" : "twitter", "_type" : "_doc", "_id" : "1", "_score" : 2.6706278,
    "_source" : {
      "user" : "双榆树-张三", "message" : "今儿天气不错啊,出去转转去",
      "uid" : 2, "age" : 20, "city" : "北京", "province" : "北京", "country" : "中国",
      "address" : "中国北京市海淀区",
      "location" : { "lat" : "39.970718", "lon" : "116.325747" }
    }
  },
  {
    "_index" : "twitter", "_type" : "_doc", "_id" : "2", "_score" : 2.6706278,
    "_source" : {
      "user" : "虹桥-老吴", "message" : "好友来了都今天我生日,好友来了,什么 birthday happy 就成!",
      "uid" : 2, "age" : 90, "city" : "上海", "province" : "上海", "country" : "中国",
      "address" : "中国上海市闵行区",
      "location" : { "lat" : "31.175927", "lon" : "121.383328" }
    }
  },
  {
    "_index" : "twitter", "_type" : "_doc", "_id" : "3", "_score" : 2.6706278,
    "_source" : {
      "user" : "东城区-李四", "message" : "happy birthday!",
      "uid" : 4, "age" : 30, "city" : "北京", "province" : "北京", "country" : "中国",
      "address" : "中国北京市东城区",
      "location" : { "lat" : "39.893801", "lon" : "116.408986" }
    }
  },
  {
    "_index" : "twitter1", "_type" : "_doc", "_id" : "1", "_score" : 0.53412557,
    "_source" : {
      "user" : "双榆树-张三", "message" : "今儿天气不错啊,出去转转去",
      "uid" : 2, "age" : 20, "city" : "北京", "province" : "北京", "country" : "中国",
      "address" : "中国北京市海淀区",
      "location" : { "lat" : "39.970718", "lon" : "116.325747" }
    }
  },
  {
    "_index" : "twitter1", "_type" : "_doc", "_id" : "2", "_score" : 0.53412557,
    "_source" : {
      "user" : "虹桥-老吴", "message" : "好友来了都今天我生日,好友来了,什么 birthday happy 就成!",
      "uid" : 2, "age" : 90, "city" : "上海", "province" : "上海", "country" : "中国",
      "address" : "中国上海市闵行区",
      "location" : { "lat" : "31.175927", "lon" : "121.383328" }
    }
  },
  {
    "_index" : "twitter1", "_type" : "_doc", "_id" : "3", "_score" : 0.53412557,
    "_source" : {
      "user" : "东城区-李四", "message" : "happy birthday!",
      "uid" : 4, "age" : 30, "city" : "北京", "province" : "北京", "country" : "中国",
      "address" : "中国北京市东城区",
      "location" : { "lat" : "39.893801", "lon" : "116.408986" }
    }
  }
]
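If an index matches more than one entry in indices_boost, the first matching entry is used. For example, if an index belongs to the city_shanghai alias and also matches a twitter* pattern listed after it, the boost of 10.0 applies.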
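The first-match rule can be illustrated with a small model in plain Python. This is a simplified sketch and not Elasticsearch's actual implementation: the ALIASES table and the resolve_boost function are hypothetical names, and wildcard matching is approximated with fnmatchcase from the standard library.

```python
from fnmatch import fnmatchcase

# Hypothetical alias table for this sketch: alias name -> indices it points to.
ALIASES = {"city_shanghai": {"twitter"}}


def resolve_boost(index, entries, aliases=ALIASES, default=1.0):
    """Return the boost of the first entry that covers `index`.

    Each entry is (name, boost), where name is an index name, an alias,
    or a wildcard pattern. Later entries are ignored once one matches.
    """
    for name, boost in entries:
        if index in aliases.get(name, ()) or fnmatchcase(index, name):
            return boost
    return default


# Entries from the second search, in request order.
entries = [("city_shanghai", 10.0), ("twitter1", 2.0)]
print(resolve_boost("twitter", entries))   # 10.0
print(resolve_boost("twitter1", entries))  # 2.0

# An index covered by both the alias and a later pattern takes the first entry.
overlapping = [("city_shanghai", 10.0), ("twitter*", 2.0)]
print(resolve_boost("twitter", overlapping))   # 10.0
print(resolve_boost("twitter1", overlapping))  # 2.0

# Dividing the reported twitter score (2.6706278) by its boost of 10.0 gives
# the unboosted score. Multiplying it back should approximately reproduce
# both reported scores.
base_score = 0.26706278
for index in ("twitter", "twitter1"):
    print(index, resolve_boost(index, entries) * base_score)
```

The last loop shows why the two groups of scores in the results differ by a factor of five: both indices share the same unboosted score, and the boosts are 10.0 and 2.0.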