[Elasticsearch (9): Advanced] term for exact-value queries, match on .keyword for exact text queries
I. term: exact-value queries
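- The term query returns documents that contain an exact term in the provided field.
- To query text field values, use match. To query exact values such as numbers, use term.
- Why avoid using term on text fields?
  By default, Elasticsearch changes the values of text fields during analysis (for example, by splitting them into tokens and lowercasing them). This makes exact matches against text field values hard to find.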
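To make the point about analysis concrete, here is a minimal Python sketch of what happens to a text value before it is indexed. It assumes the field uses the default standard analyzer and approximates it as "split on non-alphanumeric characters, then lowercase"; the real analyzer follows Unicode text segmentation rules. The helper names `analyze` and `term_matches` are made up for illustration and are not part of any Elasticsearch API.

```python
import re

def analyze(text):
    """Rough stand-in for the standard analyzer (an assumption, not the real one):
    split on non-alphanumeric characters and lowercase each token."""
    return [tok.lower() for tok in re.findall(r"[A-Za-z0-9]+", text)]

def term_matches(indexed_tokens, query_value):
    """A term query does not analyze its input: it looks for one identical token."""
    return query_value in indexed_tokens

indexed = analyze("467 Hutchinson Court")
print(indexed)                                        # ['467', 'hutchinson', 'court']
print(term_matches(indexed, "467 Hutchinson Court"))  # False: no single token equals the whole string
print(term_matches(indexed, "Hutchinson"))            # False: the indexed token is lowercase
print(term_matches(indexed, "hutchinson"))            # True
```

In this simplified model, a term query on a text field only succeeds when the query value happens to equal one of the analyzed tokens, which is rarely what the original string looked like.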
Query the documents whose age is 33:
GET bank/_search
{
  "query": {
    "term": {
      "age": 33
    }
  }
}
Response:
{
  "took" : 1,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 50,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "bank",
        "_type" : "account",
        "_id" : "18",
        "_score" : 1.0,
        "_source" : {
          "account_number" : 18,
          "balance" : 4180,
          "firstname" : "Dale",
          "lastname" : "Adams",
          "age" : 33,
          "gender" : "M",
          "address" : "467 Hutchinson Court",
          "employer" : "Boink",
          "email" : "daleadams@boink.com",
          "city" : "Orick",
          "state" : "MD"
        }
      },
      ...
    ]
  }
}
II. match and .keyword: exact text queries
The following compares match, match_phrase, and match on a field with the .keyword suffix.
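1. match: full-text query
The query text for address is analyzed into terms, and any document whose address contains at least one of those terms is returned, ranked by relevance score.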
GET bank/_search
{
  "query": {
    "match": {
      "address": "467 Hutchinson Court"
    }
  }
}
Response:
{
  "took" : 17,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 120,
      "relation" : "eq"
    },
    "max_score" : 14.617203,
    "hits" : [
      {
        "_index" : "bank",
        "_type" : "account",
        "_id" : "18",
        "_score" : 14.617203,
        "_source" : {
          "account_number" : 18,
          "balance" : 4180,
          "firstname" : "Dale",
          "lastname" : "Adams",
          "age" : 33,
          "gender" : "M",
          "address" : "467 Hutchinson Court",
          "employer" : "Boink",
          "email" : "daleadams@boink.com",
          "city" : "Orick",
          "state" : "MD"
        }
      },
      {
        "_index" : "bank",
        "_type" : "account",
        "_id" : "53",
        "_score" : 5.990829,
        "_source" : {
          "account_number" : 53,
          "balance" : 28101,
          "firstname" : "Kathryn",
          "lastname" : "Payne",
          "age" : 29,
          "gender" : "F",
          "address" : "467 Louis Place",
          "employer" : "Katakana",
          "email" : "kathrynpayne@katakana.com",
          "city" : "Harviell",
          "state" : "SD"
        }
      },
      {
        "_index" : "bank",
        "_type" : "account",
        "_id" : "56",
        "_score" : 2.1248586,
        "_source" : {
          "account_number" : 56,
          "balance" : 14992,
          "firstname" : "Josie",
          "lastname" : "Nelson",
          "age" : 32,
          "gender" : "M",
          "address" : "857 Tabor Court",
          "employer" : "Emtrac",
          "email" : "josienelson@emtrac.com",
          "city" : "Sunnyside",
          "state" : "UT"
        }
      },
      ...
    ]
  }
}
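The "any term matches" behaviour seen in this response can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `analyze` is a rough approximation of the default standard analyzer, `match_any` is a made-up helper, relevance scoring is ignored entirely, and the last address in the list is invented for contrast.

```python
import re

def analyze(text):
    """Rough stand-in for the standard analyzer (an assumption, not the real one)."""
    return [tok.lower() for tok in re.findall(r"[A-Za-z0-9]+", text)]

def match_any(field_value, query_text):
    """Default match behaviour (operator OR): at least one query token occurs in the field."""
    return bool(set(analyze(query_text)) & set(analyze(field_value)))

addresses = [
    "467 Hutchinson Court",
    "467 Louis Place",      # shares the token "467"
    "857 Tabor Court",      # shares the token "court"
    "1 Example Road",       # invented for contrast, shares nothing
]
hits = [a for a in addresses if match_any(a, "467 Hutchinson Court")]
print(hits)  # ['467 Hutchinson Court', '467 Louis Place', '857 Tabor Court']
```

The three matching addresses are the same ones shown in the response; in Elasticsearch they are additionally ordered by score, so the document that matches all three terms comes first.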
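2. match_phrase: phrase query
The query text for address is still analyzed into terms, but they are treated as a single phrase: a document is returned only if its address contains all of the terms, adjacent to each other and in the same order.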
GET bank/_search
{
  "query": {
    "match_phrase": {
      "address": "467 Hutchinson"
    }
  }
}
Response:
{
  "took" : 1,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 1,
      "relation" : "eq"
    },
    "max_score" : 12.492344,
    "hits" : [
      {
        "_index" : "bank",
        "_type" : "account",
        "_id" : "18",
        "_score" : 12.492344,
        "_source" : {
          "account_number" : 18,
          "balance" : 4180,
          "firstname" : "Dale",
          "lastname" : "Adams",
          "age" : 33,
          "gender" : "M",
          "address" : "467 Hutchinson Court",
          "employer" : "Boink",
          "email" : "daleadams@boink.com",
          "city" : "Orick",
          "state" : "MD"
        }
      }
    ]
  }
}
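Phrase matching can be modelled as "the query tokens appear consecutively, in order, in the field tokens". The Python sketch below illustrates that idea with the default slop of 0. The helper names are made up, and `analyze` is again only an approximation of the standard analyzer, so treat this as a model of the behaviour rather than how Elasticsearch implements it.

```python
import re

def analyze(text):
    """Rough stand-in for the standard analyzer (an assumption, not the real one)."""
    return [tok.lower() for tok in re.findall(r"[A-Za-z0-9]+", text)]

def phrase_matches(field_value, query_text):
    """True if the query tokens occur back to back, in order, in the field tokens."""
    field, phrase = analyze(field_value), analyze(query_text)
    n = len(phrase)
    # Slide a window of length n over the field tokens and compare it to the phrase.
    return n > 0 and any(field[i:i + n] == phrase for i in range(len(field) - n + 1))

print(phrase_matches("467 Hutchinson Court", "467 Hutchinson"))  # True
print(phrase_matches("467 Hutchinson Court", "Hutchinson 467"))  # False: wrong order
print(phrase_matches("467 Louis Place", "467 Hutchinson"))       # False: only one term present
```

This is why the query above returns a single hit, while the match query with similar text returned 120.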
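3. match on the field with the .keyword suffix
This is an exact query on the keyword sub-field: address.keyword stores the whole address as a single, unanalyzed value, so only documents whose address is exactly equal to the query string are returned.
Comparing 2 and 3 with the same query value shows the difference between match_phrase and .keyword: "467 Hutchinson" matches as a phrase, but it is not the complete address, so the query below returns no hits.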
GET bank/_search
{
  "query": {
    "match": {
      "address.keyword": "467 Hutchinson"
    }
  }
}
Response:
{
  "took" : 4,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 0,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [ ]
  }
}
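The empty result follows from how a keyword sub-field is compared. As a minimal sketch, assuming address.keyword is a plain keyword field with no normalizer (which is what default dynamic mapping creates for strings), the comparison reduces to string equality. The helper name `keyword_matches` is hypothetical.

```python
def keyword_matches(field_value, query_text):
    """The keyword sub-field keeps the whole value as one unanalyzed token,
    so a query matches only when the two strings are identical."""
    return field_value == query_text

stored = "467 Hutchinson Court"
print(keyword_matches(stored, "467 Hutchinson"))        # False: only part of the address
print(keyword_matches(stored, "467 Hutchinson Court"))  # True: the full, exact value
print(keyword_matches(stored, "467 hutchinson court"))  # False: comparison is case-sensitive
```

Under that assumption, querying address.keyword with the full value "467 Hutchinson Court" would be expected to match the document with _id 18.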
* As a rule of thumb: use match for full-text (text) fields, and term for all other, non-text fields.