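As mentioned in the previous post, query mode not only filters records but also scores every matching record, that is, it measures how closely the record matches the conditions. We will leave scoring for a later post; this one covers only query searches.
Queries fall into two groups: exact-value queries and full-text queries. Exact-value queries target non-text fields, while full-text queries generally target text fields. To run an exact-value query against a text field, use fields to define a keyword sub-field under that text field. At index time a text field is run through an analyzer and broken into individual terms; at query time the search input goes through the same analysis and is then matched term by term. So keep in mind that the condition you type is not necessarily what is finally searched for, because it is analyzed first.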
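To make the effect of analysis concrete, here is a minimal sketch in plain Scala (no Elasticsearch involved). The analyze function is only a rough stand-in for the default analyzer, assumed here to lowercase the text and split it on non-alphanumeric characters; the object and method names are made up for illustration.

```scala
import java.util.Locale

object AnalysisSketch {
  // Rough stand-in for the default analyzer (an assumption of this sketch):
  // lowercase the text, then split on anything that is not a letter or digit.
  def analyze(text: String): List[String] =
    text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+").filter(_.nonEmpty).toList

  // Exact-value lookup against a text field: the input is NOT analyzed,
  // so it has to equal one of the indexed terms exactly.
  def termOnText(indexed: String, value: String): Boolean =
    analyze(indexed).contains(value)

  // Exact-value lookup against a keyword sub-field: the whole original
  // string is kept as one single term.
  def termOnKeyword(indexed: String, value: String): Boolean =
    indexed == value

  // Full-text lookup: the input is analyzed first, then compared term by term.
  def matchOnText(indexed: String, query: String): Boolean = {
    val tokens = analyze(indexed)
    analyze(query).exists(t => tokens.contains(t))
  }
}
```

With this sketch, AnalysisSketch.termOnText("880 Holmes Lane", "Holmes") is false because the indexed term is the lowercased "holmes", while AnalysisSketch.matchOnText("880 Holmes Lane", "Holmes") is true because the input is lowercased too before matching.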
Let's start with a few exact-value query examples:
POST /bank/_search
{
  "query" : {
    "term" : { "state.keyword": "IL" }
  }
}

POST /bank/_search
{
  "query" : {
    "terms" : { "state.keyword": ["IL","WA"] }
  }
}

POST /bank/_search
{
  "query" : {
    "range" : { "age": { "gte" : 20, "lte" : 40 } }
  }
}

POST /bank/_search
{
  "query" : {
    "prefix" : { "address.keyword": "880" }
  }
}

POST /bank/_search
{
  "query" : {
    "wildcard": { "address.keyword": "*Holmes*" }
  }
}

POST /bank/_search
{
  "query" : {
    "regexp": { "address.keyword": ".*Holmes.*" }
  }
}
The same queries expressed in elastic4s:
val qTerm = search("bank").query(termQuery("state.keyword","IL"))
val qTerms = search("bank").query(termsQuery("state.keyword","IL","WA"))
val qRange = search("bank").query(rangeQuery("age").gte(20).lte(40))
val qPrefix = search("bank").query(prefixQuery("address.keyword","880"))
val qWildcard = search("bank").query(wildcardQuery("address.keyword","*Holmes*"))
val qRegex = search("bank").query(regexQuery("address.keyword",".*Holmes.*"))
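The prefix, wildcard and regexp queries above all compare a pattern against the whole stored keyword value. The following plain-Scala sketch imitates that behaviour so the differences are easy to see. It is only an approximation: Java regular expressions are used as a stand-in for the Lucene regexp syntax, which is not identical, and all names are made up for illustration.

```scala
import java.util.regex.Pattern

object KeywordPatternSketch {
  // prefix: the stored value must start with the given string.
  def prefixMatches(term: String, prefix: String): Boolean =
    term.startsWith(prefix)

  // wildcard: `*` stands for any run of characters (including none),
  // `?` for exactly one character; everything else is taken literally.
  def wildcardMatches(term: String, pattern: String): Boolean = {
    val regex = pattern.toList.map {
      case '*' => ".*"
      case '?' => "."
      case c   => Pattern.quote(c.toString)
    }.mkString
    term.matches(regex)
  }

  // regexp: the pattern has to match the whole value, not just a part of it.
  // String.matches has the same whole-string behaviour.
  def regexpMatches(term: String, pattern: String): Boolean =
    term.matches(pattern)
}
```

For the value "880 Holmes Lane", the pattern "*Holmes*" matches but "*holmes*" does not, since a keyword value is stored as typed and compared case-sensitively by default. Likewise the regexp "Holmes" alone does not match, which is why the example uses ".*Holmes.*".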
The simplest full-text query is the match query:
POST /bank/_search
{
  "query" : {
    "match" : { "address" : "holmes" }
  }
}

val qMatch = search("bank").query(matchQuery("address","holmes"))
That was a single-word query. A multi-word full-text query looks like this:
POST /bank/_search
{
  "query" : {
    "match" : { "address" : "holmes lane" }
  }
}

val qMMatch = search("bank").query(matchQuery("address","holmes lane"))
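Here a problem shows up: the results contain not only "880 Holmes Lane" but also "685 School Lane". The analyzer splits "holmes lane" into two terms, "holmes" and "lane", and the default operator between the terms of a multi-word query is or, so a record matches as long as it contains either holmes or lane. We can combine the terms with and instead: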
POST /bank/_search
{
  "query" : {
    "match" : {
      "address" : {
        "query": "holmes lane",
        "operator": "and"
      }
    }
  }
}

val qMMatchAnd = search("bank").query(matchQuery("address","holmes lane").operator("and"))
Now only "880 Holmes Lane" is left in the results. The following query has the same effect:
POST /bank/_search
{
  "query" : {
    "match" : {
      "address" : {
        "query": "holmes lane",
        "minimum_should_match": "100%"
      }
    }
  }
}

val qMMatchMin = search("bank").query(matchQuery("address","holmes lane").minimumShouldMatch("100%"))
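The relationship between or, and and minimum_should_match can be imitated in a few lines of plain Scala. This is only a sketch under stated assumptions: analyze is a simplified stand-in for the real analyzer, a percentage is rounded down to a whole number of terms, at least one term is always required, and all names are made up for illustration.

```scala
import java.util.Locale

object MatchOperatorSketch {
  // Simplified analyzer (assumption): lowercase and split on non-alphanumerics.
  def analyze(text: String): List[String] =
    text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+").filter(_.nonEmpty).toList

  // Number of query terms a percentage translates to, rounded down.
  def requiredTerms(total: Int, percent: Int): Int = total * percent / 100

  // A document matches when at least `required` distinct query terms occur in it.
  def matches(indexed: String, query: String, required: Int): Boolean = {
    val tokens = analyze(indexed).toSet
    analyze(query).distinct.count(t => tokens.contains(t)) >= required
  }

  // operator "or" (the default): any single term is enough.
  def matchOr(indexed: String, query: String): Boolean =
    matches(indexed, query, 1)

  // operator "and": every term has to be present.
  def matchAnd(indexed: String, query: String): Boolean =
    matches(indexed, query, analyze(query).distinct.size)

  // minimum_should_match given as a percentage; at least one term is
  // always required (assumption of this sketch).
  def matchMin(indexed: String, query: String, percent: Int): Boolean = {
    val total = analyze(query).distinct.size
    matches(indexed, query, math.max(1, requiredTerms(total, percent)))
  }
}
```

Here matchOr("685 School Lane", "holmes lane") is true because "lane" is present, while matchAnd and matchMin with 100 both reject that address and accept only "880 Holmes Lane". With two terms, 100% means both of them, which is why the last two queries return the same result.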
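All the examples so far are simple queries, that is, single-clause queries. In practice we usually need to combine several conditions with and/or into a compound query. The most representative compound query is boolQuery, whose format is as follows: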
GET /bank/_search
{
  "query": {
    "bool": {
      "must": [        // lastname = duke and gender.keyword = M
        { "match": { "lastname": "duke" }},
        { "term": { "gender.keyword": "M" }}
      ],
      "must_not": [    // and firstname.keyword != Jackson and city.keyword != Brogan
        { "term": { "firstname.keyword": "Jackson"}},
        { "term": { "city.keyword": "Brogan" }}
      ],
      "should": [      // email.keyword like *.cn or age >= 80; optional here because must/filter are present, a match only raises the score
        { "wildcard": { "email.keyword": "*.cn" }},
        { "range": { "age": {"gte" : 80}}}
      ],
      "filter": [      // and state.keyword in (IL,WA,TA) and balance >= 100000, without affecting the score
        { "terms": { "state.keyword": ["IL","WA","TA"] }},
        { "range": { "balance": { "gte": 100000 }}}
      ]
    }
  }
}
In elastic4s it is expressed like this:
val qBool = search("bank").query(
  boolQuery().must(
    matchQuery("lastname","duke"),
    termQuery("gender.keyword","M")
  ).not(
    termQuery("firstname.keyword","Jackson"),
    termQuery("city.keyword","Brogan")
  ).should(
    wildcardQuery("email.keyword","*.cn"),
    rangeQuery("age").gte(80)
  ).filter(
    termsQuery("state.keyword",Seq("IL","WA","TA")),
    rangeQuery("balance").gte(100000)
  )
)
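The must, must_not, should and filter sections in the example above can appear in a boolQuery on their own or in any combination. A boolQuery can also be nested inside any of these sections, as follows: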
GET /bank/_search
{
  "query": {
    "bool": {
      "must": [
        { "match": { "lastname": "duke" }},
        { "term": { "gender.keyword": "M" }}
      ],
      "must_not": [
        { "term": { "firstname.keyword": "Jackson"}},
        { "term": { "city.keyword": "Brogan" }}
      ],
      "should": [
        { "wildcard": { "email.keyword": "*.cn" }},
        { "range": { "age": {"gte" : 80}}}
      ],
      "filter": [
        { "terms": { "state.keyword": ["IL","WA","TA"] }},
        { "range": { "balance": { "gte": 100000 }}},
        { "bool" : {
            "should" : [
              {"range" : {"balance" :{"gte" : 1000}}}
            ]
          }
        }
      ]
    }
  }
}
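How the four sections interact is easy to get wrong, especially should. The plain-Scala sketch below models only the matching logic of a bool query (no scoring, no Elasticsearch). It relies on the documented default that a should clause is required only when the bool query has no must and no filter section. The Account class, the sample record and all names are hypothetical and exist only for this illustration.

```scala
object BoolSketch {
  // Hypothetical record type, loosely modelled on the bank index fields used above.
  final case class Account(lastname: String, gender: String, state: String,
                           balance: Long, age: Int, email: String)

  type Clause = Account => Boolean

  final case class Bool(must: List[Clause] = Nil, mustNot: List[Clause] = Nil,
                        should: List[Clause] = Nil, filter: List[Clause] = Nil) {
    // Default minimum_should_match: 1 when there are should clauses but no
    // must or filter clauses, otherwise 0.
    val minShould: Int =
      if (should.nonEmpty && must.isEmpty && filter.isEmpty) 1 else 0

    def matches(a: Account): Boolean =
      must.forall(c => c(a)) &&
        filter.forall(c => c(a)) &&
        !mustNot.exists(c => c(a)) &&
        should.count(c => c(a)) >= minShould
  }

  val lastnameDuke: Clause = a => a.lastname.equalsIgnoreCase("duke")
  val genderM: Clause      = a => a.gender == "M"
  val emailCn: Clause      = a => a.email.endsWith(".cn")
  val age80: Clause        = a => a.age >= 80
  val stateIn: Clause      = a => Set("IL", "WA", "TA").contains(a.state)
  val balance100k: Clause  = a => a.balance >= 100000

  // A nested bool is just another clause.
  val nested: Bool         = Bool(should = List[Clause](a => a.balance >= 1000))
  val nestedClause: Clause = a => nested.matches(a)

  val full: Bool = Bool(
    must   = List(lastnameDuke, genderM),
    should = List(emailCn, age80),
    filter = List(stateIn, balance100k, nestedClause)
  )
  val shouldOnly: Bool = Bool(should = List(emailCn, age80))

  // Made-up sample record.
  val duke: Account = Account("Duke", "M", "IL", 120000L, 36, "duke@example.com")
}
```

In this sketch BoolSketch.full.matches(BoolSketch.duke) is true even though neither should clause holds, because must and filter are present and should then only influences the score. BoolSketch.shouldOnly.matches(BoolSketch.duke) is false, since a bool query made of should clauses alone needs at least one of them to match; the nested bool inside filter behaves the same way.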