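This post covers Elasticsearch's dynamic mapping strategy in three parts. The request syntax, which uses mapping types such as user and my_type, matches Elasticsearch 6.x.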
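1. dynamic: handling unknown fields
When a document is indexed with a field that the mapping does not define, what happens depends on the dynamic setting in the mapping. It accepts three values:
1) true: unknown fields are mapped automatically; this is the Elasticsearch default
2) false: unknown fields are ignored; they are not indexed or searchable, though they still appear in _source
3) strict: unknown fields are rejected with an error
The following example demonstrates this: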
PUT /lib
{
  "settings": {
    "number_of_shards": 3,
    "number_of_replicas": 0
  },
  "mappings": {
    "user": {
      "dynamic": "strict",
      "properties": {
        "name": { "type": "text" },
        "address": {
          "type": "object",
          "dynamic": true
        }
      }
    }
  }
}
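The user type sets dynamic to strict, so indexing a document that contains any top-level field other than name and address fails with an error. The address field sets dynamic to true, so Elasticsearch automatically maps any new field that appears inside address (address is an object field and can contain sub-fields). In short, where the dynamic setting is placed determines its scope: a setting on an inner object overrides the one inherited from its parent.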
Example that succeeds:
PUT /lib/user/1
{
  "name": "lisi",
  "address": {
    "province": "beijing",
    "city": "beijing"
  }
}
Response:
{
  "_index": "lib",
  "_type": "user",
  "_id": "1",
  "_version": 1,
  "result": "created",
  "_shards": {
    "total": 1,
    "successful": 1,
    "failed": 0
  },
  "_seq_no": 0,
  "_primary_term": 1
}
Example that fails, because age is not defined in the mapping:
PUT /lib/user/2
{
  "name": "lisi",
  "age": 23,
  "address": {
    "province": "beijing",
    "city": "beijing"
  }
}
Response:
{
  "error": {
    "root_cause": [
      {
        "type": "strict_dynamic_mapping_exception",
        "reason": "mapping set to strict, dynamic introduction of [age] within [user] is not allowed"
      }
    ],
    "type": "strict_dynamic_mapping_exception",
    "reason": "mapping set to strict, dynamic introduction of [age] within [user] is not allowed"
  },
  "status": 400
}
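To make the scoping rule concrete, here is a minimal Python sketch of how the three dynamic values could be modeled. It is a toy model and not Elasticsearch's implementation: the function apply_dynamic, the exception class, and the placeholder type "auto" are hypothetical names introduced only for illustration.

```python
# Toy model of the "dynamic" setting; NOT Elasticsearch's implementation.

class StrictDynamicMappingError(Exception):
    """Hypothetical stand-in for strict_dynamic_mapping_exception."""


def apply_dynamic(mapping, doc, inherited=True, path="user"):
    """Return the part of doc that would be indexed, updating mapping in place.

    mapping:   {"dynamic": ..., "properties": {...}}; "dynamic" is optional
    inherited: the dynamic value of the enclosing object (True by default)
    """
    dynamic = mapping.get("dynamic", inherited)
    props = mapping.setdefault("properties", {})
    indexed = {}
    for field, value in doc.items():
        if field not in props:
            if dynamic == "strict":
                raise StrictDynamicMappingError(
                    f"dynamic introduction of [{field}] within [{path}] is not allowed"
                )
            if dynamic is False:
                continue  # ignored: not indexed (the real _source would still keep it)
            # dynamic is True: register a simplified mapping for the new field;
            # "auto" is a placeholder, not a real Elasticsearch type
            props[field] = {"type": "object" if isinstance(value, dict) else "auto"}
        if isinstance(value, dict):
            # Inner objects inherit the current setting unless they override it
            indexed[field] = apply_dynamic(props[field], value, dynamic, f"{path}.{field}")
        else:
            indexed[field] = value
    return indexed


mapping = {
    "dynamic": "strict",
    "properties": {
        "name": {"type": "text"},
        "address": {"type": "object", "dynamic": True},
    },
}

# New sub-fields of address are accepted because address overrides strict.
ok = apply_dynamic(mapping, {"name": "lisi", "address": {"province": "beijing", "city": "beijing"}})
print(sorted(mapping["properties"]["address"]["properties"]))

# A new top-level field is rejected because the type itself is strict.
try:
    apply_dynamic(mapping, {"name": "lisi", "age": 23})
except StrictDynamicMappingError as err:
    print("rejected:", err)
```

The recursion passes the current setting down as the inherited value, which mirrors how address can relax the strict setting of its parent.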
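2. date_detection: recognizing date-formatted strings as the date type
date_detection defaults to true, so a new string field whose value matches one of the default date formats, such as yyyy-MM-dd, is mapped as a date.
In the request below, the value 1990-12-12 has no declared field type. If name is not already defined in the mapping, it is mapped as date based on its format:
PUT /lib/user/1
{
  "name": "1990-12-12",
  "address": {
    "province": "beijing",
    "city": "beijing"
  }
}
If strings that look like dates should not be mapped as date, set date_detection to false when creating the index:
PUT /lib
{
  "settings": {
    "number_of_shards": 3,
    "number_of_replicas": 0
  },
  "mappings": {
    "user": {
      "date_detection": false
    }
  }
}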
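The effect of date_detection on a new string field can be sketched in Python as follows. This is a simplified model under stated assumptions: the two strptime patterns in ASSUMED_DATE_FORMATS only approximate the default date formats, and infer_string_type is a hypothetical helper, not an Elasticsearch API.

```python
from datetime import datetime

# Assumption: only these two simplified patterns are checked in this sketch.
ASSUMED_DATE_FORMATS = ["%Y-%m-%d", "%Y/%m/%d"]


def looks_like_date(value):
    """Return True if value parses with one of the assumed date patterns."""
    for fmt in ASSUMED_DATE_FORMATS:
        try:
            datetime.strptime(value, fmt)
            return True
        except ValueError:
            continue
    return False


def infer_string_type(value, date_detection=True):
    """Pick a type for a new, previously unmapped string field (toy model)."""
    if date_detection and looks_like_date(value):
        return "date"
    return "text"


print(infer_string_type("1990-12-12"))                        # date detection on
print(infer_string_type("1990-12-12", date_detection=False))  # date detection off
print(infer_string_type("lisi"))                              # not date-like
```

With detection switched off, the date-like string falls through to the ordinary string handling, which is the behavior the second request above asks for.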
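3. dynamic_templates: custom dynamic mapping templates
In the example below, the index my_index with type my_type declares no fields in its mapping; instead it defines a dynamic template that sets a field type and an analyzer. The template uses match and match_mapping_type: a new field whose name ends in _en and whose value is detected as a string is mapped with this template; any other field is not. Fields matched by the template use the built-in english analyzer, while unmatched text fields fall back to the default standard analyzer.
PUT /my_index
{
  "mappings": {
    "my_type": {
      "dynamic_templates": [
        {
          "en": {
            "match": "*_en",
            "match_mapping_type": "string",
            "mapping": {
              "type": "text",
              "analyzer": "english"
            }
          }
        }
      ]
    }
  }
}
PUT /my_index/my_type/1
{
  "title_en": "this is my dog"
}
PUT /my_index/my_type/2
{
  "title": "this is my cat"
}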
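The template selection for the two documents above can be sketched in Python. This is a rough model, not Elasticsearch's code: detected_type and pick_mapping are hypothetical helpers, and fnmatchcase stands in for the wildcard matching (it accepts more pattern syntax than a plain * wildcard).

```python
from fnmatch import fnmatchcase

DYNAMIC_TEMPLATES = [
    {
        "en": {
            "match": "*_en",
            "match_mapping_type": "string",
            "mapping": {"type": "text", "analyzer": "english"},
        }
    }
]


def detected_type(value):
    """Simplified JSON type detection; only a few types are modeled here."""
    if isinstance(value, bool):  # check bool before int: bool is an int subclass
        return "boolean"
    if isinstance(value, str):
        return "string"
    if isinstance(value, int):
        return "long"
    if isinstance(value, float):
        return "double"
    if isinstance(value, dict):
        return "object"
    return None


def pick_mapping(field, value, templates):
    """Return the mapping of the first template that matches, else None."""
    detected = detected_type(value)
    for entry in templates:
        for _name, tpl in entry.items():
            if "match" in tpl and not fnmatchcase(field, tpl["match"]):
                continue
            wanted = tpl.get("match_mapping_type")
            if wanted is not None and wanted != detected:
                continue
            return tpl["mapping"]
    return None  # no template matched: default dynamic mapping applies


print(pick_mapping("title_en", "this is my dog", DYNAMIC_TEMPLATES))
print(pick_mapping("title", "this is my cat", DYNAMIC_TEMPLATES))
```

Both conditions must hold for a template to apply, so a numeric field named count_en would also fall through to the default mapping in this model.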
Test: document 1 matched the template, so title_en uses the english analyzer. That analyzer removes stop words such as is, a, and an, so a search for is returns no documents:
GET /my_index/my_type/_search
{
  "query": {
    "match": {
      "title_en": "is"
    }
  }
}
Response:
{
  "took": 3,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 0,
    "max_score": null,
    "hits": []
  }
}
Test: document 2 did not match the template, so title uses the standard analyzer, which by default does not remove stop words such as is, a, and an. A search for is therefore finds the document:
GET /my_index/my_type/_search
{
  "query": {
    "match": {
      "title": "is"
    }
  }
}
Response:
{
  "took": 4,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 1,
    "max_score": 0.2876821,
    "hits": [
      {
        "_index": "my_index",
        "_type": "my_type",
        "_id": "2",
        "_score": 0.2876821,
        "_source": {
          "title": "this is my cat"
        }
      }
    ]
  }
}
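Why the two searches differ can be illustrated with a small Python sketch of stop word removal. It is only an approximation: ASSUMED_STOP_WORDS is an abbreviated list chosen for this example, the stemming that the real english analyzer also performs is omitted, and the function names are hypothetical.

```python
import re

# Assumption: abbreviated stop word list, enough for this example only.
ASSUMED_STOP_WORDS = {"a", "an", "and", "is", "the", "this"}


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def standard_like(text):
    """Rough stand-in for the standard analyzer: keeps every token."""
    return tokenize(text)


def english_like(text):
    """Rough stand-in for the english analyzer: drops stop words, no stemming."""
    return [t for t in tokenize(text) if t not in ASSUMED_STOP_WORDS]


def matches(query, doc_text, analyze):
    """A document matches if any analyzed query term is among its analyzed terms."""
    doc_terms = set(analyze(doc_text))
    return any(term in doc_terms for term in analyze(query))


print(english_like("this is my dog"))
print(matches("is", "this is my dog", english_like))
print(matches("is", "this is my cat", standard_like))
```

In this model the query text goes through the same analysis as the document, so with the english-like analyzer the query is has no terms left to match.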