Installing the Chinese analyzer for Elasticsearch (elasticsearch-analysis-ik)

2023-08-04 14:51:22

In the plugins directory of Elasticsearch, create a directory named ik:

cd /usr/local/elasticsearch-6.3.0/plugins

mkdir ik

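Download the elasticsearch-analysis-ik release that matches your Elasticsearch version (6.3.0 here), unpack it, and place the unpacked contents in this directory.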
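The unpacking step can also be scripted. The following is a minimal sketch, not the author's method: the helper name `unpack_plugin` and the archive file name in the usage note are assumptions made for illustration. It extracts the zip into `plugins/ik` and checks that `plugin-descriptor.properties` ends up directly inside that folder, which is where Elasticsearch expects to find it.

```python
import zipfile
from pathlib import Path


def unpack_plugin(zip_path, plugins_dir, name="ik"):
    """Extract a plugin zip into <plugins_dir>/<name> and return that path.

    `name` defaults to "ik" to mirror the directory created above;
    the function itself is a hypothetical helper, not part of Elasticsearch.
    """
    target = Path(plugins_dir) / name
    target.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(target)
    # Elasticsearch looks for this descriptor directly inside the plugin folder.
    if not (target / "plugin-descriptor.properties").is_file():
        raise FileNotFoundError(
            "plugin-descriptor.properties not found in " + str(target)
        )
    return target
```

A call might look like `unpack_plugin("elasticsearch-analysis-ik-6.3.0.zip", "/usr/local/elasticsearch-6.3.0/plugins")`, assuming the archive was downloaded to the current directory.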

Restart the Elasticsearch service:

elasticsearch restart

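Chinese word segmentation is now in effect. Documents indexed before the restart were tokenized with the old analyzer, so re-insert the data.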
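One point worth making explicit: a field is analyzed with IK only if its mapping names one of the plugin's analyzers (or the index default analyzer is set to one). The sketch below builds such an index-creation body for Elasticsearch 6.x, where mappings are keyed by type name. The choice of `ik_max_word` for indexing and `ik_smart` for searching, and the helper name `build_index_body`, are assumptions for illustration; the original post does not show its mapping.

```python
import json


def build_index_body(doc_type="employee", field="about"):
    """Return an index-creation body that analyzes `field` with IK.

    Assumes Elasticsearch 6.x, where mappings are nested under the type name.
    """
    return {
        "mappings": {
            doc_type: {
                "properties": {
                    field: {
                        "type": "text",
                        # Analyzer used when documents are indexed.
                        "analyzer": "ik_max_word",
                        # Analyzer used on the query string at search time.
                        "search_analyzer": "ik_smart",
                    }
                }
            }
        }
    }


# Print the body so it can be sent with PUT /megacorp when creating the index.
print(json.dumps(build_index_body(), ensure_ascii=False, indent=2))
```

The mapping has to be in place before the data is re-inserted, because the analyzer of an existing field cannot be changed afterwards.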

GET /megacorp/employee/_search
{
    "query" : {
        "match" : {
            "about" : "程序员 编程"
        }
    }
}

Search result:

{
  "took": 8,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 1,
    "max_score": 1.654172,
    "hits": [
      {
        "_index": "megacorp",
        "_type": "employee",
        "_id": "2",
        "_score": 1.654172,
        "_source": {
          "first_name": "张",
          "last_name": "三",
          "age": 24,
          "about": "一个PHP程序员，热爱编程，热爱生活，充满激情。",
          "interests": [
            "英雄联盟"
          ]
        }
      }
    ]
  }
}

Alternatively, install the plugin online with elasticsearch-plugin; the download can be slow.

./bin/elasticsearch-plugin install https://github.com/medcl/elasticsearch-analysis-ik/releases/download/v6.3.0/elasticsearch-analysis-ik-6.3.0.zip

-> Downloading https://github.com/medcl/elasticsearch-analysis-ik/releases/download/v6.3.0/elasticsearch-analysis-ik-6.3.0.zip

[=================================================] 100%  

-> Installed analysis-ik

After installation, a new folder appears under the plugins directory.
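Besides looking at the folder, the installation can be confirmed from the plain-text output of `GET _cat/plugins`. The sketch below parses such output; it assumes the default three columns (node name, component, version) and node names without spaces, and the sample line is hypothetical rather than captured from a real cluster.

```python
def plugin_components(cat_output):
    """Return the set of plugin component names found in _cat/plugins output.

    Assumes whitespace-separated columns: node name, component, version.
    """
    components = set()
    for line in cat_output.splitlines():
        parts = line.split()
        if len(parts) >= 2:
            components.add(parts[1])
    return components


# Hypothetical sample line for a single node named "node-1".
sample = "node-1 analysis-ik 6.3.0\n"
print("analysis-ik" in plugin_components(sample))
```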

Usage

GET _analyze?pretty
{
  "analyzer": "ik_smart",
  "text": "*国歌"
}

{
  "tokens": [
    {
      "token": "*",
      "start_offset": 0,
      "end_offset": 7,
      "type": "CN_WORD",
      "position": 0
    },
    {
      "token": "国歌",
      "start_offset": 7,
      "end_offset": 9,
      "type": "CN_WORD",
      "position": 1
    }
  ]
}

Another example:

GET _analyze?pretty
{
  "analyzer": "ik_smart",
  "text": "王者荣耀是最好玩的游戏"
}

{
  "tokens": [
    {
      "token": "王者",
      "start_offset": 0,
      "end_offset": 2,
      "type": "CN_WORD",
      "position": 0
    },
    {
      "token": "荣耀",
      "start_offset": 2,
      "end_offset": 4,
      "type": "CN_WORD",
      "position": 1
    },
    {
      "token": "是",
      "start_offset": 4,
      "end_offset": 5,
      "type": "CN_CHAR",
      "position": 2
    },
    {
      "token": "最",
      "start_offset": 5,
      "end_offset": 6,
      "type": "CN_CHAR",
      "position": 3
    },
    {
      "token": "好玩",
      "start_offset": 6,
      "end_offset": 8,
      "type": "CN_WORD",
      "position": 4
    },
    {
      "token": "的",
      "start_offset": 8,
      "end_offset": 9,
      "type": "CN_CHAR",
      "position": 5
    },
    {
      "token": "游戏",
      "start_offset": 9,
      "end_offset": 11,
      "type": "CN_WORD",
      "position": 6
    }
  ]
}
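To read the offsets in this response: each token's `start_offset` and `end_offset` delimit the slice of the input text it was cut from. The sketch below checks that against the tokens shown above. The helper name `offsets_match` is an assumption for illustration, and the comparison assumes the text contains only Basic Multilingual Plane characters and no Latin letters whose case the analyzer might change.

```python
def offsets_match(text, tokens):
    """Return True if every token equals text[start_offset:end_offset]."""
    return all(
        text[t["start_offset"]:t["end_offset"]] == t["token"] for t in tokens
    )


text = "王者荣耀是最好玩的游戏"

# Tokens copied from the ik_smart response above (type and position omitted).
tokens = [
    {"token": "王者", "start_offset": 0, "end_offset": 2},
    {"token": "荣耀", "start_offset": 2, "end_offset": 4},
    {"token": "是", "start_offset": 4, "end_offset": 5},
    {"token": "最", "start_offset": 5, "end_offset": 6},
    {"token": "好玩", "start_offset": 6, "end_offset": 8},
    {"token": "的", "start_offset": 8, "end_offset": 9},
    {"token": "游戏", "start_offset": 9, "end_offset": 11},
]

print(offsets_match(text, tokens))
```

For this sentence the tokens also cover the text with no gaps or overlaps, so concatenating them reproduces the input.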
