PHP search tokenization (jieba word segmentation)

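Chinese full-text search needs a word segmenter, because the text has no spaces between words. Common choices include jieba and SCWS; jieba is among the more widely used.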

Enable the following PHP extensions:

pdo_sqlite

sqlite3

mbstring
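As a quick sanity check, the extension list above can be verified from PHP itself. The following is a minimal sketch; `missingExtensions()` is a hypothetical helper written for this illustration and is not part of any package used in this post.

```php
<?php
// Sketch: report which of the required extensions are not loaded.
// missingExtensions() is a hypothetical helper, not part of any package.
function missingExtensions(array $required): array
{
    return array_values(array_filter(
        $required,
        fn (string $ext): bool => !extension_loaded($ext)
    ));
}

$missing = missingExtensions(['pdo_sqlite', 'sqlite3', 'mbstring']);

echo $missing === []
    ? "All required extensions are loaded.\n"
    : 'Missing extensions: ' . implode(', ', $missing) . "\n";
```

Run it with the same PHP binary that serves the application, since the CLI and the web server may load different php.ini files.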

1. Install the package with Composer

composer require vanry/laravel-scout-tntsearch

2. Add the providers to the 'providers' array in config/app.php

'providers' => [
    // ...

    /**
     * TNTSearch full-text search
     */
    Laravel\Scout\ScoutServiceProvider::class,
    Vanry\Scout\TNTSearchScoutServiceProvider::class,
],

3. Install the Chinese word segmenter (jieba)

composer require fukuball/jieba-php
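To see what the segmenter produces on its own, jieba-php can be called directly, outside Scout. The sketch below assumes the script sits in the project root next to vendor/, and the sample sentence is only an illustration. The 1024M memory limit is a generous guess, since jieba-php loads its dictionary into memory; the small dictionary may need less.

```php
<?php
// Assumption: this script lives in the project root, next to vendor/.
$autoload = __DIR__ . '/vendor/autoload.php';

if (is_file($autoload)) {
    require $autoload;
}

if (class_exists(\Fukuball\Jieba\Jieba::class)) {
    // jieba-php keeps its dictionary in memory, so raise the limit.
    ini_set('memory_limit', '1024M');

    // Use the small dictionary, matching 'dict' => 'small' in the Scout config.
    \Fukuball\Jieba\Jieba::init(['dict' => 'small']);
    \Fukuball\Jieba\Finalseg::init();

    // cut() returns the sentence split into an array of words.
    $words = \Fukuball\Jieba\Jieba::cut('我来到北京清华大学');
    print_r($words);
} else {
    echo "jieba-php not found; run: composer require fukuball/jieba-php\n";
}
```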

4. Publish the configuration file

php artisan vendor:publish --provider="Laravel\Scout\ScoutServiceProvider"

5. Add a tntsearch entry to config/scout.php

 'tntsearch' => [
    'storage' => storage_path('indexes'), // must be writable
    'fuzziness' => env('TNTSEARCH_FUZZINESS', false),
    'searchBoolean' => env('TNTSEARCH_BOOLEAN', false),
    'asYouType' => false,

    'fuzzy' => [
        'prefix_length' => 2,
        'max_expansions' => 50,
        'distance' => 2,
    ],

    'tokenizer' => [
        'driver' => env('TNTSEARCH_TOKENIZER', 'default'),

        'jieba' => [
            'dict' => 'small',
            //'user_dict' => resource_path('dicts/mydict.txt'), // path to a custom dictionary
        ],

        'analysis' => [
            'result_type' => 2,
            'unit_word' => true,
            'differ_max' => true,
        ],

        'scws' => [
            'charset' => 'utf-8',
            'dict' => '/usr/local/scws/etc/dict.utf8.xdb',
            'rule' => '/usr/local/scws/etc/rules.utf8.ini',
            'multi' => 1,
            'ignore' => true,
            'duality' => false,
        ],
    ],

    'stopwords' => [
        '的',
        '了',
        '而是',
    ],
],
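The 'stopwords' list is presumably there to drop very common words that add little to relevance. The sketch below only illustrates that idea in plain PHP; `removeStopwords()` is a hypothetical helper and the token list is hand-written, so this is not how the package implements it internally.

```php
<?php
// Hypothetical helper: drop every token that appears in the stopword list.
function removeStopwords(array $tokens, array $stopwords): array
{
    // array_diff keeps the original keys, so reindex with array_values.
    return array_values(array_diff($tokens, $stopwords));
}

$stopwords = ['的', '了', '而是'];                    // same list as in the config
$tokens    = ['我', '买', '了', '新', '的', '手机'];  // hand-written sample tokens

$filtered = removeStopwords($tokens, $stopwords);
print_r($filtered); // the tokens that remain: 我, 买, 新, 手机
```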

6. Add the settings to .env

SCOUT_DRIVER=tntsearch

TNTSEARCH_TOKENIZER=jieba

7. In the controller:

public function search(){
    $data = Article::search('tnt')->get()->toArray();
    dd($data);
}

8. In the model, specify the fields to index for search:

namespace App\Models;

use Illuminate\Database\Eloquent\Model;
use Laravel\Scout\Searchable;

class Article extends Model
{
    use Searchable;

    /**
     * Fields to be indexed
     *
     * @return array
     */
    public function toSearchableArray()
    {
        return $this->only('id', 'title', 'content');

        // To index every attribute instead:
        // return $this->toArray();
    }
}
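The effect of restricting the searchable array to a few fields can be approximated in plain PHP. This is a rough sketch only: `onlyKeys()` is a hypothetical stand-in for the model's `only()` call, and the sample row (including the `password` column) is made up for illustration.

```php
<?php
// Hypothetical stand-in for $model->only(...): keep just the listed keys.
function onlyKeys(array $attributes, array $keys): array
{
    return array_intersect_key($attributes, array_flip($keys));
}

// Made-up row; 'password' stands for any column that should not be indexed.
$row = [
    'id'       => 1,
    'title'    => 'Hello',
    'content'  => 'World',
    'password' => 'secret',
];

$searchable = onlyKeys($row, ['id', 'title', 'content']);
print_r($searchable); // 'password' is left out of the searchable data
```

Indexing fewer fields keeps the index smaller and avoids exposing columns that should never be searchable.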

9. Build the index:

php artisan scout:import "App\Models\Article"
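Since the 'storage' path in the configuration must be writable, it can help to check the directory before building the index. The following is a minimal sketch: `indexDirIsReady()` is a hypothetical helper, and a temporary directory stands in for `storage_path('indexes')` so the snippet runs outside Laravel.

```php
<?php
// Hypothetical helper: make sure the index directory exists and is writable.
function indexDirIsReady(string $dir): bool
{
    // Create the directory (and any parents) if it does not exist yet.
    if (!is_dir($dir) && !@mkdir($dir, 0775, true) && !is_dir($dir)) {
        return false;
    }

    return is_writable($dir);
}

// Stand-in path; inside Laravel this would be storage_path('indexes').
$dir = sys_get_temp_dir() . '/indexes-demo';

var_dump(indexDirIsReady($dir));
```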

10. Visit the route to see the search results.
