8. Lucene Multi-Field Search

2022-07-13 17:23:29

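I. Matching Across Multiple Fields

Sometimes the keyword a user enters needs to be matched against several fields at once.

For example, we want to search both the "title" and "content" fields for the keyword "组件刷新" (component refresh).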

// Multi-field query: searches several fields at once; the keyword is tokenized by the analyzer
String[] fields = {"title", "content"};
MultiFieldQueryParser multiFieldQuery = new MultiFieldQueryParser(fields, new IKAnalyzer());
Query query = multiFieldQuery.parse("组件刷新");
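
To make the expansion concrete, the sketch below models what the parser does conceptually: each analyzed term becomes a group that holds one clause per field. It is a plain-Java illustration rather than Lucene code. The class name MultiFieldExpansionSketch is hypothetical, and splitting "组件刷新" into "组件" and "刷新" is an assumption about how IKAnalyzer tokenizes the keyword.

```java
import java.util.ArrayList;
import java.util.List;

final class MultiFieldExpansionSketch {
    /** Renders one group per term, with one field:term clause per field inside each group. */
    static String expand(List<String> terms, List<String> fields) {
        List<String> groups = new ArrayList<>();
        for (String term : terms) {
            List<String> clauses = new ArrayList<>();
            for (String field : fields) {
                clauses.add(field + ":" + term);
            }
            groups.add("(" + String.join(" ", clauses) + ")");
        }
        return String.join(" ", groups);
    }

    public static void main(String[] args) {
        // Assumed tokens for the keyword "组件刷新"
        System.out.println(expand(List.of("组件", "刷新"), List.of("title", "content")));
        // prints: (title:组件 content:组件) (title:刷新 content:刷新)
    }
}
```

With the parser's default OR operator, a document matches as soon as any one of these clauses matches, so a hit in either field is enough.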

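Once the Query object is built, pass it to IndexSearcher to run the search.

// Create an index searcher on top of a reader (fsd is the opened FSDirectory)
DirectoryReader reader = DirectoryReader.open(fsd);
IndexSearcher searcher = new IndexSearcher(reader);
// Search and keep the top 10 hits
TopDocs docs = searcher.search(query, 10);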

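Appendix: Complete Code

// Imports needed by the enclosing test class (plus your JUnit @Test import).
// The IKAnalyzer package below is the one used by common distributions; adjust it to match your dependency.
import java.io.File;
import java.io.IOException;
import org.apache.lucene.document.Document;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.queryparser.classic.MultiFieldQueryParser;
import org.apache.lucene.queryparser.classic.ParseException;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.search.TopDocs;
import org.apache.lucene.store.FSDirectory;
import org.wltea.analyzer.lucene.IKAnalyzer;

@Test
public void multiTermQuery() throws ParseException {
    // Location of the Lucene index directory
    String indexDir = "E:\\develop\\demo\\lucene-learn\\lucene-index";
    File luceneIndexDirectory = new File(indexDir);
    // Open the index directory and a reader; try-with-resources closes both, even if the search fails
    try (FSDirectory fsd = FSDirectory.open(luceneIndexDirectory.toPath());
         DirectoryReader reader = DirectoryReader.open(fsd)) {
        // Create an index searcher
        IndexSearcher searcher = new IndexSearcher(reader);

        // Multi-field query: searches several fields at once; the keyword is tokenized by the analyzer
        String[] fields = {"title", "content"};
        MultiFieldQueryParser multiFieldQuery = new MultiFieldQueryParser(fields, new IKAnalyzer());
        Query query = multiFieldQuery.parse("组件刷新");

        // Search and keep the top 10 hits
        TopDocs docs = searcher.search(query, 10);

        // Print the stored fields of each hit
        for (ScoreDoc doc : docs.scoreDocs) {
            Document document = searcher.doc(doc.doc);
            System.out.println(document);
        }
    } catch (IOException e) {
        System.err.println("Failed to open or read the index directory");
        e.printStackTrace();
    }
}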

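II. Multi-Condition Filtered Search

Filtered search works like Taobao: you pick several conditions and search within them.

For example, when searching for phones we fix the brand to "华为" (Huawei) and then search for "5G手机" (5G phone). That is a multi-condition filtered search.

The conditions are combined with Boolean rules, one BooleanClause.Occur value per condition:
MUST      AND: the condition must match
MUST_NOT  NOT: the condition must not match
SHOULD    OR: the condition may match
FILTER    Filter: the condition must match, like MUST, but it does not contribute to the score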
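
The matching side of these rules can be pictured as set operations on document IDs. The sketch below is a plain-Java model under that assumption, not Lucene code: the class OccurSemanticsSketch and the document IDs are hypothetical, and scoring is ignored, which is why FILTER behaves exactly like MUST here.

```java
import java.util.Set;
import java.util.TreeSet;

final class OccurSemanticsSketch {
    /** MUST + MUST (or MUST + FILTER): documents present in both result sets. */
    static Set<Integer> and(Set<Integer> a, Set<Integer> b) {
        Set<Integer> result = new TreeSet<>(a);
        result.retainAll(b);
        return result;
    }

    /** SHOULD + SHOULD: documents present in at least one result set. */
    static Set<Integer> or(Set<Integer> a, Set<Integer> b) {
        Set<Integer> result = new TreeSet<>(a);
        result.addAll(b);
        return result;
    }

    /** MUST + MUST_NOT: documents in the first set that are absent from the second. */
    static Set<Integer> andNot(Set<Integer> a, Set<Integer> b) {
        Set<Integer> result = new TreeSet<>(a);
        result.removeAll(b);
        return result;
    }

    public static void main(String[] args) {
        Set<Integer> titleHits = Set.of(1, 2, 3);   // hypothetical docs matching the title condition
        Set<Integer> contentHits = Set.of(2, 3, 4); // hypothetical docs matching the content condition
        System.out.println(and(titleHits, contentHits));    // prints: [2, 3]
        System.out.println(or(titleHits, contentHits));     // prints: [1, 2, 3, 4]
        System.out.println(andNot(titleHits, contentHits)); // prints: [1]
    }
}
```

In Lucene itself, SHOULD clauses become optional once the query also contains a MUST or FILTER clause; they then only influence ranking.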

There are two ways to implement this:

1. Combined queries

Define several queries, then combine them with BooleanQuery, assigning a Boolean rule to each one.

// Query: title contains "刷新" (refresh)
QueryParser titleQueryParser = new QueryParser("title", new IKAnalyzer());
Query titleQuery = titleQueryParser.parse("刷新");
// Query: content contains "缓存" (cache)
QueryParser contentQueryParser = new QueryParser("content", new IKAnalyzer());
Query contentQuery = contentQueryParser.parse("缓存");
// Boolean combination:
// MUST + MUST = only documents that match both queries
Query query = new BooleanQuery.Builder()
    .add(titleQuery, BooleanClause.Occur.MUST)
    .add(contentQuery, BooleanClause.Occur.MUST)
    .build();

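2. Multi-field query

MultiFieldQueryParser also supports Boolean rules through its static parse method, which takes one query string, one field and one rule per array position.

// Fields to search
String[] fields = {"title", "content"};
// Query strings, one per field
String[] stringQuery = {"刷新", "缓存"};
// title=刷新 AND content=缓存
BooleanClause.Occur[] flags = {BooleanClause.Occur.MUST, BooleanClause.Occur.MUST};
Query query = MultiFieldQueryParser.parse(stringQuery, fields, flags, new IKAnalyzer());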
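
The three arrays are read position by position, so they must have the same length; Lucene throws an IllegalArgumentException when they differ. The sketch below models that pairing in plain Java. It is an illustration, not Lucene code: the class FieldQueryPairingSketch is hypothetical, and the prefix strings stand in for the Occur flags using query-syntax notation ("+" for MUST, "-" for MUST_NOT, empty for SHOULD).

```java
final class FieldQueryPairingSketch {
    /** Pairs queries[i] with fields[i] and prefixes[i], rejecting arrays of different lengths. */
    static String pair(String[] queries, String[] fields, String[] prefixes) {
        if (queries.length != fields.length || queries.length != prefixes.length) {
            throw new IllegalArgumentException("queries, fields and flags must have the same length");
        }
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < fields.length; i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append(prefixes[i]).append(fields[i]).append(':').append(queries[i]);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String[] fields = {"title", "content"};
        String[] queries = {"刷新", "缓存"};
        String[] prefixes = {"+", "+"}; // both conditions are MUST
        System.out.println(pair(queries, fields, prefixes));
        // prints: +title:刷新 +content:缓存
    }
}
```

The printed form is only notation for the pairing; the exact string produced by Query.toString() depends on how the analyzer tokenizes each query string.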

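Appendix: Complete Code

// Imports needed by the enclosing test class (plus your JUnit @Test import).
// The IKAnalyzer package below is the one used by common distributions; adjust it to match your dependency.
import java.io.File;
import java.io.IOException;
import org.apache.lucene.document.Document;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.queryparser.classic.MultiFieldQueryParser;
import org.apache.lucene.queryparser.classic.ParseException;
import org.apache.lucene.queryparser.classic.QueryParser;
import org.apache.lucene.search.BooleanClause;
import org.apache.lucene.search.BooleanQuery;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.search.TopDocs;
import org.apache.lucene.store.FSDirectory;
import org.wltea.analyzer.lucene.IKAnalyzer;

@Test
public void screeningQuery() throws ParseException {
    // Location of the Lucene index directory
    String indexDir = "E:\\develop\\demo\\lucene-learn\\lucene-index";
    File luceneIndexDirectory = new File(indexDir);
    // Open the index directory and a reader; try-with-resources closes both, even if the search fails
    try (FSDirectory fsd = FSDirectory.open(luceneIndexDirectory.toPath());
         DirectoryReader reader = DirectoryReader.open(fsd)) {
        // Create an index searcher
        IndexSearcher searcher = new IndexSearcher(reader);

        // Filtered search, similar to: title=xxx AND content=xxx
        // Boolean rules decide how the individual conditions are combined
        // 1. MUST       AND: the condition must match
        // 2. MUST_NOT   NOT: the condition must not match
        // 3. SHOULD     OR: the condition may match
        // 4. FILTER     must match, like MUST, but does not contribute to the score

        /*
         // Approach 1: combined Boolean query
         QueryParser titleQueryParser = new QueryParser("title", new IKAnalyzer());
         Query titleQuery = titleQueryParser.parse("刷新");
         QueryParser contentQueryParser = new QueryParser("content", new IKAnalyzer());
         Query contentQuery = contentQueryParser.parse("缓存");
         // Boolean combination
         Query query = new BooleanQuery.Builder()
               .add(titleQuery, BooleanClause.Occur.MUST)
               .add(contentQuery, BooleanClause.Occur.MUST)
               .build();
         */

        // Approach 2: multi-field query
        // Fields to search
        String[] fields = {"title", "content"};
        // Query strings, one per field
        String[] stringQuery = {"刷新", "缓存"};
        // title=刷新 AND content=缓存
        BooleanClause.Occur[] flags = {BooleanClause.Occur.MUST, BooleanClause.Occur.MUST};
        Query query = MultiFieldQueryParser.parse(stringQuery, fields, flags, new IKAnalyzer());

        // Search and keep the top 10 hits
        TopDocs docs = searcher.search(query, 10);

        // Print the stored fields of each hit
        for (ScoreDoc doc : docs.scoreDocs) {
            Document document = searcher.doc(doc.doc);
            System.out.println(document);
        }
    } catch (IOException e) {
        System.err.println("Failed to open or read the index directory");
        e.printStackTrace();
    }
}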

码农公寓
