A PHPer Knocks on Java's Door: The Elasticsearch Search Engine

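Goal of this article

Get to know Elasticsearch and put it to simple use in a Spring Boot project.

The star of this article

Elasticsearch (an open-source, distributed, RESTful search engine)

GitHub repository: https://github.com/elastic/elasticsearch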

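First encounter with Elasticsearch

The PHP projects I worked on early in my career mostly did not involve search, and when they did, a simple LIKE clause was enough to implement it.
I had heard the name Elasticsearch long before, but our business had no use for it at the time, and it was more widely used in the Java world. I only used it, and only lightly, when I later needed to analyze user behavior logs.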

Use cases for Elasticsearch

Log analysis (user behavior, monitoring, security, business data, and so on) and search features.

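A brief look at Elasticsearch in a PHP project

The business requirement at the time was user behavior analysis. We instrumented the front end with tracking points and analyzed the access paths collected from users.

Technical implementation
  • Asynchronous log-writing service (can be clustered)
  • Logstash (unified processing and transformation of logs)
  • Elasticsearch (indexing and storage)
  • Asynchronous API that serves the processed results
  • UI display (front end)

Both the asynchronous collection service and the API are built on GroupCo, a PHP asynchronous framework I wrote myself.

Here is part of the code.

Writing the log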

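        // One line per page view: ip [time] "url" "referrer" "agent" "uuid" "device"
        $record = $log['ip'].' ['.$log['time'].'] "'.$log['url'].'" "'.$log['referrer'].'" "'.$log['agent'].'" "'.$log['uuid'].'" "'.$log['device']."\"\n";
        // Append asynchronously to the current day's log file.
        yield AsyncFile::write(__ROOT__."runtime/data/".date('Ymd').".log", $record, FILE_APPEND);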
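
For readers coming from the Java side, here is a minimal sketch of the same record layout in Java. It is not part of the original project: the class name AccessLogFormatter and the sample values are made up for illustration. It also assumes that the time field is written in the HTTPDATE form (for example 10/Oct/2023:13:55:36 +0800) and that a missing referrer is written as "-", which is what the Logstash configuration later in this article accepts.

```java
import java.time.OffsetDateTime;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import java.util.Locale;

final class AccessLogFormatter {
    // HTTPDATE layout, e.g. 10/Oct/2023:13:55:36 +0800; month names must be English.
    static final DateTimeFormatter HTTPDATE =
            DateTimeFormatter.ofPattern("dd/MMM/yyyy:HH:mm:ss Z", Locale.ENGLISH);

    // Builds one record: ip [time] "url" "referrer" "agent" "uuid" "device"
    static String format(String ip, OffsetDateTime time, String url, String referrer,
                         String agent, String uuid, String device) {
        // Assumption: an absent referrer is logged as "-".
        String ref = (referrer == null || referrer.isEmpty()) ? "-" : referrer;
        return ip + " [" + HTTPDATE.format(time) + "] \"" + url + "\" \"" + ref + "\" \""
                + agent + "\" \"" + uuid + "\" \"" + device + "\"\n";
    }

    public static void main(String[] args) {
        // Hypothetical sample values, only for demonstration.
        OffsetDateTime t = OffsetDateTime.of(2023, 10, 10, 13, 55, 36, 0, ZoneOffset.ofHours(8));
        System.out.print(format("203.0.113.7", t, "/product/42", null,
                "Mozilla/5.0", "u-1001", "mobile"));
    }
}
```

Keeping the formatter in one place makes it easier to keep the writer and the parsing pattern in sync.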

Logstash configuration for processing the logs

    input {
      file {
        path => "/var/www/log/runtime/data/*"
        start_position => "beginning"
      }
    }
    
    filter {
      mutate { replace => { "type" => "access" } }
      grok {
        match=>{ "message"=>"%{IP:clientip} \[%{HTTPDATE:timestamp}\] \"%{NOTSPACE:request}\" \"(?:%{URI:referrer}|-)\" \"%{GREEDYDATA:agent}\" \"%{NOTSPACE:uuid}\" \"%{NOTSPACE:device}\"" }
      }
      date {
        match => [ "timestamp" , "dd/MMM/yyyy:HH:mm:ss Z" ]
      }
    }
    
    output {
      elasticsearch {
        hosts => ["localhost:9200"]
      }
      stdout { codec => rubydebug }
    }
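
To make it easier to see what the grok and date filters extract, the following Java sketch parses one log line with a plain regular expression. It is only an approximation under stated assumptions: the class AccessLogParser and the sample line are hypothetical, and the regex is looser than the grok pattern (it does not validate the IP address or the referrer URI).

```java
import java.time.OffsetDateTime;
import java.time.format.DateTimeFormatter;
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class AccessLogParser {
    // Field names follow the grok pattern in the Logstash configuration.
    private static final String[] FIELDS =
            {"clientip", "timestamp", "request", "referrer", "agent", "uuid", "device"};

    // Simplified stand-in for the grok expression (no IP or URI validation).
    private static final Pattern LINE = Pattern.compile(
            "^(?<clientip>\\S+) \\[(?<timestamp>[^\\]]+)\\] \"(?<request>\\S+)\" "
            + "\"(?<referrer>\\S+)\" \"(?<agent>.*)\" \"(?<uuid>\\S+)\" \"(?<device>\\S+)\"$");

    // Same layout as the date filter: dd/MMM/yyyy:HH:mm:ss Z
    private static final DateTimeFormatter HTTPDATE =
            DateTimeFormatter.ofPattern("dd/MMM/yyyy:HH:mm:ss Z", Locale.ENGLISH);

    // Returns the named fields, or an empty map when the line does not match.
    static Map<String, String> parse(String line) {
        Map<String, String> fields = new LinkedHashMap<>();
        Matcher m = LINE.matcher(line.trim());
        if (m.matches()) {
            for (String name : FIELDS) {
                fields.put(name, m.group(name));
            }
        }
        return fields;
    }

    // Converts the timestamp field into a date-time with its UTC offset.
    static OffsetDateTime parseTimestamp(String timestamp) {
        return OffsetDateTime.parse(timestamp, HTTPDATE);
    }

    public static void main(String[] args) {
        // Hypothetical sample line in the format written by the logging service.
        String sample = "203.0.113.7 [10/Oct/2023:13:55:36 +0800] \"/product/42\" \"-\" "
                + "\"Mozilla/5.0 (X11; Linux x86_64)\" \"u-1001\" \"mobile\"";
        Map<String, String> fields = parse(sample);
        System.out.println(fields);
        System.out.println(parseTimestamp(fields.get("timestamp")).toInstant());
    }
}
```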

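Queries are then run through Elasticsearch's search API (the search parameters are described in the official documentation).

For example, to list the day's users (distinguished by uuid):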

        $http = new AsyncHttp('http://127.0.0.1:9200');
        yield $http->parseDomain();
        $client = $http->getClient("/logstash-{$date}/_search");
        $client->setMethod("GET");
        $client->setData(' { "size" : 0, "aggs" : { "device" : { "terms" : { "field" : "device.keyword" }, "aggs" : { "uuids" : { "terms" : { "field" : "uuid.keyword", "size": 1000 } } } } } }');
        $client->setHeaders(['Content-Type' => 'application/json']);
        $res = (yield $client);
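
The same aggregation request could be sent from Java with the JDK's built-in HTTP client (Java 11 or later). This is a sketch rather than code from the project: the class name DailyUsersQuery is made up, the date string is assumed to follow the yyyy.MM.dd form, and POST is used instead of GET because the _search endpoint accepts both and POST avoids sending a body with GET.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

final class DailyUsersQuery {
    // Group by device, then list up to 1000 uuids per device; size 0 skips the raw hits.
    static final String BODY =
            "{ \"size\": 0, \"aggs\": { \"device\": { \"terms\": { \"field\": \"device.keyword\" },"
            + " \"aggs\": { \"uuids\": { \"terms\": { \"field\": \"uuid.keyword\", \"size\": 1000 } } } } } }";

    // Builds the search request for one daily index, e.g. logstash-2023.10.10.
    static HttpRequest build(String baseUrl, String date) {
        return HttpRequest.newBuilder()
                .uri(URI.create(baseUrl + "/logstash-" + date + "/_search"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(BODY))
                .build();
    }

    public static void main(String[] args) throws Exception {
        // Assumption: Elasticsearch listens on 127.0.0.1:9200, as in the PHP snippet.
        String date = args.length > 0 ? args[0] : "2023.10.10";
        HttpRequest request = build("http://127.0.0.1:9200", date);
        HttpResponse<String> response =
                HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.body());
    }
}
```

The buckets under aggregations.device in the response then contain one uuids bucket list per device type.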

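That was a quick look, from the PHP side, at how Elasticsearch can be used for log analysis. Interested PHPers can give it a try themselves.

The main topic: using Elasticsearch in Spring Boot

Using spring-data-elasticsearch in Spring Boot is actually quite simple. It takes roughly four steps: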

1. Define the index and its fields on the entity with the @Document annotation. Key parts of the code:

    package com.clothesmake.user.dao.entity;

    import lombok.Data;
    import org.hibernate.annotations.DynamicInsert;
    import org.hibernate.annotations.DynamicUpdate;
    import org.springframework.data.elasticsearch.annotations.Document;

    import javax.persistence.*;
    import java.io.Serializable;

    /**
     *  user
     * @author coco
     */
    @Entity
    @Data
    @DynamicInsert
    @DynamicUpdate
    @Table(name = "user")
    @Document(indexName = "user")
    public class UserEntity implements Serializable {
        private static final long serialVersionUID = 1L;

        /**
         * id
         */
        @Id
        @GeneratedValue(strategy = GenerationType.IDENTITY)
        private Integer id;

        /**
         * nickname
         */
        private String nickname;

        /**
         * mobile phone number
         */
        private String mobile;

        /**
         * email
         */
        private String email;

        /**
         * password
         */
        private String password;

        public UserEntity() {
        }

    }

2. Create an interface that extends ElasticsearchRepository

    package com.clothesmake.user.dao.repository.search;

    import com.clothesmake.user.dao.entity.UserEntity;
    import org.springframework.data.elasticsearch.repository.ElasticsearchRepository;

    public interface UserSearchRepository extends ElasticsearchRepository<UserEntity, Integer> {
    }

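There is one pitfall here: the Elasticsearch repositories and the other JPA repositories should live in different packages, and the configuration should point each kind of repository scanning at its own package.
Otherwise you get the error "No property index found for type user". Note that base-package scanning also descends into sub-packages, so if the error persists with a layout like the one below, moving the JPA repositories into a sibling package of search may help.

    @EnableJpaRepositories("com.clothesmake.user.dao.repository")
    @EnableElasticsearchRepositories("com.clothesmake.user.dao.repository.search")
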
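3. Add a searchUsers method to UserService

    // Needs: import static org.elasticsearch.index.query.QueryBuilders.queryStringQuery;
    public Page<UserEntity> searchUsers(String query, Pageable pageable) {
        // Run a query_string query against the user index and return one page of hits.
        return userSearchRepository.search(queryStringQuery(query), pageable);
    }
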
4. Configure the cluster address

    spring.data.elasticsearch.cluster-nodes=127.0.0.1:9300
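
To give a feel for what the searchUsers call in step 3 asks Elasticsearch to do, the sketch below builds a roughly equivalent request body by hand: a query_string query plus from/size paging, where from is the zero-based page number times the page size. This is an illustration only and not Spring Data's actual implementation. The class QueryStringBody and its methods are hypothetical.

```java
final class QueryStringBody {
    // Escapes the characters that would break a JSON string literal.
    static String jsonEscape(String s) {
        StringBuilder out = new StringBuilder();
        for (char c : s.toCharArray()) {
            if (c == '"' || c == '\\') {
                out.append('\\').append(c);
            } else if (c < 0x20) {
                out.append(String.format("\\u%04x", (int) c));
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    // page is zero-based, like Spring Data's Pageable; from = page * size.
    static String build(String query, int page, int size) {
        int from = page * size;
        return "{\"from\":" + from + ",\"size\":" + size
                + ",\"query\":{\"query_string\":{\"query\":\"" + jsonEscape(query) + "\"}}}";
    }

    public static void main(String[] args) {
        // Third page (index 2) of ten users whose nickname matches "coco".
        System.out.println(build("nickname:coco", 2, 10));
    }
}
```

A body like this could be posted to the user index's _search endpoint to compare the raw hits with what the repository returns.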

Query

| Keyword | Sample | Elasticsearch Query String |
| --- | --- | --- |
| And | `findByNameAndPrice` | `{"bool" : {"must" : [ {"field" : {"name" : "?"}}, {"field" : {"price" : "?"}} ]}}` |
| Or | `findByNameOrPrice` | `{"bool" : {"should" : [ {"field" : {"name" : "?"}}, {"field" : {"price" : "?"}} ]}}` |
| Is | `findByName` | `{"bool" : {"must" : {"field" : {"name" : "?"}}}}` |
| Not | `findByNameNot` | `{"bool" : {"must_not" : {"field" : {"name" : "?"}}}}` |
| Between | `findByPriceBetween` | `{"bool" : {"must" : {"range" : {"price" : {"from" : ?,"to" : ?,"include_lower" : true,"include_upper" : true}}}}}` |
| LessThanEqual | `findByPriceLessThan` | `{"bool" : {"must" : {"range" : {"price" : {"from" : null,"to" : ?,"include_lower" : true,"include_upper" : true}}}}}` |
| GreaterThanEqual | `findByPriceGreaterThan` | `{"bool" : {"must" : {"range" : {"price" : {"from" : ?,"to" : null,"include_lower" : true,"include_upper" : true}}}}}` |
| Before | `findByPriceBefore` | `{"bool" : {"must" : {"range" : {"price" : {"from" : null,"to" : ?,"include_lower" : true,"include_upper" : true}}}}}` |
| After | `findByPriceAfter` | `{"bool" : {"must" : {"range" : {"price" : {"from" : ?,"to" : null,"include_lower" : true,"include_upper" : true}}}}}` |
| Like | `findByNameLike` | `{"bool" : {"must" : {"field" : {"name" : {"query" : "?*","analyze_wildcard" : true}}}}}` |
| StartingWith | `findByNameStartingWith` | `{"bool" : {"must" : {"field" : {"name" : {"query" : "?*","analyze_wildcard" : true}}}}}` |
| EndingWith | `findByNameEndingWith` | `{"bool" : {"must" : {"field" : {"name" : {"query" : "*?","analyze_wildcard" : true}}}}}` |
| Contains/Containing | `findByNameContaining` | `{"bool" : {"must" : {"field" : {"name" : {"query" : "?","analyze_wildcard" : true}}}}}` |
| In | `findByNameIn(Collection<String> names)` | `{"bool" : {"must" : {"bool" : {"should" : [ {"field" : {"name" : "?"}}, {"field" : {"name" : "?"}} ]}}}}` |
| NotIn | `findByNameNotIn(Collection<String> names)` | `{"bool" : {"must_not" : {"bool" : {"should" : {"field" : {"name" : "?"}}}}}}` |
| True | `findByAvailableTrue` | `{"bool" : {"must" : {"field" : {"available" : true}}}}` |
| False | `findByAvailableFalse` | `{"bool" : {"must" : {"field" : {"available" : false}}}}` |
| OrderBy | `findByAvailableTrueOrderByNameDesc` | `{"sort" : [{ "name" : {"order" : "desc"} }],"bool" : {"must" : {"field" : {"available" : true}}}}` |