Elasticsearch-Spark
添加es.net.http.auth.user
和es.net.http.auth.pass
配置参数,例如:
val conf = new SparkConf()
.setAppName(this.getClass.getSimpleName)
.setMaster("local[*]")
.set("es.index.auto.create", "true")
.set("es.nodes", "127.0.0.1")
.set("es.port", "9200")
.set("es.net.http.auth.user", "test")
.set("es.net.http.auth.pass", "123456")
val spark = SparkSession.builder().config(conf).getOrCreate()
Elasticsearch-Python
首先我们看一下python代码中官方给出的说明,如下:
Elasticsearch low-level client. Provides a straightforward mapping from
Python to ES REST endpoints.
The instance has attributes ``cat``, ``cluster``, ``indices``, ``ingest``,
``nodes``, ``snapshot`` and ``tasks`` that provide access to instances of
:class:`~elasticsearch.client.CatClient`,
:class:`~elasticsearch.client.ClusterClient`,
:class:`~elasticsearch.client.IndicesClient`,
:class:`~elasticsearch.client.IngestClient`,
:class:`~elasticsearch.client.NodesClient`,
:class:`~elasticsearch.client.SnapshotClient` and
:class:`~elasticsearch.client.TasksClient` respectively. This is the
preferred (and only supported) way to get access to those classes and their
methods.
You can specify your own connection class which should be used by providing
the ``connection_class`` parameter::
# create connection to localhost using the ThriftConnection
es = Elasticsearch(connection_class=ThriftConnection)
If you want to turn on :ref:`sniffing` you have several options (described
in :class:`~elasticsearch.Transport`)::
# create connection that will automatically inspect the cluster to get
# the list of active nodes. Start with nodes running on 'esnode1' and
# 'esnode2'
es = Elasticsearch(
['esnode1', 'esnode2'],
# sniff before doing anything
sniff_on_start=True,
# refresh nodes after a node fails to respond
sniff_on_connection_fail=True,
# and also every 60 seconds
sniffer_timeout=60
)
Different hosts can have different parameters, use a dictionary per node to
specify those::
# connect to localhost directly and another node using SSL on port 443
# and an url_prefix. Note that ``port`` needs to be an int.
es = Elasticsearch([
{'host': 'localhost'},
{'host': 'othernode', 'port': 443, 'url_prefix': 'es', 'use_ssl': True},
])
If using SSL, there are several parameters that control how we deal with
certificates (see :class:`~elasticsearch.Urllib3HttpConnection` for
detailed description of the options)::
es = Elasticsearch(
['localhost:443', 'other_host:443'],
# turn on SSL
use_ssl=True,
# make sure we verify SSL certificates
verify_certs=True,
# provide a path to CA certs on disk
ca_certs='/path/to/CA_certs'
)
SSL client authentication is supported
(see :class:`~elasticsearch.Urllib3HttpConnection` for
detailed description of the options)::
es = Elasticsearch(
['localhost:443', 'other_host:443'],
# turn on SSL
use_ssl=True,
# make sure we verify SSL certificates
verify_certs=True,
# provide a path to CA certs on disk
ca_certs='/path/to/CA_certs',
# PEM formatted SSL client certificate
client_cert='/path/to/clientcert.pem',
# PEM formatted SSL client key
client_key='/path/to/clientkey.pem'
)
Alternatively you can use RFC-1738 formatted URLs, as long as they are not
in conflict with other options::
es = Elasticsearch(
[
'http://user:secret@localhost:9200/',
'https://user:secret@other_host:443/production'
],
verify_certs=True
)
By default, `JSONSerializer
<https://github.com/elastic/elasticsearch-py/blob/master/elasticsearch/serializer.py#L24>`_
is used to encode all outgoing requests.
However, you can implement your own custom serializer::
from elasticsearch.serializer import JSONSerializer
class SetEncoder(JSONSerializer):
def default(self, obj):
if isinstance(obj, set):
return list(obj)
if isinstance(obj, Something):
return 'CustomSomethingRepresentation'
return JSONSerializer.default(self, obj)
es = Elasticsearch(serializer=SetEncoder())
根据上面的说明,结合实际情况,可以按如下方式创建客户端:
from elasticsearch import Elasticsearch
es = Elasticsearch(
[
'http://test:123456@127.0.0.1:9200/',
'http://<user>:<pass>@<host>:<port>/',
],
verify_certs=True
)
# 此方法为没有配置CA证书的情况,若配置了CA证书,请结合上述官方说明
Elasticsearch-CURL
通过CURL请求获取数据,直接上代码:
# 方法一:
curl -XGET "http://<user>:<pass>@<host>:<port>"
# 例如:
curl -XGET "http://test:123456@127.0.0.1:9200/test/_search"
# 方法二:
curl -u <user>:<pass> -XGET "http://<host>:<port>"
# 或
curl --user <user>:<pass> -XGET "http://<host>:<port>"
# 例如:
curl -u test:123456 -XGET "http://127.0.0.1:9200/test/_search"
curl --user test:123456 -XGET "http://127.0.0.1:9200/test/_search"