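Introduction to Presto
Presto is a distributed SQL query engine designed for high-speed, real-time data analysis. It supports standard ANSI SQL, including complex queries, aggregations, joins, and window functions. It was created to address the slowness of Hive's MapReduce (MR) execution. Presto does not store data itself; instead, it can connect to many kinds of data sources and supports federated queries that span several of them. Presto is an OLAP tool that excels at complex analysis over very large datasets.
Deploying Presto
Single-node deployment
Start Presto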
docker run -itd -p 8080:8080 --name presto ahanaio/prestodb-sandbox:0.261
Start the Presto CLI
docker exec -it presto presto-cli
Basic Presto queries
SHOW CATALOGS;
SHOW SCHEMAS FROM tpch;
SHOW TABLES FROM tpch.sf100;
USE tpch.sf100;
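Here tpch is the catalog, which corresponds to a configured data source (connector);
sf100 is a schema inside that catalog, roughly equivalent to a database.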
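The statements above are typed into the interactive CLI. As a minimal sketch of running a single query non-interactively, the helper below wraps docker exec and the presto-cli options --catalog, --schema, and --execute. The function name run_presto_query and the PRESTO_CONTAINER variable are hypothetical names introduced here; the default container name presto is assumed to match the --name used in the docker run command.

```shell
#!/bin/sh
# Hypothetical helper: run one SQL statement through presto-cli inside the
# container and return. PRESTO_CONTAINER is an assumed override variable;
# it defaults to "presto", the --name given to docker run.
run_presto_query() {
    docker exec "${PRESTO_CONTAINER:-presto}" presto-cli \
        --catalog tpch --schema sf100 --execute "$1"
}

# Example call (needs the sandbox container to be running):
#   run_presto_query "SELECT name FROM nation ORDER BY name LIMIT 5"
```

Passing the catalog and schema on the command line replaces the USE statement, so the query can refer to tables such as nation without a prefix.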
Connector configuration
1) Configure the TPCH connector in etc/catalog/tpch.properties:
connector.name=tpch
2) Create hive.properties in the etc/catalog directory with the following contents:
connector.name=hive-hadoop2
hive.metastore.uri=thrift://192.168.1.12:9083
hive.config.resources=/etc/hadoop/2.4.2.0-258/0/core-site.xml, /etc/hadoop/2.4.2.0-258/0/hdfs-site.xml
hive.allow-drop-table=true
Cluster deployment
Edit the configuration files
1. node.properties
node.environment=test
#node.id=node1
#node.id=node2
node.id=node3
node.data-dir=/opt/presto-server/data
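The commented-out node.id lines suggest that the same file is edited by hand for each node. As a rough sketch of an alternative, the function below writes a separate node.properties per node. The function name write_node_properties and the output directory layout are assumptions made for illustration; the property keys and the data directory come from the file above.

```shell
#!/bin/sh
# Hypothetical helper: write a node.properties file for a single node.
# Usage: write_node_properties <output-dir> <node-id> [environment]
write_node_properties() {
    out_dir=$1
    node_id=$2
    environment=${3:-test}   # has to be the same on every node of one cluster

    mkdir -p "$out_dir"
    cat > "$out_dir/node.properties" <<EOF
node.environment=$environment
node.id=$node_id
node.data-dir=/opt/presto-server/data
EOF
}

# Example: one directory per node (the ./presto/<id> layout is assumed).
#   for id in node1 node2 node3; do
#       write_node_properties "./presto/$id" "$id"
#   done
```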
2. jvm.config
-server
-Xmx16G
-XX:+UseG1GC
-XX:G1HeapRegionSize=32M
-XX:+UseGCOverheadLimit
-XX:+ExplicitGCInvokesConcurrent
-XX:+HeapDumpOnOutOfMemoryError
-XX:+ExitOnOutOfMemoryError
-Djdk.attach.allowAttachSelf=true
3. config.properties
Worker configuration:
coordinator=false
http-server.http.port=8080
query.max-memory=50GB
query.max-memory-per-node=5GB
query.max-total-memory-per-node=10GB
discovery.uri=http://192.168.1.102:8080
Coordinator configuration (this node acts as both coordinator and worker):
coordinator=true
node-scheduler.include-coordinator=true
http-server.http.port=8080
query.max-memory=50GB
query.max-memory-per-node=5GB
query.max-total-memory-per-node=10GB
discovery-server.enabled=true
discovery.uri=http://192.168.1.102:8080
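Notes on cluster mode:
In cluster mode, every node must have a different node.id, while node.environment must be the same on all nodes.
At startup, start the coordinator first and then start each worker. The likely reason is that the coordinator exposes its discovery service once it is up, and each worker registers with it when it starts.
docker-compose file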
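The startup order can be scripted by polling the coordinator before launching the workers. The sketch below assumes the coordinator answers on the /v1/info HTTP endpoint with a JSON document containing a "starting" field, and that curl is installed; the function name wait_for_coordinator, the retry count, and the two-second delay are illustrative choices, not values from this setup.

```shell
#!/bin/sh
# Hypothetical helper: block until the coordinator reports it has finished
# starting, or give up after a number of attempts.
# Usage: wait_for_coordinator [base-url] [attempts]
wait_for_coordinator() {
    url=${1:-http://192.168.1.102:8080}   # same address as discovery.uri
    attempts=${2:-30}
    i=0
    while [ "$i" -lt "$attempts" ]; do
        # Assumption: /v1/info returns JSON with "starting": false once ready.
        if curl -fsS "$url/v1/info" 2>/dev/null |
            grep -Eq '"starting"[[:space:]]*:[[:space:]]*false'; then
            return 0
        fi
        i=$((i + 1))
        sleep 2
    done
    return 1
}

# Example: only start a worker once the coordinator is ready.
#   wait_for_coordinator && docker start presto02
```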
version: '3'
# Note: all three services below mount the same node.properties,
# config.properties, and data directory. Since node.id must be unique per
# node, each node needs its own copies of these files when actually deployed.
services:
  presto01:
    image: ahanaio/prestodb-sandbox:0.261
    container_name: presto01
    hostname: presto01
    restart: "no"
    ulimits:
      nofile:
        soft: 262144
        hard: 262144
    deploy:
      resources:
        limits:
          cpus: '16.00'
          memory: 64G
        reservations:
          cpus: '0.25'
          memory: 100M
    volumes:
      - /data/adhoc/presto/node.properties:/opt/presto-server/etc/node.properties:rw
      - /data/adhoc/presto/data:/opt/presto-server/data:rw
      - /data/adhoc/presto/jvm.config:/opt/presto-server/etc/jvm.config:rw
      - /data/adhoc/presto/config.properties:/opt/presto-server/etc/config.properties:rw
    #network_mode: "host"
  presto02:
    image: ahanaio/prestodb-sandbox:0.261
    container_name: presto02
    hostname: presto02
    restart: "no"
    depends_on:
      - presto01
    ulimits:
      nofile:
        soft: 262144
        hard: 262144
    deploy:
      resources:
        limits:
          cpus: '16.00'
          memory: 64G
        reservations:
          cpus: '0.25'
          memory: 100M
    volumes:
      - /data/adhoc/presto/node.properties:/opt/presto-server/etc/node.properties:rw
      - /data/adhoc/presto/data:/opt/presto-server/data:rw
      - /data/adhoc/presto/jvm.config:/opt/presto-server/etc/jvm.config:rw
      - /data/adhoc/presto/config.properties:/opt/presto-server/etc/config.properties:rw
    #network_mode: "host"
  presto03:
    image: ahanaio/prestodb-sandbox:0.261
    container_name: presto03
    hostname: presto03
    restart: "no"
    depends_on:
      - presto01
    ulimits:
      nofile:
        soft: 262144
        hard: 262144
    deploy:
      resources:
        limits:
          cpus: '16.00'
          memory: 64G
        reservations:
          cpus: '0.25'
          memory: 100M
    volumes:
      - /data/adhoc/presto/node.properties:/opt/presto-server/etc/node.properties:rw
      - /data/adhoc/presto/data:/opt/presto-server/data:rw
      - /data/adhoc/presto/jvm.config:/opt/presto-server/etc/jvm.config:rw
      - /data/adhoc/presto/config.properties:/opt/presto-server/etc/config.properties:rw
    #network_mode: "host"
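Because the compose file bind-mounts a node.properties file into every container, a small pre-flight check before docker-compose up may help catch the mistakes mentioned in the cluster-mode notes. The following is a sketch only: the function name check_node_properties is hypothetical, and it assumes each node has its own node.properties file whose path is passed as an argument.

```shell
#!/bin/sh
# Hypothetical check: given several node.properties files, verify that every
# node.id is different and that node.environment is the same in all of them.
# Commented-out lines (starting with #) are ignored.
check_node_properties() {
    ids=$(grep -h '^node\.id=' "$@" | cut -d= -f2-)
    envs=$(grep -h '^node\.environment=' "$@" | cut -d= -f2- | sort -u)

    dup_ids=$(printf '%s\n' "$ids" | sort | uniq -d)
    if [ -n "$dup_ids" ]; then
        echo "duplicate node.id: $dup_ids" >&2
        return 1
    fi

    # Exactly one distinct, non-empty environment value is expected.
    env_count=$(printf '%s\n' "$envs" | grep -c .)
    if [ "$env_count" -ne 1 ]; then
        echo "node.environment is missing or differs between nodes" >&2
        return 1
    fi
    return 0
}

# Example (paths are assumed):
#   check_node_properties ./presto/node1/node.properties \
#       ./presto/node2/node.properties ./presto/node3/node.properties
```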