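Presto Deployment (Containerized)

Introduction to Presto

Presto is a distributed SQL query engine designed for fast, real-time data analysis. It supports standard ANSI SQL, including complex queries, aggregations, joins, and window functions. It was created to address the slowness of Hive's MapReduce (MR) execution. Presto does not store data itself; instead it connects to many kinds of data sources and supports federated queries that join data across them. Presto is an OLAP tool that excels at complex analysis over very large datasets.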

Deploying Presto

Single-node deployment

Start Presto

docker run -itd -p 8080:8080 --name presto ahanaio/prestodb-sandbox:0.261

Start presto-cli (the container name must match the --name used above)

docker exec -it presto presto-cli

Basic Presto query syntax

show catalogs;
SHOW schemas FROM tpch;
show tables from tpch.sf100;
use tpch.sf100;

Here tpch is a catalog, which corresponds to a data source;
sf100 is a schema, which corresponds to a database.
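
The same statements can also be run without opening an interactive prompt. The following is a minimal sketch, assuming the container from the single-node step is named presto and that presto-cli in the sandbox image accepts the standard --execute option. The helper name presto_sql and the variable PRESTO_CONTAINER are hypothetical and only used for illustration.

```shell
#!/bin/sh
# Hypothetical helper: run one SQL statement through presto-cli inside the
# container started above. PRESTO_CONTAINER is an assumed variable name.
PRESTO_CONTAINER="${PRESTO_CONTAINER:-presto}"

presto_sql() {
    # --execute runs the given statement and exits instead of opening a prompt
    docker exec "$PRESTO_CONTAINER" presto-cli --execute "$1"
}

# Example (requires the container to be running):
#   presto_sql "SHOW CATALOGS;"
#   presto_sql "SHOW TABLES FROM tpch.sf100;"
```

Wrapping the call in a function keeps the container name in one place, which helps when the same statements are later run against a cluster node such as presto01.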

Connector configuration

1) Configure the TPCH connector:
    etc/catalog/tpch.properties
    connector.name=tpch

2) Create hive.properties in the etc/catalog directory with the following contents:
    connector.name=hive-hadoop2
    hive.metastore.uri=thrift://192.168.1.12:9083
    hive.config.resources=/etc/hadoop/2.4.2.0-258/0/core-site.xml,/etc/hadoop/2.4.2.0-258/0/hdfs-site.xml
    hive.allow-drop-table=true

Cluster deployment

Edit the configuration files
1. node.properties

node.environment=test
#node.id=node1
#node.id=node2
node.id=node3
node.data-dir=/opt/presto-server/data
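
Because each node needs its own node.id while sharing the same node.environment, it can be convenient to generate one node.properties per node rather than editing the commented lines by hand. This is a rough sketch only: the output directory OUT_DIR and the per-node subdirectory layout are assumptions made for this example, not part of the original setup.

```shell
#!/bin/sh
# Generate one node.properties per node so that every node gets a unique
# node.id while all nodes share the same node.environment.
# OUT_DIR and the per-node directory layout are assumptions for this sketch.
OUT_DIR="${OUT_DIR:-./presto-conf}"

for node in node1 node2 node3; do
    mkdir -p "$OUT_DIR/$node"
    cat > "$OUT_DIR/$node/node.properties" <<EOF
node.environment=test
node.id=$node
node.data-dir=/opt/presto-server/data
EOF
done

# Sanity check: print any node.id line that appears more than once
# (no output means every id is unique)
cat "$OUT_DIR"/*/node.properties | grep '^node.id=' | sort | uniq -d
```

Each generated file can then be mounted into its own container in place of a single shared node.properties.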

2. jvm.config

-server
-Xmx16G
-XX:+UseG1GC
-XX:G1HeapRegionSize=32M
-XX:+UseGCOverheadLimit
-XX:+ExplicitGCInvokesConcurrent
-XX:+HeapDumpOnOutOfMemoryError
-XX:+ExitOnOutOfMemoryError
-Djdk.attach.allowAttachSelf=true

3. config.properties
Worker configuration:

coordinator=false
http-server.http.port=8080
query.max-memory=50GB
query.max-memory-per-node=5GB
query.max-total-memory-per-node=10GB
discovery.uri=http://192.168.1.102:8080

Configuration for a node that acts as both coordinator and worker:

coordinator=true
node-scheduler.include-coordinator=true
http-server.http.port=8080
query.max-memory=50GB
query.max-memory-per-node=5GB
query.max-total-memory-per-node=10GB
discovery-server.enabled=true
discovery.uri=http://192.168.1.102:8080

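Notes for cluster mode:
In cluster mode, every node must have a different node.id, while node.environment must be the same on all nodes.
At startup, bring up the coordinator first and then start the workers one by one, presumably because the coordinator exposes the discovery service once it is up and each worker registers with it on startup.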
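
The startup order described above can be enforced with a small wait loop. The sketch below polls the coordinator's /v1/info endpoint, which reports a starting flag, and returns once that flag is false. The function name wait_for_coordinator, the variable COORDINATOR_URL, the retry count, and the two-second interval are all assumptions chosen for illustration; the default URL is simply the discovery.uri from the configuration above.

```shell
#!/bin/sh
# Poll the coordinator until it reports that startup has finished.
# COORDINATOR_URL defaults to the discovery.uri used in config.properties above.
COORDINATOR_URL="${COORDINATOR_URL:-http://192.168.1.102:8080}"

wait_for_coordinator() {
    tries="${1:-30}"    # number of attempts (assumed default)
    while [ "$tries" -gt 0 ]; do
        # /v1/info returns JSON; "starting" becomes false once the server is ready
        if curl -fsS "$COORDINATOR_URL/v1/info" 2>/dev/null \
            | grep -Eq '"starting"[[:space:]]*:[[:space:]]*false'; then
            return 0
        fi
        tries=$((tries - 1))
        sleep 2
    done
    return 1
}

# Example: start the workers only after the coordinator is up
#   wait_for_coordinator 60 && docker-compose up -d presto02 presto03
```

Note that depends_on in a compose file only orders container start, not service readiness, so an explicit check like this is one way to make sure workers register against a coordinator that is already serving requests.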

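docker-compose orchestration file

Note: as written, all three services mount the same node.properties, config.properties, and data directory. For the cluster to form, each service needs its own copy of these files, with a distinct node.id per node and the coordinator settings only on presto01, and the address in discovery.uri must be reachable from the worker containers.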

version: '3'
services:

  presto01:
    image: ahanaio/prestodb-sandbox:0.261
    container_name: presto01
    hostname: presto01
    restart: "no"
    ulimits:
      nofile:
        soft: "262144"
        hard: "262144"
    deploy:
      resources:
        limits:
          cpus: '16.00'
          memory: 64G
        reservations:
          cpus: '0.25'
          memory: 100M
    volumes:
      - /data/adhoc/presto/node.properties:/opt/presto-server/etc/node.properties:rw
      - /data/adhoc/presto/data:/opt/presto-server/data:rw
      - /data/adhoc/presto/jvm.config:/opt/presto-server/etc/jvm.config:rw
      - /data/adhoc/presto/config.properties:/opt/presto-server/etc/config.properties:rw
    #network_mode: "host"
    
  presto02:
    image: ahanaio/prestodb-sandbox:0.261
    container_name: presto02
    hostname: presto02
    restart: "no"
    depends_on:
      - presto01
    ulimits:
      nofile:
        soft: "262144"
        hard: "262144"
    deploy:
      resources:
        limits:
          cpus: '16.00'
          memory: 64G
        reservations:
          cpus: '0.25'
          memory: 100M
    volumes:
      - /data/adhoc/presto/node.properties:/opt/presto-server/etc/node.properties:rw
      - /data/adhoc/presto/data:/opt/presto-server/data:rw
      - /data/adhoc/presto/jvm.config:/opt/presto-server/etc/jvm.config:rw
      - /data/adhoc/presto/config.properties:/opt/presto-server/etc/config.properties:rw
    #network_mode: "host"
    

  presto03:
    image: ahanaio/prestodb-sandbox:0.261
    container_name: presto03
    hostname: presto03
    restart: "no"
    depends_on:
      - presto01
    ulimits:
      nofile:
        soft: "262144"
        hard: "262144"
    deploy:
      resources:
        limits:
          cpus: '16.00'
          memory: 64G
        reservations:
          cpus: '0.25'
          memory: 100M
    volumes:
      - /data/adhoc/presto/node.properties:/opt/presto-server/etc/node.properties:rw
      - /data/adhoc/presto/data:/opt/presto-server/data:rw
      - /data/adhoc/presto/jvm.config:/opt/presto-server/etc/jvm.config:rw
      - /data/adhoc/presto/config.properties:/opt/presto-server/etc/config.properties:rw
    #network_mode: "host"