【ClickHouse数据库】基于Docker构建集群模式(多分片单备份)

集群模式(多分片单备份)

创建容器

  1. 创建网络

    docker network create -d bridge iot-net
    
  2. 启动3个数据库实例

    docker run -d --name chdb1 --ulimit nofile=262144:262144 --volume=/root/iot/chdb1:/var/lib/clickhouse --publish 9001:9000 --network iot-net yandex/clickhouse-server
    
    docker run -d --name chdb2 --ulimit nofile=262144:262144 --volume=/root/iot/chdb2:/var/lib/clickhouse --publish 9002:9000 --network iot-net yandex/clickhouse-server
    
    docker run -d --name chdb3 --ulimit nofile=262144:262144 --volume=/root/iot/chdb3:/var/lib/clickhouse --publish 9003:9000 --network iot-net yandex/clickhouse-server
    

配置集群模式

  1. 加载集群配置文件:先从容器中获得配置文件

    docker cp chdb1:/etc/clickhouse-server/config.xml ./
    
  2. config.xml自定义数据分片配置中定义3分片1备份:

    <remote_servers>
        <perftest_3shards_1replicas>
            <shard>
                <replica>
                    <host>chdb1</host>
                    <port>9000</port>
                </replica>
            </shard>
            <shard>
                <replica>
                    <host>chdb2</host>
                    <port>9000</port>
                </replica>
            </shard>
            <shard>
                <replica>
                    <host>chdb3</host>
                    <port>9000</port>
                </replica>
            </shard>
        </perftest_3shards_1replicas>
    </remote_servers>
    

    随后将config.xml配置文件导回至3个实例并重启之:

    docker cp ./config.xml chdb1:/etc/clickhouse-server && docker cp ./config.xml chdb2:/etc/clickhouse-server && docker cp ./config.xml chdb3:/etc/clickhouse-server
    docker restart chdb1 chdb2 chdb3
    

验证集群

  1. 连接至任意实例:

    clickhouse-client --port 9001
    

    执行以下命令可看到当前集群信息:

    e16ff05d1ca6 :) select * from system.clusters;
    
    SELECT *
    FROM system.clusters
    
    ┌─cluster───────────────────┬─shard_num─┬─shard_weight─┬─replica_num─┬─host_name─┬─host_address─┬─port─┬─is_local─┬─user────┬─default_database─┬─errors_count─┬─estimated_recovery_time─┐
    │ cluster_3shards_1replicas │         1 │            1 │           1 │ chdb1     │ 172.25.0.2   │ 9000 │        1 │ default │                  │            0 │                       0 │
    │ cluster_3shards_1replicas │         2 │            1 │           1 │ chdb2     │ 172.25.0.3   │ 9000 │        0 │ default │                  │            0 │                       0 │
    │ cluster_3shards_1replicas │         3 │            1 │           1 │ chdb3     │ 172.25.0.4   │ 9000 │        0 │ default │                  │            0 │                       0 │
    └───────────────────────────┴───────────┴──────────────┴─────────────┴───────────┴──────────────┴──────┴──────────┴─────────┴──────────────────┴──────────────┴─────────────────────────┘
    
    3 rows in set. Elapsed: 0.006 sec. 
    
  2. 在3个实例中构建测试数据表:

    create table population (
        `ozone` Int8,
        `particullate_matter` Int8,
        `carbon_monoxide` Int8,
        `sulfure_dioxide` Int8,
        `nitrogen_dioxide` Int8,
        `longitude` Float64,
        `latitude` Float64,
        `timestamp` DateTime
     ) ENGINE = MergeTree()
     ORDER BY `timestamp`
     PRIMARY KEY `timestamp`
    

    可以直接使用clinet创建表而不一一进入数据库执行SQL:

    clickhouse-client --port 9001 --query ‘CREATE TABLE population (
        `ozone` Int8,
        `particullate_matter` Int8,
        `carbon_monoxide` Int8,
        `sulfure_dioxide` Int8,
        `nitrogen_dioxide` Int8,
        `longitude` Float64,
        `latitude` Float64,
        `timestamp` DateTime
     ) ENGINE = MergeTree()
     ORDER BY `timestamp`
     PRIMARY KEY `timestamp`‘
    
  3. 创建分布表,分布表可以认为是一个路由,表明了数据如何流转至集群中具体的某一实例:

    CREATE TABLE population_all AS population
    ENGINE = Distributed(cluster_3shards_1replicas, default, population, rand())
    
  4. 将数据导入到此数据库实例的表中:

    root@mq-227 ~/i/db_file cat pollutionData204273.csv | wc -l
    17568
    
    clickhouse-client --port 9001 --query "INSERT INTO population_all FORMAT CSV" < ./pollutionData204273.csv  
    

    查询数据表可得当前数据量:

    root@mq-227 ~/i/db_file# clickhouse-client --port 9001 --query "select count(*) from population_all"                                      1
    17568
    root@mq-227 ~/i/db_file# clickhouse-client --port 9001 --query "select count(*) from population"    
    5955
    root@mq-227 ~/i/db_file# clickhouse-client --port 9002 --query "select count(*) from population"
    5690
    root@mq-227 ~/i/db_file# clickhouse-client --port 9003 --query "select count(*) from population"
    5923
    

    可以看到数据已经被分配至3个分片中。

【ClickHouse数据库】基于Docker构建集群模式(多分片单备份)

上一篇:Linux下dmp导入到Oracle


下一篇:Ubuntu 安装mysql