Cluster Mode (Multiple Shards, Single Replica)
Creating the Containers
- Create the network:

docker network create -d bridge iot-net
- Start 3 database instances:

docker run -d --name chdb1 --ulimit nofile=262144:262144 --volume=/root/iot/chdb1:/var/lib/clickhouse --publish 9001:9000 --network iot-net yandex/clickhouse-server
docker run -d --name chdb2 --ulimit nofile=262144:262144 --volume=/root/iot/chdb2:/var/lib/clickhouse --publish 9002:9000 --network iot-net yandex/clickhouse-server
docker run -d --name chdb3 --ulimit nofile=262144:262144 --volume=/root/iot/chdb3:/var/lib/clickhouse --publish 9003:9000 --network iot-net yandex/clickhouse-server
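Since the three `docker run` commands differ only in the instance number, they can be generated in a loop. A minimal sketch; the `DRY_RUN` switch is an assumption added for illustration (set it to anything other than 1 to actually start the containers):

```shell
#!/bin/sh
# Start chdb1..chdb3; each maps host port 900<i> to container port 9000.
# DRY_RUN=1 (illustrative assumption) prints the commands instead of running them.
DRY_RUN=1
for i in 1 2 3; do
  cmd="docker run -d --name chdb$i --ulimit nofile=262144:262144 \
--volume=/root/iot/chdb$i:/var/lib/clickhouse \
--publish 900$i:9000 --network iot-net yandex/clickhouse-server"
  if [ "$DRY_RUN" = 1 ]; then
    echo "$cmd"
  else
    eval "$cmd"
  fi
done
```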
Configuring Cluster Mode
- Load the cluster configuration file. First copy the configuration file out of a container:

docker cp chdb1:/etc/clickhouse-server/config.xml ./
- In config.xml, define 3 shards with 1 replica each in the custom sharding section. The cluster name defined here is what the Distributed engine and system.clusters will reference later, so name it cluster_3shards_1replicas:

<remote_servers>
    <cluster_3shards_1replicas>
        <shard>
            <replica>
                <host>chdb1</host>
                <port>9000</port>
            </replica>
        </shard>
        <shard>
            <replica>
                <host>chdb2</host>
                <port>9000</port>
            </replica>
        </shard>
        <shard>
            <replica>
                <host>chdb3</host>
                <port>9000</port>
            </replica>
        </shard>
    </cluster_3shards_1replicas>
</remote_servers>
Then copy config.xml back into the 3 instances and restart them:

docker cp ./config.xml chdb1:/etc/clickhouse-server && docker cp ./config.xml chdb2:/etc/clickhouse-server && docker cp ./config.xml chdb3:/etc/clickhouse-server
docker restart chdb1 chdb2 chdb3
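The copy-back-and-restart step can likewise be expressed as one loop over the container names. A sketch; `DRY_RUN` is a hypothetical switch that prints the commands rather than invoking docker:

```shell
#!/bin/sh
# Push the edited config.xml to every instance, then restart it.
DRY_RUN=1
for name in chdb1 chdb2 chdb3; do
  for cmd in "docker cp ./config.xml $name:/etc/clickhouse-server" \
             "docker restart $name"; do
    if [ "$DRY_RUN" = 1 ]; then echo "$cmd"; else eval "$cmd"; fi
  done
done
```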
Verifying the Cluster
- Connect to any instance:

clickhouse-client --port 9001

Running the following query shows the current cluster information:
e16ff05d1ca6 :) select * from system.clusters;

SELECT * FROM system.clusters

┌─cluster───────────────────┬─shard_num─┬─shard_weight─┬─replica_num─┬─host_name─┬─host_address─┬─port─┬─is_local─┬─user────┬─default_database─┬─errors_count─┬─estimated_recovery_time─┐
│ cluster_3shards_1replicas │         1 │            1 │           1 │ chdb1     │ 172.25.0.2   │ 9000 │        1 │ default │                  │            0 │                       0 │
│ cluster_3shards_1replicas │         2 │            1 │           1 │ chdb2     │ 172.25.0.3   │ 9000 │        0 │ default │                  │            0 │                       0 │
│ cluster_3shards_1replicas │         3 │            1 │           1 │ chdb3     │ 172.25.0.4   │ 9000 │        0 │ default │                  │            0 │                       0 │
└───────────────────────────┴───────────┴──────────────┴─────────────┴───────────┴──────────────┴──────┴──────────┴─────────┴──────────────────┴──────────────┴─────────────────────────┘

3 rows in set. Elapsed: 0.006 sec.
- Create the test data table on each of the 3 instances:

create table population
(
    `ozone` Int8,
    `particullate_matter` Int8,
    `carbon_monoxide` Int8,
    `sulfure_dioxide` Int8,
    `nitrogen_dioxide` Int8,
    `longitude` Float64,
    `latitude` Float64,
    `timestamp` DateTime
)
ENGINE = MergeTree()
ORDER BY `timestamp`
PRIMARY KEY `timestamp`
You can create the table directly with clickhouse-client instead of logging into each database one by one:

clickhouse-client --port 9001 --query 'CREATE TABLE population ( `ozone` Int8, `particullate_matter` Int8, `carbon_monoxide` Int8, `sulfure_dioxide` Int8, `nitrogen_dioxide` Int8, `longitude` Float64, `latitude` Float64, `timestamp` DateTime ) ENGINE = MergeTree() ORDER BY `timestamp` PRIMARY KEY `timestamp`'
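The one-shot `--query` form makes it easy to loop over all three published ports so the DDL only has to be written once. A sketch; `DRY_RUN` is an illustrative assumption, and the real run requires clickhouse-client installed on the host:

```shell
#!/bin/sh
# Create the local table on every shard via the published host ports.
CREATE_SQL='CREATE TABLE IF NOT EXISTS population (
  `ozone` Int8, `particullate_matter` Int8, `carbon_monoxide` Int8,
  `sulfure_dioxide` Int8, `nitrogen_dioxide` Int8,
  `longitude` Float64, `latitude` Float64, `timestamp` DateTime
) ENGINE = MergeTree() ORDER BY `timestamp` PRIMARY KEY `timestamp`'
DRY_RUN=1
for port in 9001 9002 9003; do
  if [ "$DRY_RUN" = 1 ]; then
    echo "would create table via port $port"
  else
    clickhouse-client --port "$port" --query "$CREATE_SQL"
  fi
done
```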
- Create the distributed table. A distributed table can be thought of as a router: it describes how data flows to a specific instance in the cluster:

CREATE TABLE population_all AS population
ENGINE = Distributed(cluster_3shards_1replicas, default, population, rand())
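The last Distributed argument is the sharding key: rand() spreads rows across shards at random. If related rows should always land on the same shard, a deterministic hash of a column can be used instead. A sketch under that assumption (population_by_ts is a hypothetical table name, not part of the setup above):

```sql
-- Hypothetical variant: shard by a hash of the timestamp, so all rows
-- sharing a timestamp are stored on the same shard.
CREATE TABLE population_by_ts AS population
ENGINE = Distributed(cluster_3shards_1replicas, default, population,
                     intHash32(toUInt32(`timestamp`)))
```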
- Import the data into this instance's distributed table:

root@mq-227 ~/i/db_file# cat pollutionData204273.csv | wc -l
17568
root@mq-227 ~/i/db_file# clickhouse-client --port 9001 --query "INSERT INTO population_all FORMAT CSV" < ./pollutionData204273.csv
Querying the tables shows the current row counts:

root@mq-227 ~/i/db_file# clickhouse-client --port 9001 --query "select count(*) from population_all"
17568
root@mq-227 ~/i/db_file# clickhouse-client --port 9001 --query "select count(*) from population"
5955
root@mq-227 ~/i/db_file# clickhouse-client --port 9002 --query "select count(*) from population"
5690
root@mq-227 ~/i/db_file# clickhouse-client --port 9003 --query "select count(*) from population"
5923
As the counts show, the data has been distributed across the 3 shards.
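The verification above can be scripted: the per-shard counts should sum to the distributed total. A sketch in which count_rows and sum_shards are hypothetical helpers (the live query requires the running cluster):

```shell
#!/bin/sh
# count_rows queries one instance's local table via its published port.
count_rows() {
  clickhouse-client --port "$1" --query "SELECT count() FROM population"
}
# sum_shards adds up its integer arguments, e.g. the three shard counts.
sum_shards() {
  total=0
  for n in "$@"; do total=$((total + n)); done
  echo "$total"
}
# Against the running cluster (not executed here):
#   sum_shards "$(count_rows 9001)" "$(count_rows 9002)" "$(count_rows 9003)"
# should print the same number as count(*) on population_all.
```

With the counts observed above, sum_shards 5955 5690 5923 prints 17568, matching population_all.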