OushuDB Basic Usage (Part 1)

1. Starting and Stopping OushuDB

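There are two ways to start OushuDB. The first is to start the whole cluster, master and segments included, with the "hawq start cluster" command. Which segments are started is determined by the nodes listed in "/hawq-install-path/etc/slaves".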

source /usr/local/hawq/greenplum_path.sh  # set the OushuDB environment variables
hawq start cluster  # start the whole OushuDB cluster

The second way is to start the OushuDB master and the segments separately. This works because the OushuDB master and segments are decoupled from each other.

hawq start master  # start the master, meaning the master on the local host
hawq start segment  # start the segment, meaning the segment on the local host

There are likewise two ways to restart or stop OushuDB:

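# Option 1
hawq restart cluster  # restart the OushuDB cluster
hawq stop cluster  # stop the OushuDB cluster
# Option 2
hawq restart master  # restart the OushuDB master on the local host
hawq restart segment  # restart the OushuDB segment on the local host
hawq stop master  # stop the OushuDB master on the local host
hawq stop segment  # stop the OushuDB segment on the local host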
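
The start, stop, and restart commands above can be collected into one small script. This is only a sketch under stated assumptions, not something shipped with OushuDB: the names `run`, `hawq_ctl`, and `DRY_RUN` are hypothetical, and the path to greenplum_path.sh is the one used in the example above. By default the script only prints the commands it would run.

```shell
#!/bin/sh
# Hypothetical wrapper around the hawq commands shown above.
# DRY_RUN=1 (the default here) prints each command instead of executing it.
DRY_RUN="${DRY_RUN:-1}"

# Load the OushuDB environment only when commands will really be executed.
# The path is the one from the example above; adjust it to your installation.
[ "$DRY_RUN" = "1" ] || . /usr/local/hawq/greenplum_path.sh

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# hawq_ctl ACTION OBJECT
#   ACTION: start | stop | restart
#   OBJECT: cluster (whole cluster) | master | segment (local host only)
hawq_ctl() {
    action="$1"
    object="$2"
    case "$action" in
        start|stop|restart) ;;
        *) echo "unknown action: $action" >&2; return 1 ;;
    esac
    case "$object" in
        cluster|master|segment) ;;
        *) echo "unknown object: $object" >&2; return 1 ;;
    esac
    run hawq "$action" "$object"
}

# Example calls (printed only, because DRY_RUN defaults to 1)
hawq_ctl restart cluster
hawq_ctl stop segment
```

Setting `DRY_RUN=0` in the environment makes the same calls execute the real hawq commands. Validating the arguments up front keeps a typo from being passed on to hawq.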

Starting and Stopping Magma

OushuDB 4.0 can start and stop the Magma service on its own. The commands are as follows:

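# Option 1: start and stop the Magma service together with the OushuDB 4.0 cluster
# (only the hawq init|start|stop cluster commands accept the --with_magma option)
hawq init cluster --with_magma  # with --with_magma, the Magma service is started together with the OushuDB cluster; not supported in 3.X
# Option 2: start and stop the Magma service on its own
magma start|stop|restart cluster
magma start|stop|restart node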

For detailed usage of the OushuDB hawq command, run "hawq --help".

changlei:build ChangLei$ hawq --help

 usage: hawq <command> [<object>] [options]
         [--version]

The most commonly used hawq "commands" are:
start         Start hawq service.
stop          Stop hawq service.
init          Init hawq service.
restart       Restart hawq service.
activate      Activate hawq standby master as master.
version       Show hawq version information.
config        Set hawq GUC values.
state         Show hawq cluster status.
filespace     Create hawq filespaces.
extract       Extract table metadata into a YAML formatted file.
load          Load data into hawq.
scp           Copies files between multiple hosts at once.
ssh           Provides ssh access to multiple hosts at once.
ssh-exkeys    Exchanges SSH public keys between hosts.
check         Verifies and validates HAWQ settings.
checkperf     Verifies the baseline hardware performance of hosts.
register      Register parquet files generated by other system into the corrsponding table in HAWQ

See 'hawq <command> help' for more information on a specific command.

2. Creating Databases and Tables

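This section uses psql, the OushuDB command-line tool, to show how to create the basic database objects: database and table. Because OushuDB is compatible with PostgreSQL, using OushuDB is largely the same as using PostgreSQL. Where the OushuDB documentation is unclear, users can also consult the PostgreSQL documentation to learn more.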

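The session below uses psql to connect to postgres, the database that OushuDB installs by default, then creates a new database named test and creates a table named foo inside it.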

changlei:build ChangLei$ psql -d postgres
 psql (8.2.15)
 Type "help" for help.

 postgres=# create database test;  # create the database test
 CREATE DATABASE

 postgres=# \c test  # connect to the database test
 You are now connected to database "test" as user "ChangLei".

 test=# create table foo(id int, name varchar);  # create the table foo
 CREATE TABLE

 test=# \d  # list all tables in the current database test
            List of relations
 Schema | Name | Type  |  Owner   |   Storage
--------+------+-------+----------+-------------
 public | foo  | table | ChangLei | append only
 (1 row)


 test=# insert into foo values(1, 'hawq'),(2, 'hdfs');
 INSERT 0 2

 test=# select * from foo; # select the data in table foo
  id | name
 ----+------
   1 | hawq
   2 | hdfs
 (2 rows)

To delete a table or a database, use the drop statement.

 test=# drop table foo;
 DROP TABLE

 test=# \d
 No relations found.

 test=# drop database test;  # fails, because the session is currently connected to test
 ERROR:  cannot drop the currently open database

 test=# \c postgres  # connect to the postgres database first, then drop test
 You are now connected to database "postgres" as user "ChangLei".

 postgres=# drop database test;
 DROP DATABASE
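
The same sequence can also be replayed from a script rather than typed interactively. The sketch below is an illustration only: `run_sql` and `DRY_RUN` are hypothetical names, and it relies on the standard psql options `-d` (database) and `-c` (run one command). By default it prints the statements instead of sending them to a server.

```shell
#!/bin/sh
# Sketch: replay the interactive session above non-interactively.
# DRY_RUN=1 (the default here) prints each statement instead of running it.
DRY_RUN="${DRY_RUN:-1}"

# run_sql DATABASE SQL : run one statement in the given database
run_sql() {
    db="$1"
    sql="$2"
    if [ "$DRY_RUN" = "1" ]; then
        echo "[$db] $sql"
    else
        psql -d "$db" -c "$sql"
    fi
}

# The last step connects to postgres, not test, because the
# currently open database cannot be dropped.
run_sql postgres "create database test;" &&
run_sql test "create table foo(id int, name varchar);" &&
run_sql test "insert into foo values(1, 'hawq'),(2, 'hdfs');" &&
run_sql test "select * from foo;" &&
run_sql test "drop table foo;" &&
run_sql postgres "drop database test;"
```

Chaining the calls with `&&` stops the script at the first step that returns a non-zero status, so a failed create does not lead to inserts against a missing table.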

3. Inspecting Query Execution

The \timing command prints how long each query takes to execute.

 test=# \timing on
 Timing is on.

 test=# select * from foo; # with timing on, each SQL statement reports its execution time
  id | name
 ----+------
   1 | hawq
   2 | hdfs
 (2 rows)

 Time: 16.369 ms

 test=# \timing off  # turn the timing output off
 Timing is off.

The explain statement displays the query plan.
 test=# explain select count(*) from foo;
                                     QUERY PLAN
 ----------------------------------------------------------------------------------
  Aggregate  (cost=1.07..1.08 rows=1 width=8)
    ->  Gather Motion 1:1  (slice1; segments: 1)  (cost=1.03..1.06 rows=1 width=8)
      ->  Aggregate  (cost=1.03..1.04 rows=1 width=8)
            ->  Append-only Scan on foo  (cost=0.00..1.02 rows=2 width=0)
  Settings:  default_hash_table_bucket_number=6
 (5 rows)


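explain analyze shows what happened when the query actually ran, including when each operator started and finished. This helps users locate the bottleneck of a query and then optimize it. For an explanation of query plans and of the explain analyze output, see the chapter on query plans and query execution. A single query can have a very large number of possible plans, and producing a good one is the job of the query optimizer. How long a query takes depends heavily on its plan, so being familiar with query plans and with how a specific query executes is very valuable for query optimization.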

test=# explain analyze select count(*) from foo;
 --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Aggregate  (cost=1.07..1.08 rows=1 width=8)
Rows out:  Avg 1.0 rows x 1 workers.  Max/Last(seg-1:changlei/seg-1:changlei) 1/1 rows with 5.944/5.944 ms to end, start offset by 6.568/6.568 ms.
->  Gather Motion 1:1  (slice1; segments: 1)  (cost=1.03..1.06 rows=1 width=8)
      Rows out:  Avg 1.0 rows x 1 workers at destination.  Max/Last(seg-1:changlei/seg-1:changlei) 1/1 rows with 5.941/5.941 ms to first row, 5.942/5.942 ms to end, start offset by 6.569/6.569 ms.
      ->  Aggregate  (cost=1.03..1.04 rows=1 width=8)
            Rows out:  Avg 1.0 rows x 1 workers.  Max/Last(seg0:changlei/seg0:changlei) 1/1 rows with 5.035/5.035 ms to first row, 5.036/5.036 ms to end, start offset by 7.396/7.396 ms.
            ->  Append-only Scan on foo  (cost=0.00..1.02 rows=2 width=0)
                  Rows out:  Avg 2.0 rows x 1 workers.  Max/Last(seg0:changlei/seg0:changlei) 2/2 rows with 5.011/5.011 ms to first row, 5.032/5.032 ms to end, start offset by 7.397/7.397 ms.
Slice statistics:
  (slice0)    Executor memory: 223K bytes.
  (slice1)    Executor memory: 279K bytes (seg0:changlei).
Statement statistics:
  Memory used: 262144K bytes
Settings:  default_hash_table_bucket_number=6
Dispatcher statistics:
  executors used(total/cached/new connection): (1/1/0); dispatcher time(total/connection/dispatch data): (1.462 ms/0.000 ms/0.029 ms).
  dispatch data time(max/min/avg): (0.029 ms/0.029 ms/0.029 ms); consume executor data time(max/min/avg): (0.012 ms/0.012 ms/0.012 ms); free executor time(max/min/avg): (0.000 ms/0.000 ms/0.000 ms).
Data locality statistics:
  data locality ratio: 1.000; virtual segment number: 1; different host number: 1; virtual segment number per host(avg/min/max): (1/1/1); segment size(avg/min/max): (56.000 B/56 B/56 B); segment size with penalty(avg/min/max): (56.000 B/56 B/56 B); continuity(avg/min/max): (1.000/1.000/1.000); DFS metadatacache: 0.049 ms; resource allocation: 0.612 ms; datalocality calculation: 0.085 ms.
Total runtime: 13.398 ms
(20 rows)
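
When comparing several runs, it can help to pull just the total runtime out of the explain analyze output. The following is a minimal sketch, assuming the output contains a "Total runtime: N ms" line as in the example above. The function name `total_runtime_ms` is hypothetical, and the sample text is copied from that output.

```shell
#!/bin/sh
# total_runtime_ms: read explain analyze output on stdin and print
# the number from its "Total runtime: N ms" line.
total_runtime_ms() {
    sed -n 's/^ *Total runtime: *\([0-9.]*\) ms.*$/\1/p'
}

# Sample lines copied from the output above
sample='Aggregate  (cost=1.07..1.08 rows=1 width=8)
Settings:  default_hash_table_bucket_number=6
Total runtime: 13.398 ms
(20 rows)'

printf '%s\n' "$sample" | total_runtime_ms
```

Against a running system the same function could be fed from psql, for example `psql -d test -c "explain analyze select count(*) from foo;" | total_runtime_ms`.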