Greenplum6 数据库数据库学习_基本语法

Greenplum6 数据库

Greenplum6 数据库分布

  1. hash分布
  2. 随机分布
  3. 复制分布

基本语法介绍

  1. 获取语法
[gpadmin@mdw logs]$ psql
psql (9.4.24)
Type "help" for help.

postgres=# \h
Available help:
  ABORT                            ALTER TEXT SEARCH TEMPLATE       CREATE RESOURCE GROUP            DROP FUNCTION                    LOAD
  ALTER AGGREGATE                  ALTER TRIGGER                    CREATE RESOURCE QUEUE            DROP GROUP                       LOCK
  ALTER COLLATION                  ALTER TYPE                       CREATE ROLE                      DROP INDEX                       MOVE
  ALTER CONVERSION                 ALTER USER                       CREATE RULE                      DROP LANGUAGE                    NOTIFY
  ALTER DATABASE                   ALTER USER MAPPING               CREATE SCHEMA                    DROP MATERIALIZED VIEW           PREPARE
  ALTER DEFAULT PRIVILEGES         ALTER VIEW                       CREATE SEQUENCE                  DROP OPERATOR                    PREPARE TRANSACTION
  ALTER DOMAIN                     ANALYZE                          CREATE SERVER                    DROP OPERATOR CLASS              REASSIGN OWNED
  ALTER EVENT TRIGGER              BEGIN                            CREATE TABLE                     DROP OPERATOR FAMILY             REFRESH MATERIALIZED VIEW
  ALTER EXTENSION                  CHECKPOINT                       CREATE TABLE AS                  DROP OWNED                       REINDEX
  ALTER EXTERNAL TABLE             CLOSE                            CREATE TABLESPACE                DROP PROTOCOL                    RELEASE SAVEPOINT
  ALTER FOREIGN DATA WRAPPER       CLUSTER                          CREATE TEXT SEARCH CONFIGURATION DROP RESOURCE GROUP              RESET
  ALTER FOREIGN TABLE              COMMENT                          CREATE TEXT SEARCH DICTIONARY    DROP RESOURCE QUEUE              REVOKE
  ALTER FUNCTION                   COMMIT                           CREATE TEXT SEARCH PARSER        DROP ROLE                        ROLLBACK
  ALTER GROUP                      COMMIT PREPARED                  CREATE TEXT SEARCH TEMPLATE      DROP RULE                        ROLLBACK PREPARED
  ALTER INDEX                      COPY                             CREATE TRIGGER                   DROP SCHEMA                      ROLLBACK TO SAVEPOINT
  ALTER LANGUAGE                   CREATE AGGREGATE                 CREATE TYPE                      DROP SEQUENCE                    SAVEPOINT
  ALTER LARGE OBJECT               CREATE CAST                      CREATE USER                      DROP SERVER                      SECURITY LABEL
  ALTER MATERIALIZED VIEW          CREATE COLLATION                 CREATE USER MAPPING              DROP TABLE                       SELECT
  ALTER OPERATOR                   CREATE CONVERSION                CREATE VIEW                      DROP TABLESPACE                  SELECT INTO
  ALTER OPERATOR CLASS             CREATE DATABASE                  DEALLOCATE                       DROP TEXT SEARCH CONFIGURATION   SET
  ALTER OPERATOR FAMILY            CREATE DOMAIN                    DECLARE                          DROP TEXT SEARCH DICTIONARY      SET CONSTRAINTS
  ALTER PROTOCOL                   CREATE EVENT TRIGGER             DELETE                           DROP TEXT SEARCH PARSER          SET ROLE
  ALTER RESOURCE GROUP             CREATE EXTENSION                 DISCARD                          DROP TEXT SEARCH TEMPLATE        SET SESSION AUTHORIZATION
  ALTER RESOURCE QUEUE             CREATE EXTERNAL TABLE            DO                               DROP TRIGGER                     SET TRANSACTION
  ALTER ROLE                       CREATE FOREIGN DATA WRAPPER      DROP AGGREGATE                   DROP TYPE                        SHOW
  ALTER RULE                       CREATE FOREIGN TABLE             DROP CAST                        DROP USER                        START TRANSACTION
  ALTER SCHEMA                     CREATE FUNCTION                  DROP COLLATION                   DROP USER MAPPING                TABLE
  ALTER SEQUENCE                   CREATE GROUP                     DROP CONVERSION                  DROP VIEW                        TRUNCATE
  ALTER SERVER                     CREATE INDEX                     DROP DATABASE                    END                              UNLISTEN
  ALTER SYSTEM                     CREATE LANGUAGE                  DROP DOMAIN                      EXECUTE                          UPDATE
  ALTER TABLE                      CREATE MATERIALIZED VIEW         DROP EVENT TRIGGER               EXPLAIN                          VACUUM
  ALTER TABLESPACE                 CREATE OPERATOR                  DROP EXTENSION                   FETCH                            VALUES
  ALTER TEXT SEARCH CONFIGURATION  CREATE OPERATOR CLASS            DROP EXTERNAL TABLE              GRANT                            WITH
  ALTER TEXT SEARCH DICTIONARY     CREATE OPERATOR FAMILY           DROP FOREIGN DATA WRAPPER        INSERT                           
  ALTER TEXT SEARCH PARSER         CREATE PROTOCOL                  DROP FOREIGN TABLE               LISTEN                           
postgres=#  
  1. 创建数据库
    
    [gpadmin@mdw logs]$ createdb testDB -E utf-8
    [gpadmin@mdw logs]$ psql -h 10.10.10.101 -p 5432 -d testDB -U gpadmin
    psql (9.4.24)
    Type "help" for help.

testDB=# \q
[gpadmin@mdw logs]$ export PGDATABASE=testDB
[gpadmin@mdw logs]$ psql
psql (9.4.24)
Type "help" for help.

testDB=#

3. 建表语句
  • GreenPlum中创建表需要指定表的分布键。
  • 如果表需要用某个字段分区,可以通过partition by 将表建成分区表。
  • 可以使用like操作创建与like的表一样结构的表,功能类似create table t1 as select * from t2 limit 0。
  • 可以使用inherits实现表的继承,具体实现参考postgresql文档。

--语法查询
\h create table
--创建表
create table test001(id int,name varchar(128)); --id 为分布键
create table test002(id int,name varchar(128)) distributed by (id); --同上

testDB=# create table test001(id int,name varchar(128));
NOTICE: Table doesn't have 'DISTRIBUTED BY' clause -- Using column named 'id' as the Greenplum Database data distribution key for this table.
HINT: The 'DISTRIBUTED BY' clause determines the distribution of data. Make sure column(s) chosen are the optimal data distribution key to minimize skew.
CREATE TABLE
testDB=# create table test002(id int,name varchar(128)) distributed by (id);
CREATE TABLE
testDB=#

create table test003(id int,name varchar(128)) distributed by (id,name) --多个分布键
testDB=# create table test003(id int,name varchar(128)) distributed by (id,name);
CREATE TABLE

create table test004(id int,name varchar(128)) distributed randomly; --随机分布键
testDB=# create table test004(id int,name varchar(128)) distributed randomly;
CREATE TABLE

create table test005(id int primary key ,name varchar(128));
create table test006(id int unique ,name varchar(128));

testDB=# create table test005(id int primary key ,name varchar(128));
CREATE TABLE
testDB=# create table test006(id int unique ,name varchar(128));
CREATE TABLE
testDB=#

采用随机分布策略的表默认将主键,或者唯一键作为分布键,因为每个Segment都是一个单一的数据库,单个数据库可以确保唯一性,多个数据库节点就无法保证全局的跨库唯一性,故只能按照唯一键分布,同一个值的数据都在一个节点上,以此来保证唯一性。

--如果指定的分布键与主键盘不一样,那么分布键会被更改为主键。在greenplum6 中这句话貌似显示不兼容,如下:

create table test007(id int unique,name varchar(128)) distributed by (id,name);

testDB=# create table test007(id int unique,name varchar(128)) distributed by (id,name);
ERROR: UNIQUE constraint and DISTRIBUTED BY definitions are incompatible
HINT: When there is both a UNIQUE constraint and a DISTRIBUTED BY clause, the DISTRIBUTED BY clause must be a subset of the UNIQUE constraint.
testDB=#
按照它的要求进行修改,我们distribute by 修改成id
testDB=# create table test007(id int unique,name varchar(128)) distributed by (id);
CREATE TABLE
testDB=#

---创建一模一样的 表
create table test_like (like test001);

--使用like 创建表的时候,只是表结构会跟原表一摸一样,表的特殊属性并不会一样,例如压缩,只增(appendonly)等属性,如果不指定分布键,默认分布键与原表一致。

testDB=# create table test_like (like test001);
NOTICE: table doesn't have 'DISTRIBUTED BY' clause, defaulting to distribution columns from LIKE table
CREATE TABLE
testDB=#

4. 插入语句 

插入语句
执行insert语句注意分布键不要为空,否则分布键默认变成null',数据都被保存到一个节点上会导致分布不均

insert into test001 values (100,'tom'),(101,'lily'),(102,'jack'),(103,'linda');
insert into test002 values (200,'tom'),(101,'lily'),(202,'jack'),(103,'linda');

5. 更新语句

不能批量对分布键执行update,因为分布键执行update需要将数据重分布.

testDB=# update test002 set id=203 where id=202;
UPDATE 1
testDB=#

6. 删除语句delete--truncate

在Greenplum 3.x 的版本中,如果delete 操作涉及子查询,子查询的结果还涉及到数据重分布,这样的删除语句会报错,Greenplum 4.x以上,支持该操作。
testDB=# delete from test001 where name in (select name from test002);
DELETE 4
testDB=#

如果对整张表执行delete会比较慢,建议使用truncate.

truncate执行truncate直接删除表的物理文件,然后创建新的数据文件。如果有sql正在操作这张表,那么truncate会被锁住,直到表上面的所有锁会被释放。

7. 查询语句

testDB=# select * from test001 x,test002 y where x.id=y.id;
id | name | id | name
-----+-------+-----+-------
103 | linda | 103 | linda
101 | lily | 101 | lily
(2 rows)

8. 执行计划

testDB=# explain select * from test001 x,test002 y where x.id=y.id;
QUERY PLAN

Gather Motion 6:1 (slice1; segments: 6) (cost=0.00..862.00 rows=5 width=18)
-> Hash Join (cost=0.00..862.00 rows=1 width=18)
Hash Cond: (test001.id = test002.id)
-> Seq Scan on test001 (cost=0.00..431.00 rows=1 width=9)
-> Hash (cost=431.00..431.00 rows=1 width=9)
-> Seq Scan on test002 (cost=0.00..431.00 rows=1 width=9)
Optimizer: Pivotal Optimizer (GPORCA)
(7 rows)

上一篇:truncate/delete/drop 区别


下一篇:【错误解决】之“OperationalError: (sqlite3.OperationalError) table ** has no column named **”