Sequence(序列)是数据库经常使用自增列属性,对于单机PostgreSQL实例,数据库维护一个自增变量即可。但是对于Greenplum的MPP架构,如果每个节点都维护自己的Sequence,那么Sequence将会出现重复,那么Greenplum是如何处理的呢?
如何使用Sequence
create table test_sequence(id serial, name text);
postgres=> \d
public | test_sequence | table | postgres
public | test_sequence_id_seq | sequence | postgres
postgres=> insert into test_sequence (name) values(1);
INSERT 0 1
postgres=> insert into test_sequence (name) values(2);
INSERT 0 1
postgres=> insert into test_sequence (name) values(3);
INSERT 0 1
postgres=> select * from test_sequence;
id | name
----+------
3 | 3
1 | 1
2 | 2
(3 rows)
Sequence是谁维护
查看Master的Sequence维护元信息
postgres=> select * from test_sequence_id_seq;
sequence_name | last_value | increment_by | max_value | min_value | cache_value | log_cnt | is_cycled | is_called
----------------------+------------+--------------+---------------------+-----------+-------------+---------+-----------+-----------
test_sequence_id_seq | 3 | 1 | 9223372036854775807 | 1 | 1 | 30 | f | t
(1 row)
查看Segment的Sequence维护元信息
postgres=> select * from gp_dist_random('test_sequence_id_seq');
sequence_name | last_value | increment_by | max_value | min_value | cache_value | log_cnt | is_cycled | is_called
----------------------+------------+--------------+---------------------+-----------+-------------+---------+-----------+-----------
test_sequence_id_seq | 1 | 1 | 9223372036854775807 | 1 | 1 | 1 | f | f
test_sequence_id_seq | 1 | 1 | 9223372036854775807 | 1 | 1 | 1 | f | f
test_sequence_id_seq | 1 | 1 | 9223372036854775807 | 1 | 1 | 1 | f | f
test_sequence_id_seq | 1 | 1 | 9223372036854775807 | 1 | 1 | 1 | f | f
(4 rows)
通过Master和Segment的元信息可以看出,只有Master一直更新元信息,Segment的Sequence元信息一直不变,所以Sequence是由Master维护的。
Sequence分配过程
如何所示,Master上有一个seqserver进程,专门用来维护全局的Sequence信息。所有的Segment获取最新的Sequence都需要向Master的seqserver请求,然后seqserver更新Sequence云信息,返回给Segment。为了实现Sequence,Master和Segment多了一次交互,这样会影响性能,建议应用层生成自增值。