os: ubuntu 16.04
postgresql: 9.6.8
citus: postgresql-9.6-citus 8.0.0

IP plan:

192.168.0.92 pgsql1 -- coordinator node

192.168.0.90 pgsql2 -- worker node
192.168.0.88 pgsql3 -- worker node

Now add one more node:
192.168.0.86 pgsql4 -- worker node

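Prepare the pgsql4 node

Prepare the OS, install PostgreSQL and Citus, create the user and database, and create the extension.
See the earlier posts in this series for the details; they are omitted here.

Add the worker node on the coordinator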

citusdb=# select * from master_get_active_worker_nodes();
  node_name   | node_port 
--------------+-----------
 192.168.0.88 |      5432
 192.168.0.90 |      5432
(2 rows)

citusdb=# select * from master_add_node('192.168.0.86',5432);
NOTICE:  Replicating reference table "ref_t0" to the node 192.168.0.86:5432
NOTICE:  Replicating reference table "ref_t1" to the node 192.168.0.86:5432
 nodeid | groupid |   nodename   | nodeport | noderack | hasmetadata | isactive | noderole | nodecluster 
--------+---------+--------------+----------+----------+-------------+----------+----------+-------------
      3 |       3 | 192.168.0.86 |     5432 | default  | f           | t        | primary  | default
(1 row)
citusdb=# 
citusdb=# select * from master_get_active_worker_nodes();
  node_name   | node_port 
--------------+-----------
 192.168.0.88 |      5432
 192.168.0.86 |      5432
 192.168.0.90 |      5432
(3 rows)

citusdb=#

When master_add_node is called, existing reference tables are replicated to the new node immediately; existing distributed tables are not changed at all.
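
To see that a distributed table really did not move, one option is to count its shard placements per worker from the Citus metadata tables. The following is a minimal sketch, not part of the original procedure: the helper name placement_summary_sql is made up for illustration, and it relies only on the pg_dist_shard, pg_dist_placement and pg_dist_node columns that appear in the query output in this post. A worker that holds no shards of the table simply does not appear in the result.

```shell
#!/bin/sh
# Hypothetical helper: build a query that counts shard placements per worker
# for one distributed table. $1 is the table name (tmp_t0 in this post).
placement_summary_sql() {
  cat <<EOF
SELECT n.nodename, n.nodeport, count(*) AS shard_count
FROM pg_dist_shard s
JOIN pg_dist_placement p ON p.shardid = s.shardid
JOIN pg_dist_node n ON n.groupid = p.groupid
WHERE s.logicalrelid = '$1'::regclass
GROUP BY n.nodename, n.nodeport
ORDER BY n.nodename;
EOF
}

# Print the query. To run it, pipe it to psql on the coordinator, for example:
#   placement_summary_sql tmp_t0 | psql -h 192.168.0.92 -U cituser citusdb
placement_summary_sql tmp_t0
```

Running the same query again after the manual move should show the shards under 192.168.0.86.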

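Rebalance distributed tables from the coordinator

After adding a new worker, you naturally want the distributed tables to move part of their data onto it.
Running rebalance_table_shards shows that the function does not exist.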

citusdb=# select rebalance_table_shards('tmp_t0');

The official Citus Data documentation says:

Citus Enterprise and Citus Cloud provide a rebalance_table_shards function to make it easier. 
This function will move the shards of a given table to distribute them evenly among the workers.

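Unfortunately, only Enterprise and Cloud provide this function; the community edition does not.
So it has to be done manually.

Copy the shards

-- Shard placement information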

citusdb=# select * from pg_dist_placement
where shardid in (select shardid from pg_dist_shard where logicalrelid='tmp_t0'::regclass);

 placementid | shardid | shardstate | shardlength | groupid 
-------------+---------+------------+-------------+---------
          71 |  102074 |          1 |           0 |       2
          72 |  102075 |          1 |           0 |       1

-- List the worker nodes

citusdb=# select * from pg_dist_node;
 nodeid | groupid |   nodename   | nodeport | noderack | hasmetadata | isactive | noderole | nodecluster 
--------+---------+--------------+----------+----------+-------------+----------+----------+-------------
      1 |       1 | 192.168.0.90 |     5432 | default  | f           | t        | primary  | default
      2 |       2 | 192.168.0.88 |     5432 | default  | f           | t        | primary  | default
      4 |       4 | 192.168.0.86 |     5432 | default  | f           | t        | primary  | default
(3 rows)

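On PostgreSQL 10 and later, logical replication can do this quickly.
On earlier versions, pg_dump can be used during a scheduled maintenance window.
Make absolutely sure the tables being moved receive no data changes during the move.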
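
For the PostgreSQL 10+ route, the general shape is a publication on the source worker and a subscription on the new worker. The sketch below only prints the statements; it was not run in this environment (which is 9.6), and it assumes wal_level = logical on the source, an empty shard table with the same definition already created on the new worker, and a connecting user that is allowed to replicate. The helper name logical_copy_sql and the pub_/sub_ object names are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print the statements for copying one shard table
# with logical replication. $1 = source worker host, $2 = shard table name.
logical_copy_sql() {
  cat <<EOF
-- on the source worker ($1); requires wal_level = logical
CREATE PUBLICATION pub_$2 FOR TABLE $2;
-- on the new worker, after creating an empty $2 with the same definition
CREATE SUBSCRIPTION sub_$2
  CONNECTION 'host=$1 dbname=citusdb user=cituser'
  PUBLICATION pub_$2;
EOF
}

logical_copy_sql 192.168.0.90 tmp_t0_102075
```

Once the copy has caught up and the metadata has been switched, the subscription and publication would need to be dropped again.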

On pgsql1, export the shard tables from the two existing workers (pgsql2 and pgsql3)

$ pg_dump -U cituser -h 192.168.0.90 -t tmp_t0_102075 -b -v -f /tmp/tmp_t0_102075.sql citusdb 
$ pg_dump -U cituser -h 192.168.0.88 -t tmp_t0_102074 -b -v -f /tmp/tmp_t0_102074.sql citusdb

Import the shard tables into pgsql4

$ psql -h 192.168.0.86 -U cituser citusdb < /tmp/tmp_t0_102074.sql
$ psql -h 192.168.0.86 -U cituser citusdb < /tmp/tmp_t0_102075.sql
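
Before touching any metadata, it seems prudent to confirm that each copied shard has the same number of rows as its source. This is a minimal sketch of such a check, not something the original procedure included; shard_count and counts_match are hypothetical helper names, and a row count is only a coarse check, not a full content comparison.

```shell
#!/bin/sh
# Hypothetical helper: count the rows of one shard table on one worker.
# $1 = worker host, $2 = shard table name.
# -A (unaligned) and -t (tuples only) make psql print just the number.
shard_count() {
  psql -h "$1" -U cituser -At -c "select count(*) from $2" citusdb
}

# Succeed only if both counts are non-empty and identical.
# $1 = count on the source worker, $2 = count on the new worker.
counts_match() {
  [ -n "$1" ] && [ "$1" = "$2" ]
}

# Usage (needs network access to both workers):
#   src=$(shard_count 192.168.0.90 tmp_t0_102075)
#   dst=$(shard_count 192.168.0.86 tmp_t0_102075)
#   counts_match "$src" "$dst" && echo "counts match" || echo "MISMATCH"
```

The empty-string guard matters: if psql fails on both sides, two empty results would otherwise compare as equal.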

Update the metadata

On pgsql1, update the placement metadata so that both shards point to the new worker (groupid 4)

citusdb=# update pg_dist_placement set groupid=4 where shardid=102075 and groupid=1;
citusdb=# update pg_dist_placement set groupid=4 where shardid=102074 and groupid=2;
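
Since these updates edit the Citus catalog by hand, one cautious variant is to run each of them inside an explicit transaction and look at the row before committing. The sketch below only prints the statements for pasting into an interactive psql session on the coordinator; move_placement_sql is a hypothetical helper name, and the shard and group IDs are the ones from this post.

```shell
#!/bin/sh
# Hypothetical helper: print an open transaction that repoints one shard
# placement. $1 = shardid, $2 = old groupid, $3 = new groupid.
move_placement_sql() {
  cat <<EOF
BEGIN;
UPDATE pg_dist_placement SET groupid = $3 WHERE shardid = $1 AND groupid = $2;
SELECT shardid, groupid FROM pg_dist_placement WHERE shardid = $1;
-- inspect the row, then type COMMIT; to keep the change or ROLLBACK; to undo it
EOF
}

move_placement_sql 102075 1 4
move_placement_sql 102074 2 4
```

The UPDATE should report exactly one row per shard; anything else suggests the shardid or old groupid does not match what is in pg_dist_placement.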

Drop the shards

The source copies of the shards now need to be dropped.
Drop the shard on pgsql2

citusdb=# drop table tmp_t0_102075;

Drop the shard on pgsql3

citusdb=# drop table tmp_t0_102074;

Verify

Query on pgsql1

citusdb=# select count(1) from tmp_t0;
  count  
---------
 2000000
(1 row)

References:
https://www.citusdata.com/
https://docs.citusdata.com/en/v8.0/
https://docs.citusdata.com/en/stable/index.html

Citus, part 4: add node + pg_dump rebalance