Greenplum 扩容二
准备阶段
0 明确信息
CentOS release 6.3
greenplum 4.3.x
3个计算节点的系统,测试需要停服十分钟
1 准备扩容相关文件(root)
vim home/gpadmin/expand_hosts
注意填写的一定是登录主机后,hostname的结果
kkk-dba-tmp-gp-s04.kkk
kkk-dba-tmp-gp-s05.kkk
vim home/gpadmin/existing_hosts_file
kkk-dba-tmp-gp-m00.kkk
kkk-dba-tmp-gp-m01.kkk
kkk-dba-tmp-gp-s01.kkk
kkk-dba-tmp-gp-s02.kkk
kkk-dba-tmp-gp-s03.kkk
vim home/gpadmin/all_hosts_new
kkk-dba-tmp-gp-m00.kkk
kkk-dba-tmp-gp-m01.kkk
kkk-dba-tmp-gp-s01.kkk
kkk-dba-tmp-gp-s02.kkk
kkk-dba-tmp-gp-s03.kkk
kkk-dba-tmp-gp-s04.kkk
kkk-dba-tmp-gp-s05.kkk
2 查看/etc/hosts文件(root)
3 检查新加入机器参数是否和集群已有机器计算节点的以下参数是否正确且一致(root)
gpcheck -f expand_hosts
内核版本 uname -a
操作系统 lsb_release -a
内存 free -m
cpu
cat /proc/cpuinfo| grep "processor"| wc -l
cat /proc/cpuinfo | grep model | sort | uniq
磁盘 df -h
4 建立信任关系(root)
source /home/greenplum_5500/greenplum_path.sh
gpssh-exkeys -e /home/gpadmin/existing_hosts_file -x /home/gpadmin/expand_hosts
5 新机器创建数据目录(root)
gpssh -f /home/gpadmin/expand_hosts
mkdir -p /home/ssd1/data1/primary/
mkdir -p /home/ssd1/data1/mirror/
mkdir -p /home/ssd1/data2/primary/
mkdir -p /home/ssd1/data2/mirror/
mkdir -p /home/ssd2/data3/primary/
mkdir -p /home/ssd2/data3/mirror/
mkdir -p /home/ssd2/data4/primary/
mkdir -p /home/ssd2/data4/mirror/
chown gpadmin:gpadmin -R /home/ssd1
chown gpadmin:gpadmin -R /home/ssd2
6 确认磁盘I/O和内存带宽(root)
gpcheckperf -f /home/gpadmin/expand_hosts -d /home/ssd1 -d /home/ssd2 -v -D –V
7 检查时钟同步(root)
gpssh -f /home/gpadmin/all_hosts_new
=>date
service ntpd start
chkconfig ntpd on
8 初始化新机器(配置系统环境)(root)
1、修改/etc/sysctl.conf
gpscp -f /home/gpadmin/expand_hosts /etc/sysctl.conf =:/etc/sysctl.conf
gpssh -f /home/gpadmin/expand_hosts "sysctl -p"
2、拷贝/etc/security/limits.conf
gpscp -f /home/gpadmin/expand_hosts /etc/security/limits.conf =:/etc/security/limits.conf
gpscp -f /home/gpadmin/expand_hosts /etc/security/limits.d/90-nproc.conf =:/etc/security/limits.d/90-nproc.conf
3、强写当前磁盘调度策略(ssd不需要)
gpssh -f /home/gpadmin/expand_hosts "echo deadline > /sys/block/sda/queue/scheduler"
4、设置预读大小
gpssh -f /home/gpadmin/expand_hosts "/sbin/blockdev --setra 16384 /dev/sda"
/dev/sda为数据目录,如果有多个,需要进行多次
5、拷贝/etc/rc.local(ssd不需要)
gpscp -f /home/gpadmin/expand_hosts /etc/rc.local =:/etc/rc.local
9 新机器部署GP(root)
gpseginstall -f /home/gpadmin/expand_hosts -u gpadmin -p gpadmin12222m
10 其余准备工作(root)
gpscp -f /home/gpadmin/expand_hosts /home/gpadmin/.bashrc =:/home/gpadmin/.bashrc
gpssh -f /home/gpadmin/expand_hosts "chown gpadmin:gpadmin -R /home/greenplum_5500; ln -s /home/greenplum_5500 /home/gpadmin/greenplum-db;chown gpadmin:gpadmin -R /home/gpadmin"
12 数据库状态检查(gpadmin)
通过gpstate -m命令检查是否有实例处于down的状态,数据库扩容前需要进行修复操作。
通过gpstate -f命令检查是否Standby Master与master之间的同步已失效,数据库扩容前需要恢复同步。
13 检查旧集群元数据是否有问题(gpadmin)
gpcheckcat -O -v
11 键处理(gpadmin)
-- tpch数据库需替换成业务数据库名称
psql -d tpch -c "select * from pg_indexes where indexdef like 'CREATE UNIQUE INDEX%' and tablename not like 'pg_%' and tablename not like 'gp_%';"
--导出所有的元数据信息
pg_dump -s -f filename.sql --gp-syntax database_name
--删除键
ALTER TABLE table_name DROP INDEX index_name;
扩容操作
1 创建expand库(gpadmin)
psql -c "create database expand;"
2 生成扩容文件(gpadmin)
gpexpand -f /home/gpadmin/expand_hosts -D expand
根据提示按需输入:
y
grouped
3 扩容(gpadmin)
好先封闭入口,避免一些不可预知流量使得各个节点的元数据不一致问题。
vi pg_hba.conf
适当位置添加
host all all 0.0.0.0/0 reject
gpstop -u
gpexpand -i {上一步输出文件} -D expand
在master上ps -fe | grep postgre和ps -fe | grep gpseg
连续ps查看,如果发现有gpseg在同一台机器上一直hang住,则可登录到对应的机器上具体的gpseg查看log
如果发现命令在某个机器上hang住,则登录到该机器上,检查每个primary下的日志,如果有日志显示一直等待mirror的相关信息,则登录到对应mirror机器上,检查该mirror的主端口和复制端口是否被占用。
回退阶段
1 扩容回滚(gpadmin)
后期处理
1 检查机器是否扩容成功(gpadmin)
psql -c "select hostname from gp_segment_configuration group by hostname;"
psql -c "select * from gp_segment_configuration where status != 'u' or role != preferred_role or mode != 's';"
psql -c "select * from gp_segment_configuration order by content asc,dbid;"
2 恢复服务(gpadmin)
3 数据重分布(gpadmin)
select localoid::regclass, * from gp_distribution_policy ;
如下,我们可以控制各个表的重分布优先级级别(rank=1优先级高,rank=10优先级低)
UPDATE gpexpand.status_detail SET rank=1 WHERE fq_name = 'public.lineitem';
UPDATE gpexpand.status_detail SET rank=1 WHERE fq_name = 'public.orders';
选择较大表(如>=400G)优先级更高,并发度为1进行
./big_tables_redistibuted.sh /home/greenplum_5500/ 127.0.0.1 5500 gpadmin 4 expand
$1 greenplum安装目录
$2 连接ip
$3 连接端口
$4 连接账户
$5 定义大表的界限(单位G)
$6 进行扩容进度控制的数据库
其余表(小于400G )并发度为2进行
gpexpand -a -D expand -n 2 -d 01:30:00
-d 指定重分布的进程跑多久,时间达到就停止
-e 指定重分布的进程跑到哪个时刻,跑到该时刻就停止
-n 指定并发数
-a 在重分布完数据后运行ANALYZE更新表的统计信息,默认是不运行ANALYZE
SELECT * FROM gpexpand.expansion_progress;
SELECT * FROM gpexpand.status;
4 恢复键(gpadmin)
5 统计信息更新(gpadmin)
6 元数据一致性验证(gpadmin)
gpcheckcat -O -v
7 删除expand数据库(gpadmin)
psql -c "drop database expand;"
可查杀相关任务后,再进行操作
完整扩容打印信息
[gpadmin@kkk-dba-tmp-gp-m00.kkk.baidu.com ~]$ gpexpand -i gpexpand_inputfile_20200122_121555 -D expand
20200122:14:26:18:013144 gpexpand:kkk-dba-tmp-gp-m00:gpadmin-[INFO]:-local Greenplum Version: 'postgres (Greenplum Database) 4.3.99.00 build dev'
20200122:14:26:18:013144 gpexpand:kkk-dba-tmp-gp-m00:gpadmin-[INFO]:-master Greenplum Version: 'PostgreSQL 8.2.15 (Greenplum Database 4.3.99.00 build dev) on x86_64-unknown-linux-gnu, compiled by GCC gcc (GCC) 4.4.6 20120305 (Red Hat 4.4.6-4) compiled on Aug 30 2016 14:29:24'
20200122:14:26:18:013144 gpexpand:kkk-dba-tmp-gp-m00:gpadmin-[INFO]:-Querying gpexpand schema for current expansion state
20200122:14:26:18:013144 gpexpand:kkk-dba-tmp-gp-m00:gpadmin-[INFO]:-Readying Greenplum Database for a new expansion
#此时将关闭所有实例
20200122:14:26:55:013144 gpexpand:kkk-dba-tmp-gp-m00:gpadmin-[INFO]:-Checking database tpch for unalterable tables...
20200122:14:26:55:013144 gpexpand:kkk-dba-tmp-gp-m00:gpadmin-[INFO]:-Checking database postgres for unalterable tables...
20200122:14:26:55:013144 gpexpand:kkk-dba-tmp-gp-m00:gpadmin-[INFO]:-Checking database expand for unalterable tables...
20200122:14:26:55:013144 gpexpand:kkk-dba-tmp-gp-m00:gpadmin-[INFO]:-Checking database template1 for unalterable tables...
20200122:14:26:55:013144 gpexpand:kkk-dba-tmp-gp-m00:gpadmin-[INFO]:-Checking database tpch for tables with unique indexes...
20200122:14:26:55:013144 gpexpand:kkk-dba-tmp-gp-m00:gpadmin-[INFO]:-Checking database postgres for tables with unique indexes...
20200122:14:26:55:013144 gpexpand:kkk-dba-tmp-gp-m00:gpadmin-[INFO]:-Checking database expand for tables with unique indexes...
20200122:14:26:55:013144 gpexpand:kkk-dba-tmp-gp-m00:gpadmin-[INFO]:-Checking database template1 for tables with unique indexes...
20200122:14:26:55:013144 gpexpand:kkk-dba-tmp-gp-m00:gpadmin-[INFO]:-Tables with unique indexes exist. Until these tables are successfully
20200122:14:26:55:013144 gpexpand:kkk-dba-tmp-gp-m00:gpadmin-[INFO]:-redistributed, unique constraints may be violated. For more information
20200122:14:26:55:013144 gpexpand:kkk-dba-tmp-gp-m00:gpadmin-[INFO]:-on this issue, see the Greenplum Database Administrator Guide
Would you like to continue with System Expansion Yy|Nn (default=N):
> y
20200122:14:27:41:013144 gpexpand:kkk-dba-tmp-gp-m00:gpadmin-[INFO]:-Syncing Greenplum Database extensions
20200122:14:27:42:013144 gpexpand:kkk-dba-tmp-gp-m00:gpadmin-[INFO]:-The packages on kkk-dba-tmp-gp-s04.kkk.baidu.com are consistent.
20200122:14:27:44:013144 gpexpand:kkk-dba-tmp-gp-m00:gpadmin-[INFO]:-The packages on kkk-dba-tmp-gp-s05.kkk.baidu.com are consistent.
20200122:14:27:45:013144 gpexpand:kkk-dba-tmp-gp-m00:gpadmin-[INFO]:-The packages on kkk-dba-tmp-gp-s06.kkk.baidu.com are consistent.
20200122:14:27:46:013144 gpexpand:kkk-dba-tmp-gp-m00:gpadmin-[INFO]:-The packages on kkk-dba-tmp-gp-s07.kkk.baidu.com are consistent.
20200122:14:27:48:013144 gpexpand:kkk-dba-tmp-gp-m00:gpadmin-[INFO]:-Creating segment template
20200122:14:27:48:013144 gpexpand:kkk-dba-tmp-gp-m00:gpadmin-[INFO]:-VACUUM FULL on the catalog tables
20200122:14:27:50:013144 gpexpand:kkk-dba-tmp-gp-m00:gpadmin-[INFO]:-Starting copy of segment dbid 1 to location /home/ssd1/data0/master/gpexpand_01222020_13144
20200122:14:27:50:013144 gpexpand:kkk-dba-tmp-gp-m00:gpadmin-[INFO]:-Copying postgresql.conf from existing segment into template
20200122:14:27:51:013144 gpexpand:kkk-dba-tmp-gp-m00:gpadmin-[INFO]:-Copying pg_hba.conf from existing segment into template
20200122:14:27:52:013144 gpexpand:kkk-dba-tmp-gp-m00:gpadmin-[INFO]:-Adding new segments into template pg_hba.conf
20200122:14:27:52:013144 gpexpand:kkk-dba-tmp-gp-m00:gpadmin-[INFO]:-Creating schema tar file
20200122:14:27:55:013144 gpexpand:kkk-dba-tmp-gp-m00:gpadmin-[INFO]:-Distributing template tar file to new hosts
20200122:14:27:59:013144 gpexpand:kkk-dba-tmp-gp-m00:gpadmin-[INFO]:-Configuring new segments (primary)
20200122:14:28:09:013144 gpexpand:kkk-dba-tmp-gp-m00:gpadmin-[INFO]:-Configuring new segments (mirror)
20200122:14:28:16:013144 gpexpand:kkk-dba-tmp-gp-m00:gpadmin-[INFO]:-Backing up pg_hba.conf file on original segments
20200122:14:28:18:013144 gpexpand:kkk-dba-tmp-gp-m00:gpadmin-[INFO]:-Copying new pg_hba.conf file to original segments
20200122:14:28:19:013144 gpexpand:kkk-dba-tmp-gp-m00:gpadmin-[INFO]:-Configuring original segments
20200122:14:28:19:013144 gpexpand:kkk-dba-tmp-gp-m00:gpadmin-[INFO]:-Cleaning up temporary template files
20200122:14:28:21:013144 gpexpand:kkk-dba-tmp-gp-m00:gpadmin-[INFO]:-Starting Greenplum Database in restricted mode
20200122:14:28:39:013144 gpexpand:kkk-dba-tmp-gp-m00:gpadmin-[INFO]:-Stopping database
20200122:14:29:15:013144 gpexpand:kkk-dba-tmp-gp-m00:gpadmin-[INFO]:-Checking if Transaction filespace was moved
20200122:14:29:15:013144 gpexpand:kkk-dba-tmp-gp-m00:gpadmin-[INFO]:-Checking if Temporary filespace was moved
20200122:14:29:15:013144 gpexpand:kkk-dba-tmp-gp-m00:gpadmin-[INFO]:-Configuring new segment filespaces
20200122:14:29:15:013144 gpexpand:kkk-dba-tmp-gp-m00:gpadmin-[INFO]:-Cleaning up databases in new segments.
20200122:14:29:15:013144 gpexpand:kkk-dba-tmp-gp-m00:gpadmin-[INFO]:-Starting master in utility mode
20200122:14:29:17:013144 gpexpand:kkk-dba-tmp-gp-m00:gpadmin-[INFO]:-Stopping master in utility mode
20200122:14:31:08:013144 gpexpand:kkk-dba-tmp-gp-m00:gpadmin-[INFO]:-Starting Greenplum Database in restricted mode
20200122:14:31:45:013144 gpexpand:kkk-dba-tmp-gp-m00:gpadmin-[INFO]:-Creating expansion schema
20200122:14:31:59:013144 gpexpand:kkk-dba-tmp-gp-m00:gpadmin-[INFO]:-Populating gpexpand.status_detail with data from database tpch
20200122:14:32:21:013144 gpexpand:kkk-dba-tmp-gp-m00:gpadmin-[INFO]:-Populating gpexpand.status_detail with data from database postgres
20200122:14:32:43:013144 gpexpand:kkk-dba-tmp-gp-m00:gpadmin-[INFO]:-Populating gpexpand.status_detail with data from database expand
20200122:14:33:04:013144 gpexpand:kkk-dba-tmp-gp-m00:gpadmin-[INFO]:-Populating gpexpand.status_detail with data from database template1
20200122:14:33:24:013144 gpexpand:kkk-dba-tmp-gp-m00:gpadmin-[INFO]:-Stopping Greenplum Database
20200122:14:36:09:013144 gpexpand:kkk-dba-tmp-gp-m00:gpadmin-[INFO]:-Starting Greenplum Database
20200122:14:36:30:013144 gpexpand:kkk-dba-tmp-gp-m00:gpadmin-[INFO]:-Starting new mirror segment synchronization
20200122:14:37:28:013144 gpexpand:kkk-dba-tmp-gp-m00:gpadmin-[INFO]:-************************************************
20200122:14:37:28:013144 gpexpand:kkk-dba-tmp-gp-m00:gpadmin-[INFO]:-Initialization of the system expansion complete.
20200122:14:37:28:013144 gpexpand:kkk-dba-tmp-gp-m00:gpadmin-[INFO]:-To begin table expansion onto the new segments
20200122:14:37:28:013144 gpexpand:kkk-dba-tmp-gp-m00:gpadmin-[INFO]:-rerun gpexpand
20200122:14:37:28:013144 gpexpand:kkk-dba-tmp-gp-m00:gpadmin-[INFO]:-************************************************
20200122:14:37:28:013144 gpexpand:kkk-dba-tmp-gp-m00:gpadmin-[INFO]:-Exiting...
相关文章