greenplum集群安装配置

今天安装了一个GP集群测试环境,反复折腾了好几次,安装过程中还是有一些关键点需要详细记录和说明一下,以备后续使用时参考,如果该笔记能够在你需要的时候有所帮助,本人不胜荣幸,不完美的地方也请谅解。

1、环境准备

这里准备了3台机器,1台master、2台存储

OS: centos7.5
GP: greenplum-db-6.16.3
JDK: 1.8+

# 集群主机信息
192.168.101.203 hdp1 (segment master)
192.168.101.204 hdp2 (segment)
192.168.101.205 hdp3 (segment)

greenplum 下载地址:https://github.com/greenplum-db/gpdb/releases

2、 系统参数配置

2.1 修改各个节点名称

修改主机名称信息主要是为了免密及集群建相互通信

[root@hdp1 greenplum-db]$ vi /etc/hosts

127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6
# 配置主机名到IP的映射
192.168.101.203 hdp1
192.168.101.204 hdp2
192.168.101.205 hdp3

首先ping下主机名称,查看配置信息是否生效,通过命令将hosts配置文件远程复制到其他服务器

scp /etc/hosts hdp2:/etc
2.2 修改linux系统内核参数
[root@hdp1 greenplum-db]$ vim /etc/sysctl.conf 

# sysctl settings are defined through files in
# /usr/lib/sysctl.d/, /run/sysctl.d/, and /etc/sysctl.d/.
#
# Vendors settings live in /usr/lib/sysctl.d/.
# To override a whole file, create a new file with the same in
# /etc/sysctl.d/ and put new settings there. To override
# only specific settings, add a file with a lexically later
# name in /etc/sysctl.d/ and put new settings there.
#
# For more information, see sysctl.conf(5) and sysctl.d(5).
net.ipv6.conf.all.disable_ipv6 = 1
net.ipv6.conf.default.disable_ipv6 = 1

kernel.shmmax = 500000000
kernel.shmmni = 4096
kernel.shmall = 4000000000
kernel.sem = 500 1024000 200 4096
kernel.sysrq = 1
kernel.core_uses_pid = 1
kernel.msgmnb = 65536
kernel.msgmax = 65536
kernel.msgmni = 2048
net.ipv4.tcp_syncookies = 1
net.ipv4.conf.default.accept_source_route = 0
net.ipv4.tcp_tw_recycle = 1
net.ipv4.tcp_max_syn_backlog = 4096
net.ipv4.conf.all.arp_filter = 1
net.ipv4.ip_local_port_range = 10000 65535
net.core.netdev_max_backlog = 10000
net.core.rmem_max = 2097152
net.core.wmem_max = 2097152
vm.overcommit_memory = 2
vm.swappiness = 10
vm.dirty_expire_centisecs = 500
vm.dirty_writeback_centisecs = 100
vm.dirty_background_ratio = 0
vm.dirty_ratio=0
vm.dirty_background_bytes = 1610612736
vm.dirty_bytes = 4294967296
2.3 修改linux最大限制
[root@hdp1 greenplum-db]$ vi /etc/security/limits.conf
# /etc/security/limits.conf
#
#This file sets the resource limits for the users logged in via PAM.
#It does not affect resource limits of the system services.
#
* soft nofile 65536
* hard nofile 65536
* soft nproc 131072
* hard nproc 131072
2.4 关闭所有节点防火墙

修改主节点/etc/selinux/config文件,其他子节点可以通过scp远程复制

[root@hdp1 greenplum-db]# vi /etc/selinux/config 
# This file controls the state of SELinux on the system.
# SELINUX= can take one of these three values:
#     enforcing - SELinux security policy is enforced.
#     permissive - SELinux prints warnings instead of enforcing.
#     disabled - No SELinux policy is loaded.
SELINUX=disabled
# SELINUXTYPE= can take one of three two values:
#     targeted - Targeted processes are protected,
#     minimum - Modification of targeted policy. Only selected processes are protected.
#     mls - Multi Level Security protection.
SELINUXTYPE=targeted
2.5 复制主节点配置

依次复制到hdp2、hdp3

scp /etc/sysctl.conf hdp2:/etc
scp /etc/security/limits.conf hdp2:/etc/security
scp /etc/selinux/config hdp2:/etc/selinux

最后让节点配置生效,在每个节点依次执行如下命令

[root@hdp1 greenplum-db]# sysctl -p(让配置生效)
2.6 在所有节点上创建gpadmin用户和组
groupadd -g 530 gpadmin
useradd -g 530 -u530 -m -d /home/gpadmin -s /bin/bash gpadmin
chown -R gpadmin:gpadmin /home/gpadmin/
passwd gpadmin
2.7 设置gpadmin用户换机

修改 .bashrc和 .bash_profile 文件,在其最后添加环境配置信息

source /usr/local/greenplum-db/greenplum_path.sh 
export MASTER_DATA_DIRECTORY=/data1/gpdata/master/gpseg-1

3、集群安装

3.1 上传安装文件至主节点

下载greenplum-db-6.16.3安装包,下载地址为:https://github.com/greenplum-db/gpdb/releases/download/6.16.3/open-source-greenplum-db-6.16.3-rhel7-x86_64.rpm,由于是RPM默认安装,其默认安装目录为: /usr/local/greenplum/下,修改该路径gpadmin操作权限:

chown -R gpadmin:gpadmin /usr/local

用户也可通过rpm命令参数自定义安装路径,这里不再赘述。
将rpm安装包文件上传至主节点/home/gpadmin目录下。

3.2 创建配置集群

创建一个hostlist,包含所有节点主机名

su - gpadmin
mkdir -p /home/gpadmin/conf
vi /home/gpadmin/conf/hostlist
 
hdp1
hdp2
hdp3

创建一个 seg_hosts ,包含所有的Segment Host的主机名

vi /home/gpadmin/conf/seg_hosts
 
hdp2
hdp3
3.3 配置ssh免密连接
[root@ gp-master ~]# su - gpadmin
[gpadmin@ gp-master ~]# source /usr/local/greenplum-db/greenplum_path.sh  
[gpadmin@ gp-master ~]# gpssh-exkeys -f /home/gpadmin/conf/hostlist
 
[STEP 1 of 5] create local ID and authorize on local host
  ... /home/gpadmin/.ssh/id_rsa file exists ... key generation skipped
 
[STEP 2 of 5] keyscan all hosts and update known_hosts file
 
[STEP 3 of 5] authorize current user on remote hosts
  ... send to hdp2
  ... send to hdp3
#提示:这里提示输入各个子节点gpadmin用户密码
[STEP 4 of 5] determine common authentication file content
 
[STEP 5 of 5] copy authentication files to all remote hosts
  ... finished key exchange with hdp2
  ... finished key exchange with hdp3
 
[INFO] completed successfully

测试免密连接是否成功:

[root@ gp-master ~]# ssh gp-sdw1  #不需要密码即可登录;

或者

[root@ gp-master ~]# gpssh -f /home/gpadmin/conf/hostlist
 
=> pwd
[hdp1] /home/gpadmin
[hdp3] /home/gpadmin
[hdp2] /home/gpadmin
=> exit

出现上面结果就是成功了。

3.4 Segment节点上安装Greenplum

在各个子节点进行文件夹赋权:

chown -R gpadmin:gpadmin /usr/local
chown -R gpadmin:gpadmin /opt

在主节点打包安装包并复制到各个子节点:

[gpadmin@mdw conf]$ cd /usr/local/
# 打包
[gpadmin@mdw greenplum]$ tar -cf gp.tar greenplum-db-6.16.3/
# 在主节点使用下面命令批量远程复制到seg_hosts列表中的子节点中
[gpadmin@mdw greenplum]$ gpscp -f /home/gpadmin/conf/seg_hosts gp.tar =:/usr/local/

在子节点的相应文件夹查看,之后要将tar包解压,现在我们将采用对子节点使用批量解压操作:

[gpadmin@mdw conf]$ source /usr/local/ greenplum-db/greenplum_path.sh
[gpadmin@mdw conf]$ gpssh -f /home/gpadmin/conf/seg_hosts  #统一处理子节点
 
=> cd /usr/local
[hdp2]
[hdp3]
=> tar -xf gp.tar
[hdp2]
[hdp3]
 
#建立软链接
=> ln -s ./greenplum-db-6.16.3 greenplum-db
[hdp2]
[hdp3]
=> ll(可以使用ll查看一下是否已经安装成功)
=>exit(退出)

这样就完成了所有节点的安装。

4、环境变量配置

4.1 在主节点进行环境变量配置
vi /home/gpadmin/.bash_profile 在最后添加
 
source /usr/local/greenplum-db/greenplum_path.sh
export MASTER_DATA_DIRECTORY=/opt/greenplum/data/master/gpseg-1
export GPPORT=5432
export PGDATABASE=gp_sydb
4.2 然后依次复制到各个子节点
scp /home/gpadmin/.bash_profile hdp2:/home/gpadmin/
scp /home/gpadmin/.bash_profile hdp3:/home/gpadmin/
# 让环境变量生效
source .bash_profile
4.3 初始化前检查连通性
cd /usr/local/greenplum-db/bin
gpcheckperf -f /home/gpadmin/conf/hostlist -r N -d /tmp
 
--  NETPERF TEST
-------------------
 
====================
==  RESULT
====================
Netperf bisection bandwidth test
hdp1 -> hdp2 = 72.220000
hdp2 -> hdp3 = 21.470000
hdp3 -> hdp1 = 43.510000
 
Summary:
sum = 181.40 MB/sec
min = 21.47 MB/sec
max = 72.22 MB/sec
avg = 45.35 MB/sec
median = 44.20 MB/sec

出现以上内容证明各个节点已经可以连通。

4.4 执行初始化

初始化 Greenplum 配置文件模板都在/usr/local/greenplum-db/docs/cli_help/gpconfigs目录下,gpinitsystem_config是初始化 Greenplum 的模板,此模板中 Mirror Segment的配置都被注释;创建一个副本,对其以下配置进行修改:

cp /usr/local/greenplum-db/docs/cli_help/gpconfigs/gpinitsystem_config /home/gpadmin/gpconfigs/gpinitsystem_config
vi /home/gpadmin/gpconfigs/gpinitsystem_config

# FILE NAME: gpinitsystem_config

# Configuration file needed by the gpinitsystem

################################################
#### REQUIRED PARAMETERS
################################################

#### Name of this Greenplum system enclosed in quotes.
#配置的初始化数据库名称
ARRAY_NAME="gp_sydb"

#### Naming convention for utility-generated data directories.
SEG_PREFIX=gp_seg

#### Base number by which primary segment port numbers
#### are calculated.
PORT_BASE=6000

#### File system location(s) where primary segment data directories
#### will be created. The number of locations in the list dictate
#### the number of primary segments that will get created per
#### physical host (if multiple addresses for a host are listed in
#### the hostfile, the number of segments will be spread evenly across
#### the specified interface addresses). 
#资源目录为上面章节创建的资源目录,配置几次资源目录就是每个子节点有几个实例(推荐4-8个,这里配置了6个,primary与mirror文件夹个数对应)
declare -a DATA_DIRECTORY=(/home/gpadmin/data/data_hdp2 /home/gpadmin/data/data_hdp3)

#### OS-configured hostname or IP address of the master host.
#主节点名称
MASTER_HOSTNAME=hdp1

#### File system location where the master data directory
#### will be created.
MASTER_DIRECTORY=/home/gpadmin/data/mdata_hdp1

#### Port number for the master instance.
MASTER_PORT=5432

#### Shell utility used to connect to remote hosts.
TRUSTED_SHELL=ssh

#### Maximum log file segments between automatic WAL checkpoints.
CHECK_POINT_SEGMENTS=8

#### Default server-side character set encoding.
ENCODING=UNICODE

执行初始化

gpinitsystem -c initgp_config -S

【注意】若初始化失败,需要删除/home/gpadmin/data下的数据资源目录重新初始化

5 数据库操作

5.1 停止和启动集群
gpstop -M fast
gpstart -a
5.2 登录数据库
$ psql -d postgres  #进入某个数据库
 
postgres=# \l # 查询数据库
                 List of databases
   Name    |  Owner  | Encoding |  Access privileges  
-----------+---------+----------+---------------------
 gp_sydb   | gpadmin | UTF8     | 
 postgres  | gpadmin | UTF8     | 
 template0 | gpadmin | UTF8     | =c/gpadmin          
                                : gpadmin=CTc/gpadmin
 template1 | gpadmin | UTF8     | =c/gpadmin          
                                : gpadmin=CTc/gpadmin
(4 rows)
postgres=# \i test.sql #执行sql
postgres=# copy 表名 to '/tmp/1.csv' with 'csv';      #快速导出单表数据
postgres=# copy 表名 from '/tmp/1.csv' with 'csv';    #快速导入单表数据
postgres=# \q          #退出数据库
5.2 集群状态
gpstate -e #查看mirror的状态
gpstate -f #查看standby master的状态
gpstate -s #查看整个GP群集的状态
gpstate -i #查看GP的版本
gpstate --help #帮助文档,可以查看gpstate更多用法

目前为止数据库已经操作完毕。默认只有本地可以连数据库,如果需要别的I可以连,需要修改gp_hba.conf文件。

5.3 远程登录

使用pgadmin远程连接 greenplum 时,出现Unable to connect to server: FATAL: no pg_hba.conf entry for host “xxx.xxx.xx.xxx”, user “postgres”, database “postgres” 的问题。

原因

PostgreSQL数据库为了安全,它不会监听除本地以外的所有连接请求,当用户通过jdbc或者客户端软件去远程访问时,就会报出此问题。

客户端访问和认证受到配置文件pg_hub.conf(标准的PostgreSQL基于主机认证文件)的控制。

在Greenplun数据库中,Master实例的pg_hba.conf 文件控制着对Greenplum系统的客户端访问和认证。Segment也有自己的pg_hba.conf 文件,但是它们已经被正确地配置为仅允许来自Master主机的客户端连接。Segment从不接收外部的客户端连接,因此没有修改Segment上的pg_hba.conf 文件。

vi /home/gpadmin/data/mdata_hdp1/gp_seg-1/pg_hba.conf
#
# If you want to allow non-local connections, you need to add more
# "host" records.  In that case you will also need to make PostgreSQL
# listen on a non-local interface via the listen_addresses
# configuration parameter, or via the -i or -h command line switches.
# CAUTION: Configuring the system for local "trust" authentication
# allows any local user to connect as any PostgreSQL user, including
# the database superuser.  If you do not trust all your local users,
# use another authentication method.
# TYPE  DATABASE        USER            ADDRESS                 METHOD
# "local" is for Unix domain socket connections only
# IPv4 local connections:
# IPv6 local connections:
# Allow replication connections from localhost, by a user with the
# replication privilege.
#local   replication     gpadmin                                trust
#host    replication     gpadmin        127.0.0.1/32            trust
#host    replication     gpadmin        ::1/128                 trust
local    all         gpadmin         ident
host     all         gpadmin         127.0.0.1/28    trust
host     all         gpadmin         192.168.101.203/32       trust
host     all         gpadmin         192.168.61.10/32       trust
local    replication gpadmin         ident
host     replication gpadmin         samehost       trust
host     replication gpadmin         192.168.101.203/32       trust
host     replication gpadmin         192.168.61.10/32       trust
# 增加远程连接访问信息
host     all         all             0.0.0.0/0              trust
host     all         gpadmin         192.168.10.42/32       trust

切换到gpadmin,执行gpstop -u
备注:gpstop -u的意思是将pg_hba.conf文件重新上传并运行master中的postgresql.conf 文件的参数,期间不会中断服务。已建立连接的用户将不会使用最新配置,重连后将应用最新。起到了使刚刚修改的pg_hba.conf配置文件生效的作用。

5.4 greenplum常见问题及解决方法
5.4.1 错误:gpadmin-[CRITICAL]:-gpstate failed. (Reason=‘Environment Variable MASTER_DATA_DIRECTORY not set!’) exiting…

解决方法:

[gpadmin@bj-ksy-g1-mongos-01 ~]$ vim ~/.bashrc
# 添加:
MASTER_DATA_DIRECTORY=/opt/data/master/gpseg-1
export MASTER_DATA_DIRECTORY
5.4.2 错误:数据库初始化:gpinitsystem -c gpconfigs/gpinitsystem_config -h list
错误提示:
2018-08-29 16:51:01.338476 CST,,,p21229,th406714176,,,,0,,,seg-999,,,,,"FATAL","XX000","could not create semaphores: No space left on device (pg_sema.c:129)","Failed system call was semget(127, 17, 03600).","This error does *not* mean that you have run out of disk space.
It occurs when either the system limit for the maximum number of semaphore sets (SEMMNI), or the system wide maximum number of semaphores (SEMMNS), would be exceeded.  You need to raise the respective kernel parameter.  Alternatively, reduce PostgreSQL's consumption ofsemaphores by reducing its max_connections parameter (currently 753).
The PostgreSQL documentation contains more information about configuring your system for PostgreSQL.",,,,,,"InternalIpcSemaphoreCreate","pg_sema.c",129,1    0x95661b postgres errstart (elog.c:521)

解决办法:
[root@bj-ksy-g1-mongos-02 primary]# cat /proc/sys/kernel/sem
250 32000   32  128

修改kernel.sem为:
[root@bj-ksy-g1-mongos-02 primary]# cat /etc/sysctl.conf
kernel.sem = 250 512000 100 2048
5.4.3 错误 :执行检查:gpcheck -f list
错误提示:
XFS filesystem on device /dev/vdb1 is missing the recommended mount option 'allocsize=16m'

解决办法:
[gpadmin@bj-ksy-g1-mongos-01 ~]$ cat /etc/fstab
/dev/vdb1 /opt  xfs  defaults,allocsize=16348k,inode64,noatime        1 1
©著作权归作者所有,转载或内容合作请联系作者
  • 序言:七十年代末,一起剥皮案震惊了整个滨河市,随后出现的几起案子,更是在滨河造成了极大的恐慌,老刑警刘岩,带你破解...
    沈念sama阅读 213,186评论 6 492
  • 序言:滨河连续发生了三起死亡事件,死亡现场离奇诡异,居然都是意外死亡,警方通过查阅死者的电脑和手机,发现死者居然都...
    沈念sama阅读 90,858评论 3 387
  • 文/潘晓璐 我一进店门,熙熙楼的掌柜王于贵愁眉苦脸地迎上来,“玉大人,你说我怎么就摊上这事。” “怎么了?”我有些...
    开封第一讲书人阅读 158,620评论 0 348
  • 文/不坏的土叔 我叫张陵,是天一观的道长。 经常有香客问我,道长,这世上最难降的妖魔是什么? 我笑而不...
    开封第一讲书人阅读 56,888评论 1 285
  • 正文 为了忘掉前任,我火速办了婚礼,结果婚礼上,老公的妹妹穿的比我还像新娘。我一直安慰自己,他们只是感情好,可当我...
    茶点故事阅读 66,009评论 6 385
  • 文/花漫 我一把揭开白布。 她就那样静静地躺着,像睡着了一般。 火红的嫁衣衬着肌肤如雪。 梳的纹丝不乱的头发上,一...
    开封第一讲书人阅读 50,149评论 1 291
  • 那天,我揣着相机与录音,去河边找鬼。 笑死,一个胖子当着我的面吹牛,可吹牛的内容都是我干的。 我是一名探鬼主播,决...
    沈念sama阅读 39,204评论 3 412
  • 文/苍兰香墨 我猛地睁开眼,长吁一口气:“原来是场噩梦啊……” “哼!你这毒妇竟也来了?” 一声冷哼从身侧响起,我...
    开封第一讲书人阅读 37,956评论 0 268
  • 序言:老挝万荣一对情侣失踪,失踪者是张志新(化名)和其女友刘颖,没想到半个月后,有当地人在树林里发现了一具尸体,经...
    沈念sama阅读 44,385评论 1 303
  • 正文 独居荒郊野岭守林人离奇死亡,尸身上长有42处带血的脓包…… 初始之章·张勋 以下内容为张勋视角 年9月15日...
    茶点故事阅读 36,698评论 2 327
  • 正文 我和宋清朗相恋三年,在试婚纱的时候发现自己被绿了。 大学时的朋友给我发了我未婚夫和他白月光在一起吃饭的照片。...
    茶点故事阅读 38,863评论 1 341
  • 序言:一个原本活蹦乱跳的男人离奇死亡,死状恐怖,灵堂内的尸体忽然破棺而出,到底是诈尸还是另有隐情,我是刑警宁泽,带...
    沈念sama阅读 34,544评论 4 335
  • 正文 年R本政府宣布,位于F岛的核电站,受9级特大地震影响,放射性物质发生泄漏。R本人自食恶果不足惜,却给世界环境...
    茶点故事阅读 40,185评论 3 317
  • 文/蒙蒙 一、第九天 我趴在偏房一处隐蔽的房顶上张望。 院中可真热闹,春花似锦、人声如沸。这庄子的主人今日做“春日...
    开封第一讲书人阅读 30,899评论 0 21
  • 文/苍兰香墨 我抬头看了看天上的太阳。三九已至,却和暖如春,着一层夹袄步出监牢的瞬间,已是汗流浃背。 一阵脚步声响...
    开封第一讲书人阅读 32,141评论 1 267
  • 我被黑心中介骗来泰国打工, 没想到刚下飞机就差点儿被人妖公主榨干…… 1. 我叫王不留,地道东北人。 一个月前我还...
    沈念sama阅读 46,684评论 2 362
  • 正文 我出身青楼,却偏偏与公主长得像,于是被迫代替她去往敌国和亲。 传闻我的和亲对象是个残疾皇子,可洞房花烛夜当晚...
    茶点故事阅读 43,750评论 2 351

推荐阅读更多精彩内容