Installing and Uninstalling Greenplum MADlib

2023-03-10 00:00:00 Database Version Installation Package Uninstall

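Apache MADlib is an open-source library for scalable in-database analytics. The Greenplum MADlib extension makes it possible to run machine learning and deep learning workloads inside Greenplum Database.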


1. Install MADlib

1.1 Install the MADlib package

Download the appropriate version of the MADlib extension package from VMware Tanzu.
Upload the package to the Greenplum master host.
Extract the archive:
$ tar xzvf madlib-1.18.0+2-gp6-rhel7-x86_64.tar.gz

Install the package by running the gppkg command. For example:
[gpadmin@gpmdw opt]$ gppkg -i ./madlib-1.18.0+2-gp6-rhel7-x86_64/madlib-1.18.0+2-gp6-rhel7-x86_64.gppkg
20220209:10:16:19:002504 gppkg:gpmdw:gpadmin-[INFO]:-Starting gppkg with args: -i ./madlib-1.18.0+2-gp6-rhel7-x86_64/madlib-1.18.0+2-gp6-rhel7-x86_64.gppkg
20220209:10:16:19:002504 gppkg:gpmdw:gpadmin-[INFO]:-Installing package madlib-1.18.0+2-gp6-rhel7-x86_64.gppkg
20220209:10:16:19:002504 gppkg:gpmdw:gpadmin-[INFO]:-Validating rpm installation cmdStr='rpm --test -i /usr/local/greenplum-db-6.19.1/.tmp/madlib-1.18.0-1.x86_64.rpm --dbpath /usr/local/greenplum-db-6.19.1/share/packages/database --prefix /usr/local/greenplum-db-6.19.1'
20220209:10:16:20:002504 gppkg:gpmdw:gpadmin-[INFO]:-Installing madlib-1.18.0+2-gp6-rhel7-x86_64.gppkg locally
20220209:10:16:20:002504 gppkg:gpmdw:gpadmin-[INFO]:-Validating rpm installation cmdStr='rpm --test -i /usr/local/greenplum-db-6.19.1/.tmp/madlib-1.18.0-1.x86_64.rpm --dbpath /usr/local/greenplum-db-6.19.1/share/packages/database --prefix /usr/local/greenplum-db-6.19.1'
20220209:10:16:20:002504 gppkg:gpmdw:gpadmin-[INFO]:-Installing rpms cmdStr='rpm -i --force /usr/local/greenplum-db-6.19.1/.tmp/madlib-1.18.0-1.x86_64.rpm --dbpath /usr/local/greenplum-db-6.19.1/share/packages/database --prefix=/usr/local/greenplum-db-6.19.1'
20220209:10:16:21:002504 gppkg:gpmdw:gpadmin-[INFO]:-Completed local installation of madlib-1.18.0+2-gp6-rhel7-x86_64.gppkg.
20220209:10:16:21:002504 gppkg:gpmdw:gpadmin-[INFO]:-Please run the following command to deploy MADlib
usage: madpack install [-s schema_name] -p greenplum -c user@host:port/database
Example:
$ $GPHOME/madlib/bin/madpack install -s madlib -p greenplum -c gpadmin@mdw:5432/testdb
This will install MADlib objects into a Greenplum database named "testdb"
running on server "mdw" on port 5432. Installer will try to login as "gpadmin"
and will prompt for password. The target schema will be "madlib".
To upgrade to a new version of MADlib from version v1.0 or later, use option "upgrade",
instead of "install"
For additional options run:
$ madpack --help
Release notes and additional documentation can be found at http://madlib.apache.org
20220209:10:16:21:002504 gppkg:gpmdw:gpadmin-[INFO]:-madlib-1.18.0+2-gp6-rhel7-x86_64.gppkg successfully installed.
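
The extract and install steps above can be sketched as one small script. This is only an illustration: the helper name gppkg_path_for is made up here, and it assumes the archive unpacks into a directory named after the tarball, as it does in the transcript above.

```shell
# Hypothetical helper: map a tarball name to the .gppkg path inside it,
# assuming the archive unpacks into a directory named after the tarball.
gppkg_path_for() {
    base=$(basename "$1" .tar.gz)
    printf './%s/%s.gppkg\n' "$base" "$base"
}

tarball=madlib-1.18.0+2-gp6-rhel7-x86_64.tar.gz
pkg=$(gppkg_path_for "$tarball")
echo "$pkg"

# Only extract and install when the tarball and gppkg are actually present.
if [ -f "$tarball" ] && command -v gppkg >/dev/null 2>&1; then
    tar xzf "$tarball" && gppkg -i "$pkg"
fi
```

On a host without the tarball or without gppkg on the PATH, the script only prints the derived package path and installs nothing.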

2. Add MADlib functions to a database

After installing the MADlib package, run the madpack command to add the MADlib functions to a Greenplum database. madpack is located in the $GPHOME/madlib/bin directory.

[gpadmin@gpmdw ~]$ $GPHOME/madlib/bin/madpack -s madlib -p greenplum -c gpadmin@gpmdw:5432/test install
madpack.py: INFO : Detected Greenplum DB version 6.19.1.
madpack.py: INFO : *** Installing MADlib ***
madpack.py: INFO : MADlib tools version = 1.18.0 (/usr/local/greenplum-db-6.19.1/madlib/Versions/1.18.0/bin/../madpack/madpack.py)
madpack.py: INFO : MADlib database version = None (host=gpmdw:5432, db=postgres, schema=madlib)
madpack.py: INFO : Testing PL/Python environment...
madpack.py: INFO : > PL/Python environment OK (version: 2.7.12)
madpack.py: INFO : > Preparing objects for the following modules:
madpack.py: INFO : > - array_ops
madpack.py: ERROR : Failed executing m4 on /usr/local/greenplum-db/madlib/Versions/1.18.0/ports/greenplum/modules/array_ops/array_ops.sql_in
madpack.py: ERROR : Building database objects failed. Before retrying: drop madlib schema OR install MADlib into a different schema.

Note: if the installation fails with the error shown above, the m4 dependency is missing. Install m4 as the root user first.

[root@gpmdw ~]$ yum install m4 -y
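
To catch the missing dependency before madpack fails, a small preflight check can confirm that m4 is on the PATH. This is a minimal sketch; the function name check_m4 is hypothetical and not part of Greenplum or MADlib.

```shell
# Preflight sketch: report whether m4 is available before running madpack.
check_m4() {
    if command -v m4 >/dev/null 2>&1; then
        echo "m4 found: $(command -v m4)"
    else
        echo "m4 not found; install it as root (for example: yum install m4 -y)" >&2
        return 1
    fi
}

check_m4 || echo "madpack install is likely to fail at the m4 step"
```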

Once m4 is installed, switch back to the gpadmin user and run the installation again:

[gpadmin@gpmdw ~]$ $GPHOME/madlib/bin/madpack install -s madlib -p greenplum -c gpadmin@gpmdw:5432/test
madpack.py: INFO : Detected Greenplum DB version 6.19.1.
madpack.py: INFO : *** Installing MADlib ***
madpack.py: INFO : MADlib tools version = 1.18.0 (/usr/local/greenplum-db-6.19.1/madlib/Versions/1.18.0/bin/../madpack/madpack.py)
madpack.py: INFO : MADlib database version = None (host=gpmdw:5432, db=test, schema=madlib)
madpack.py: INFO : Testing PL/Python environment...
madpack.py: INFO : > PL/Python environment OK (version: 2.7.12)
madpack.py: INFO : > Preparing objects for the following modules:
madpack.py: INFO : > - array_ops
madpack.py: INFO : > - bayes
madpack.py: INFO : > - crf
madpack.py: INFO : > - elastic_net
madpack.py: INFO : > - linalg
madpack.py: INFO : > - pmml
madpack.py: INFO : > - prob
madpack.py: INFO : > - sketch
madpack.py: INFO : > - svec
madpack.py: INFO : > - svm
madpack.py: INFO : > - tsa
madpack.py: INFO : > - stemmer
madpack.py: INFO : > - conjugate_gradient
madpack.py: INFO : > - knn
madpack.py: INFO : > - lda
madpack.py: INFO : > - stats
madpack.py: INFO : > - svec_util
madpack.py: INFO : > - utilities
madpack.py: INFO : > - assoc_rules
madpack.py: INFO : > - convex
madpack.py: INFO : > - dbscan
madpack.py: INFO : > - deep_learning
madpack.py: INFO : > - glm
madpack.py: INFO : > - graph
madpack.py: INFO : > - linear_systems
madpack.py: INFO : > - recursive_partitioning
madpack.py: INFO : > - regress
madpack.py: INFO : > - sample
madpack.py: INFO : > - summary
madpack.py: INFO : > - kmeans
madpack.py: INFO : > - pca
madpack.py: INFO : > - validation
madpack.py: INFO : Installing MADlib:
madpack.py: INFO : > Created madlib schema
madpack.py: INFO : > Created madlib.MigrationHistory table
madpack.py: INFO : > Wrote version info in MigrationHistory table
madpack.py: INFO : MADlib 1.18.0 installed successfully in madlib schema.

MADlib has now been added to the test database.
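
As a follow-up check, madpack also provides the version and install-check commands, which can be pointed at the same database. The sketch below reuses the host, port, and database from this walkthrough; the helper madpack_conn and the fallback path /usr/local/greenplum-db are assumptions made for illustration.

```shell
# Hypothetical helper: assemble madpack's -c argument (user@host:port/database).
madpack_conn() {
    printf '%s@%s:%s/%s\n' "$1" "$2" "$3" "$4"
}

conn=$(madpack_conn gpadmin gpmdw 5432 test)
echo "$conn"   # gpadmin@gpmdw:5432/test

# Assumed location; falls back to /usr/local/greenplum-db when GPHOME is unset.
madpack="${GPHOME:-/usr/local/greenplum-db}/madlib/bin/madpack"

if [ -x "$madpack" ]; then
    # Report the MADlib version recorded in the target database.
    "$madpack" version -s madlib -p greenplum -c "$conn"
    # Run the bundled sanity tests against the installed modules.
    "$madpack" install-check -s madlib -p greenplum -c "$conn"
fi
```

install-check exercises every installed module, so it can take a while on a busy cluster.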

3. Uninstall MADlib

As the gpadmin user, confirm that Greenplum is running normally.

[gpadmin@gpmdw ~]$ gpstate
20220209:10:01:46:001062 gpstate:gpmdw:gpadmin-[INFO]:-Starting gpstate with args:
20220209:10:01:46:001062 gpstate:gpmdw:gpadmin-[INFO]:-local Greenplum Version: 'postgres (Greenplum Database) 6.19.1 build commit:0e314744a460630073b46cea7b7cf20a81e3da63'
20220209:10:01:46:001062 gpstate:gpmdw:gpadmin-[INFO]:-master Greenplum Version: 'PostgreSQL 9.4.26 (Greenplum Database 6.19.1 build commit:0e314744a460630073b46cea7b7cf20a81e3da63) on x86_64-unknown-linux-gnu, compiled by gcc (GCC) 6.4.0, 64-bit compiled on Jan 18 2022 13:41:23'
20220209:10:01:46:001062 gpstate:gpmdw:gpadmin-[INFO]:-Obtaining Segment details from master...
20220209:10:01:46:001062 gpstate:gpmdw:gpadmin-[INFO]:-Gathering data from segments...
20220209:10:01:46:001062 gpstate:gpmdw:gpadmin-[INFO]:-Greenplum instance status summary
20220209:10:01:46:001062 gpstate:gpmdw:gpadmin-[INFO]:-----------------------------------------------------
20220209:10:01:46:001062 gpstate:gpmdw:gpadmin-[INFO]:- Master instance = Active
20220209:10:01:46:001062 gpstate:gpmdw:gpadmin-[INFO]:- Master standby = No master standby configured
20220209:10:01:46:001062 gpstate:gpmdw:gpadmin-[INFO]:- Total segment instance count from metadata = 12
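
The objects that madpack created in a database live in the target schema (madlib in this article) and are dropped with madpack uninstall, separately from the package itself. A minimal sketch, assuming the same connection values used earlier and a fallback path of /usr/local/greenplum-db when GPHOME is unset:

```shell
# Assumed values, taken from the examples in this article.
schema=madlib
conn="gpadmin@gpmdw:5432/test"
madpack="${GPHOME:-/usr/local/greenplum-db}/madlib/bin/madpack"

if [ -x "$madpack" ]; then
    # Drops the schema and the MADlib objects in it; expect a confirmation prompt.
    "$madpack" uninstall -s "$schema" -p greenplum -c "$conn"
else
    echo "madpack not found at $madpack; nothing to uninstall from here"
fi
```

Repeat the command for each database that had MADlib installed, changing the database name at the end of the connection string.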

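If no database uses the MADlib functions, use the Greenplum gppkg utility with the -r option to uninstall the MADlib package. When removing a package, you must specify the package name and version. This example uninstalls version 1.18.0 of the MADlib package: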

[gpadmin@gpmdw ~]$ gppkg -r madlib-1.18.0+2-gp6-rhel7-x86_64
20220209:10:02:15:000737 gppkg:gpmdw:gpadmin-[INFO]:-Starting gppkg with args: -r madlib-1.18.0+2-gp6-rhel7-x86_64
20220209:10:02:15:000737 gppkg:gpmdw:gpadmin-[INFO]:-Uninstalling package madlib-1.18.0+2-gp6-rhel7-x86_64.gppkg
20220209:10:02:15:000737 gppkg:gpmdw:gpadmin-[INFO]:-Validating rpm uninstallation cmdStr='rpm --test -e madlib-1.18.0-1 --dbpath /usr/local/greenplum-db-6.19.1/share/packages/database'
20220209:10:02:16:000737 gppkg:gpmdw:gpadmin-[INFO]:-Validating rpm uninstallation cmdStr='rpm --test -e madlib-1.18.0-1 --dbpath /usr/local/greenplum-db-6.19.1/share/packages/database'
20220209:10:02:16:000737 gppkg:gpmdw:gpadmin-[INFO]:-Uninstalling rpms cmdStr='rpm -e madlib-1.18.0-1 --dbpath /usr/local/greenplum-db-6.19.1/share/packages/database'
20220209:10:02:16:000737 gppkg:gpmdw:gpadmin-[INFO]:-Completed local uninstallation of madlib-1.18.0+2-gp6-rhel7-x86_64.gppkg.
20220209:10:02:16:000737 gppkg:gpmdw:gpadmin-[INFO]:-madlib-1.18.0+2-gp6-rhel7-x86_64.gppkg successfully uninstalled.

After the uninstall completes, restart the database:

[gpadmin@gpmdw ~]$ gpstop -r
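
To confirm the package is gone, the installed packages can be listed with gppkg -q --all and searched for MADlib. The helper madlib_absent below is a hypothetical name, and the check simply looks for the string "madlib" in the listing rather than relying on an exact output format.

```shell
# Succeeds when the given package listing does not mention madlib.
madlib_absent() {
    ! printf '%s\n' "$1" | grep -qi 'madlib'
}

# Query the installed packages only where gppkg is available.
if command -v gppkg >/dev/null 2>&1; then
    listing=$(gppkg -q --all)
    if madlib_absent "$listing"; then
        echo "MADlib package is no longer listed"
    else
        echo "MADlib package is still listed"
    fi
fi
```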

深圳市金鑫泉科技有限公司 is a global Greenplum partner and offers high-quality, efficient services. Feel free to get in touch.
Source: https://blog.csdn.net/SZJXQ2021/article/details/123628512
