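Analysis of EsgynDB Failing to Start After a CDH Upgrade
Symptom
At one customer site, EsgynDB had recently been failing to start on a regular basis. CDH Manager showed the Hadoop cluster as healthy, but running hbcheck as the trafodion user reported an abnormal HBase status.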
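Cause
Running hbcheck as the trafodion user reported HBase as Unavailable, even though CDH Manager showed it as healthy. Running hbase shell to see whether HBase commands worked at all then revealed that the hbase executable could not be found.
Inspecting the HBase and Hadoop executables showed that the symlinks behind /usr/bin/hbase, /usr/bin/hadoop, and related commands pointed into /opt/cloudera/parcels/CDH-5.6.0-1.cdh5.6.0.p0.45, a directory that no longer existed. The CDH in this environment had been upgraded from CDH 5.6 to CDH 5.7, and the upgrade left these symlinks dangling.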
ll /usr/bin/hbase
ll /usr/bin/hadoop
[root@cdh1 ~]# ll /etc/alternatives/hbase
lrwxrwxrwx 1 root root 59 May 9 09:40 /etc/alternatives/hbase -> /opt/cloudera/parcels/CDH-5.6.0-1.cdh5.6.0.p0.45/bin/hbase
[root@cdh1 ~]# ll /etc/alternatives/hadoop
lrwxrwxrwx 1 root root 59 May 9 09:40 /etc/alternatives/hadoop -> /opt/cloudera/parcels/CDH-5.6.0-1.cdh5.6.0.p0.45/bin/hadoop
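The manual ll checks above can be generalized into a quick scan for dangling symlinks. The following is a minimal sketch, not part of the original troubleshooting session: find_dangling_links is a hypothetical helper name, and it only inspects entries directly under the directories it is given.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from the original session):
# print every symlink directly under the given directories
# whose target no longer exists.
find_dangling_links() {
    local dir link
    for dir in "$@"; do
        for link in "$dir"/*; do
            # -L: the path is a symlink; ! -e: its target cannot be resolved
            if [ -L "$link" ] && [ ! -e "$link" ]; then
                printf '%s -> %s\n' "$link" "$(readlink "$link")"
            fi
        done
    done
}
```

For the case described here it could be called as find_dangling_links /usr/bin /etc/alternatives, which should list every link that still points into the removed CDH 5.6 parcel directory.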
Solution
- Repoint the affected symlinks to the new parcel directory
rm -rf /etc/alternatives/hbase
rm -rf /usr/bin/hadoop
rm -rf /etc/alternatives/zookeeper-client
rm -rf /etc/alternatives/zookeeper-server
rm -rf /etc/alternatives/zookeeper-server-cleanup
rm -rf /etc/alternatives/zookeeper-server-initialize
ln -s /opt/cloudera/parcels/CDH-5.7.6-1.cdh5.7.6.p0.6/bin/hbase /etc/alternatives/hbase
ln -s /opt/cloudera/parcels/CDH-5.7.6-1.cdh5.7.6.p0.6/bin/hadoop /usr/bin/hadoop
ln -s /opt/cloudera/parcels/CDH-5.7.6-1.cdh5.7.6.p0.6/bin/zookeeper-client /etc/alternatives/zookeeper-client
ln -s /opt/cloudera/parcels/CDH-5.7.6-1.cdh5.7.6.p0.6/etc/zookeeper/conf.dist /etc/alternatives/zookeeper-conf
ln -s /opt/cloudera/parcels/CDH-5.7.6-1.cdh5.7.6.p0.6/bin/zookeeper-server /etc/alternatives/zookeeper-server
ln -s /opt/cloudera/parcels/CDH-5.7.6-1.cdh5.7.6.p0.6/bin/zookeeper-server-cleanup /etc/alternatives/zookeeper-server-cleanup
ln -s /opt/cloudera/parcels/CDH-5.7.6-1.cdh5.7.6.p0.6/bin/zookeeper-server-initialize /etc/alternatives/zookeeper-server-initialize
- As the trafodion user, rerun sqgen to regenerate the environment variables
pdsh $MY_NODES "cds; mv sqconfig.db sqconfig.db.bak"
sqgen
- Run hbcheck again to verify
hbcheck
[trafodion@cdh1 ~]$ hbcheck
Stderr being written to the file: /home/trafodion/esgynDB_server-2.4.1/logs/hbcheck.log
ZooKeeper Quorum: cdh3.cluster.cm,cdh2.cluster.cm,cdh1.cluster.cm, ZooKeeper Port : 2181
HBase is available!
HBase version: 1.2.0-cdh5.7.6
HMaster: cdh1.cluster.cm,60000,1557366248622
Number of RegionServers available:3
RegionServer #1: cdh1.cluster.cm,60020,1557366248847
RegionServer #2: cdh2.cluster.cm,60020,1557367662923
RegionServer #3: cdh3.cluster.cm,60020,1557366248416
Number of Dead RegionServers:0
Number of regions: 2429
Number of regions in transition:
Average load: 809.6666666666666
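The rm and ln -s pairs in the first step can also be written as a loop, which is easier to rerun after the next parcel upgrade. This is only a sketch under stated assumptions: relink_alternatives is a hypothetical helper name, and it covers only links whose target lives under the parcel's bin directory. The zookeeper-conf link (which points at etc/zookeeper/conf.dist) and the hadoop link created directly in /usr/bin would still need to be handled separately, as in the commands above.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from the original session).
# Usage: relink_alternatives PARCEL_DIR ALT_DIR NAME...
# Points ALT_DIR/NAME at PARCEL_DIR/bin/NAME for every NAME given.
relink_alternatives() {
    local parcel_dir=$1 alt_dir=$2 name
    shift 2
    for name in "$@"; do
        # Never create a link to a target that does not exist
        if [ ! -e "$parcel_dir/bin/$name" ]; then
            echo "skip $name: $parcel_dir/bin/$name not found" >&2
            continue
        fi
        # -s: symbolic link; -f: replace an existing link;
        # -n: do not descend into an existing link that points to a directory
        ln -sfn "$parcel_dir/bin/$name" "$alt_dir/$name"
    done
}
```

With the paths from this environment, a call might look like relink_alternatives /opt/cloudera/parcels/CDH-5.7.6-1.cdh5.7.6.p0.6 /etc/alternatives hbase zookeeper-client zookeeper-server zookeeper-server-cleanup zookeeper-server-initialize, run as root. Using ln -sfn replaces the old link in one step, so a separate rm -rf is not needed.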