CentOS 7 + Hadoop 2.7.7 + HBase 1.4 installation and configuration guide

Posted by Zhang Ying (张映) on 2019-09-23

Category: hadoop/spark/scala


HBase is a distributed, column-oriented open-source database. The technology originates from Fay Chang's Google paper "Bigtable: A Distributed Storage System for Structured Data". Just as Bigtable builds on the distributed data storage provided by the Google File System, HBase provides Bigtable-like capabilities on top of Hadoop. HBase is a subproject of the Apache Hadoop project. Unlike a typical relational database, HBase is suited to storing unstructured data, and it uses a column-based rather than row-based model.

I. HBase cluster layout

Host IP Processes
bigserver1 10.0.40.237 Master、Zookeeper
bigserver2 10.0.40.222 RegionServer、Zookeeper
bigserver3 10.0.40.193 RegionServer
testing 10.0.40.175 Master-backup、RegionServer、Zookeeper

II. Installing and configuring an NTP time server

NTP
The clocks on cluster nodes should be synchronized. A small amount of variation is acceptable, but larger amounts of skew can cause erratic and unexpected behavior. Time synchronization is one of the first things to check if you see unexplained problems in your cluster. It is recommended that you run a Network Time Protocol (NTP) service, or another time-synchronization mechanism on your cluster and that all nodes look to the same service for time synchronization. See the Basic NTP Configuration at The Linux Documentation Project (TLDP) to set up NTP.

As long as the clocks of the servers in the HBase cluster differ only slightly, there is no problem; a large skew leads to unpredictable behavior.

1. Install NTP

# yum install ntp

2. Server-side configuration (10.0.40.237); the other machines are clients that sync time from this server

# cat /etc/ntp.conf |awk '{if($0 !~ /^$/ && $0 !~ /^#/) {print $0}}'
driftfile /var/lib/ntp/drift
restrict default nomodify notrap nopeer noquery
restrict 127.0.0.1
restrict ::1
restrict 10.0.40.0 mask 255.255.255.0 nomodify notrap
server 210.72.145.44 prefer # National Time Service Center of China
server 202.112.10.36 # 1.cn.pool.ntp.org
server 59.124.196.83 # 0.asia.pool.ntp.org
restrict 210.72.145.44 nomodify notrap noquery
restrict 202.112.10.36 nomodify notrap noquery
restrict 59.124.196.83 nomodify notrap noquery
server 127.0.0.1 # local clock
fudge 127.0.0.1 stratum 10
includefile /etc/ntp/crypto/pw
keys /etc/ntp/keys
disable monitor

3. Client configuration

# cat /etc/ntp.conf |awk '{if($0 !~ /^$/ && $0 !~ /^#/) {print $0}}'
driftfile /var/lib/ntp/drift
restrict default nomodify notrap nopeer noquery
restrict 127.0.0.1
restrict ::1
server 10.0.40.237
restrict 10.0.40.237 nomodify notrap noquery
server 127.0.0.1
fudge 127.0.0.1 stratum 10
includefile /etc/ntp/crypto/pw
keys /etc/ntp/keys
disable monitor
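On each client it can help to do a one-off manual sync before starting ntpd, so that the initial clock offset is small enough for the daemon to correct gradually (a sketch; assumes the ntpdate utility shipped with the ntp package, and this article's server address):

```shell
# one-shot sync against the cluster's NTP server (stop ntpd first if it is running)
ntpdate -u 10.0.40.237
```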

4. Start ntpd

# systemctl start ntpd
# systemctl enable ntpd
Created symlink from /etc/systemd/system/multi-user.target.wants/ntpd.service to /usr/lib/systemd/system/ntpd.service.

5. Check status

# ntpq -p
 remote refid st t when poll reach delay offset jitter
==============================================================================
 bigserver1 202.112.10.36 13 u 33 64 0 0.000 0.000 0.000
 localhost .INIT. 16 l - 64 0 0.000 0.000 0.000
# ntpstat
unsynchronised
 time server re-starting
 polling server every 8 s

III. Raising ulimit

Limits on Number of Files and Processes (ulimit)
It is recommended to raise the ulimit to at least 10,000, but more likely 10,240, because the value is usually expressed in multiples of 1024. Each ColumnFamily has at least one StoreFile, and possibly more than six StoreFiles if the region is under load. The number of open files required depends upon the number of ColumnFamilies and the number of regions. The following is a rough formula for calculating the potential number of open files on a RegionServer.
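The rough formula quoted above can be made concrete: open files ≈ (StoreFiles per ColumnFamily) × (ColumnFamilies) × (regions per RegionServer). The numbers below are illustrative assumptions, not measurements from this cluster:

```shell
# estimate of a RegionServer's open-file demand (illustrative numbers)
storefiles_per_cf=3     # StoreFiles per ColumnFamily under load
column_families=2       # ColumnFamilies per table
regions_per_rs=100      # regions hosted by one RegionServer
open_files=$(( storefiles_per_cf * column_families * regions_per_rs ))
echo "estimated open files: $open_files"   # prints 600
```

Even this modest example already approaches the default nofile limit of 1024 on many distributions, which is why the limit is raised below.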

# vim /etc/security/limits.conf

* soft nofile 65535
* hard nofile 65535
* soft nproc 65535
* hard nproc 65535

Notes:
* applies to all users
nproc is the maximum number of processes
nofile is the maximum number of open files

Log in again (or open a new SSH session) for the change to take permanent effect.
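After re-logging in, the new limits can be verified from the shell:

```shell
# both should report 65535 once the limits.conf change has taken effect
ulimit -n    # max open file descriptors (nofile)
ulimit -u    # max user processes (nproc)
```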

IV. HBase installation and configuration

1. Download HBase

# wget http://mirrors.hust.edu.cn/apache/hbase/stable/hbase-1.4.9-bin.tar.gz
# tar zxvf hbase-1.4.9-bin.tar.gz
# cp -r hbase-1.4.9 /bigdata/hbase

2. Environment variables

# vim ~/.bashrc
export JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.191.b12-1.el7_6.x86_64
export HADOOP_HOME=/bigdata/hadoop
export SPARK_HOME=/bigdata/spark
export HIVE_HOME=/bigdata/hive
export ZOOKEEPER_HOME=/bigdata/zookeeper
export HBASE_HOME=/bigdata/hbase
export PATH=$ZOOKEEPER_HOME/bin:$HBASE_HOME/bin:$SPARK_HOME/bin:$HIVE_HOME/bin:/bigdata/hadoop/bin:$SQOOP_HOME/bin:$PATH
export CLASSPATH=$JAVA_HOME/lib:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
export LD_LIBRARY_PATH=/bigdata/hadoop/lib/native/

# source ~/.bashrc

Note: the HBase-related environment variables are identical on all nodes

3. Link hdfs-site.xml into hbase/conf

# ln -s $HADOOP_HOME/etc/hadoop/hdfs-site.xml $HBASE_HOME/conf/hdfs-site.xml

Note: link (or copy) $HADOOP_HOME/etc/hadoop/hdfs-site.xml into $HBASE_HOME/conf so that the HDFS settings seen by HBase stay consistent with HDFS itself

4. Configure hbase-site.xml

# vim $HBASE_HOME/conf/hbase-site.xml

<configuration>
    <property>
        <name>hbase.zookeeper.property.clientPort</name>
        <value>2181</value>
    </property>
    <property>
        <name>hbase.zookeeper.quorum</name>
        <value>bigserver1,bigserver2,testing</value>
        <description>Comma-separated list of the servers in the ZooKeeper quorum.
        </description>
    </property>
    <property>
        <name>hbase.zookeeper.property.dataDir</name>
        <value>/bigdata/zookeeper/data</value>
        <description>
        Property from the ZooKeeper config zoo.cfg; must match the dataDir configured in zoo.cfg.
 The directory where the snapshot is stored.
        </description>
    </property>
    <property>
        <name>hbase.rootdir</name>
        <value>hdfs://bigdata1/hbase</value>
        <description>The directory shared by RegionServers.
 The official docs stress repeatedly that this directory must NOT be created in advance; HBase creates it itself. If it already exists, HBase treats it as a migration and errors out.
        </description>
    </property>
    <property>
        <name>hbase.cluster.distributed</name>
        <value>true</value>
        <description>Cluster mode: set to true for a distributed cluster, false for a single-node setup.
 The mode the cluster will be in. Possible values are
 false: standalone and pseudo-distributed setups with managed ZooKeeper
 true: fully-distributed with unmanaged ZooKeeper Quorum (see hbase-env.sh)
        </description>
    </property>
</configuration>
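A malformed hbase-site.xml is a common cause of startup failures. Before distributing the file, a quick well-formedness check can catch typos; the sketch below writes a minimal sample to /tmp and parses it with Python's standard library (assumes python3 is available; to check the real file, point the parser at $HBASE_HOME/conf/hbase-site.xml instead):

```shell
# write a minimal hbase-site.xml-style sample, then verify it is well-formed XML
cat > /tmp/hbase-site-check.xml <<'EOF'
<configuration>
  <property>
    <name>hbase.cluster.distributed</name>
    <value>true</value>
  </property>
</configuration>
EOF
python3 -c 'import xml.etree.ElementTree as ET; ET.parse("/tmp/hbase-site-check.xml")' \
  && echo "XML OK"
```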

5. Configure the regionservers file

# vim $HBASE_HOME/conf/regionservers
bigserver1
bigserver2
bigserver3
testing

6. Configure the backup-masters file (standby master nodes)

# vim $HBASE_HOME/conf/backup-masters
testing

Note: HBase supports running multiple master nodes, so there is no single point of failure; however, only one master is active at a time and the rest serve as backup masters. Edit $HBASE_HOME/conf/backup-masters to list the hostnames of the standby masters.

7. Configure hbase-env.sh

# vim $HBASE_HOME/conf/hbase-env.sh

export HBASE_MANAGES_ZK=false  # do not use HBase's bundled ZooKeeper
# Configure PermSize. Only needed in JDK7. You can safely remove it for JDK8+   # on JDK 8, keep the two lines below commented out
#export HBASE_MASTER_OPTS="$HBASE_MASTER_OPTS -XX:PermSize=128m -XX:MaxPermSize=128m -XX:ReservedCodeCacheSize=256m"
#export HBASE_REGIONSERVER_OPTS="$HBASE_REGIONSERVER_OPTS -XX:PermSize=128m -XX:MaxPermSize=128m -XX:ReservedCodeCacheSize=256m"

8. Use scp to sync the configuration files to every machine in the HBase cluster

9. Start HBase

# start-hbase.sh
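start-hbase.sh assumes ZooKeeper and HDFS are already up. A typical start order for the stack described in this article (a sketch; assumes the ZooKeeper and Hadoop scripts are on PATH per the environment variables above):

```shell
# start order: ZooKeeper quorum first, then HDFS, then HBase last
zkServer.sh start     # on bigserver1, bigserver2, testing
start-dfs.sh          # on the HDFS master
start-hbase.sh        # on the HBase master
# shut down in the reverse order: stop-hbase.sh, then stop-dfs.sh, then zkServer.sh stop
```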

V. Testing HBase

1. Check the processes running on each node

bigserver1:

30753 HMaster
22820 NameNode
27380 QuorumPeerMain
30884 HRegionServer
31062 Jps
6567 ResourceManager
1946 JournalNode
23194 DFSZKFailoverController

bigserver2:

1235 Kafka
17172 HRegionServer
17352 Jps
4314 NodeManager
4155 DataNode
4237 JournalNode
1119 QuorumPeerMain

bigserver3:

10579 HRegionServer
10087 NodeManager
10778 Jps
19788 DataNode

testing:

1521 DFSZKFailoverController
1124 NameNode
9508 HMaster
9876 Jps
32293 QuorumPeerMain
9381 HRegionServer
32347 Kafka
1612 ResourceManager
1375 JournalNode

2. Create a table

# hbase shell

[Screenshot: creating and querying a table in the hbase shell]
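A minimal end-to-end check in the hbase shell might look like the following (the table name 'test' and column family 'cf' are made-up examples):

```
hbase(main):001:0> create 'test', 'cf'
hbase(main):002:0> put 'test', 'row1', 'cf:a', 'value1'
hbase(main):003:0> scan 'test'
hbase(main):004:0> status
```

If the put and scan succeed, data is flowing through a RegionServer into HDFS and the cluster is working.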

3. Check the web UI

[Screenshot: HBase web UI]



Please credit the source when reposting
Author: 海底苍鹰
URL: http://blog.51yip.com/hadoop/2182.html