Cloudera CDH 6.3 Installation and Configuration

Posted by 张映 on 2019-11-28

Category: hadoop/spark/scala


CDH is widely used in China. Even so, whether it is Cloudera Manager + CDH or Ambari + HDP, I advise beginners against starting with them; start from vanilla Apache Hadoop instead. The internet says the vanilla distribution is unstable and complicated to configure.

My Hadoop 2.7.7 deployment is stable, and the ecosystem built on top of it is just as stable: 100% uptime, with no unexpected outages so far.

Building a whole ecosystem on Hadoop is indeed complex. But break the work into pieces and move it one load at a time, ant-style, and it becomes much simpler; along the way you also learn how the components cooperate.

1. Server overview

10.0.40.237 bigserver1   //namenode
10.0.40.222 bigserver2   //datanode
10.0.40.193 bigserver3   //datanode
10.0.40.200 bigserver4   //SecondaryNameNode

Compared with the earlier standalone-Hadoop test cluster, the CDH cluster differs by one machine: 175 was replaced with 200.

2. Set hostnames (all nodes)

# hostname bigserver1 

# vim /etc/hosts //add the following, identical on every node

10.0.40.237 bigserver1
10.0.40.222 bigserver2
10.0.40.193 bigserver3
10.0.40.200 bigserver4
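The four entries above can be appended idempotently with a small helper (a sketch; `add_cluster_hosts` and its file argument are my own naming, not from the original post):

```shell
#!/usr/bin/env bash
# add_cluster_hosts: append the cluster's host entries to a hosts file,
# skipping any hostname that is already present, so re-running is safe.
# $1: path to the hosts file (use /etc/hosts in production).
add_cluster_hosts() {
  local hosts_file=$1 entry
  while read -r entry; do
    # the last field of each entry is the hostname; -w matches it as a whole word
    grep -qw "${entry##* }" "$hosts_file" || echo "$entry" >> "$hosts_file"
  done <<'EOF'
10.0.40.237 bigserver1
10.0.40.222 bigserver2
10.0.40.193 bigserver3
10.0.40.200 bigserver4
EOF
}

# Usage: add_cluster_hosts /etc/hosts
```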

3. Disable the firewall and SELinux (all nodes)

# systemctl stop firewalld //stop it now
# systemctl disable firewalld //keep it from starting at boot

# vim /etc/sysconfig/selinux
SELINUX=disabled //set to disabled

After these changes it is best to reboot every node.

4. Passwordless SSH login (all nodes)

# ssh-keygen -t rsa 

# ssh-copy-id -i ~/.ssh/id_rsa.pub root@10.0.40.222 -p 22
# ssh-copy-id -i ~/.ssh/id_rsa.pub root@10.0.40.193 -p 22
# ssh-copy-id -i ~/.ssh/id_rsa.pub root@10.0.40.200 -p 22 

# scp ~/.ssh/id_rsa root@10.0.40.222:/root/.ssh/
# scp ~/.ssh/id_rsa root@10.0.40.193:/root/.ssh/
# scp ~/.ssh/id_rsa root@10.0.40.200:/root/.ssh/ 

After logging in to 222, 193 and 200:
# cd ~/.ssh/
# chmod 600 id_rsa

Note: log in between the hosts once with the key first, so that known_hosts gets populated.
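Before moving on, a quick loop can confirm that key-based login actually works to every node (a sketch; `check_ssh` is a hypothetical helper, and the `SSH_CMD` override exists only so the loop can be dry-run):

```shell
#!/usr/bin/env bash
# check_ssh: run a no-op command on each node over SSH in batch mode,
# so a password prompt counts as failure instead of hanging.
# SSH_CMD is overridable for dry runs; defaults to real ssh.
check_ssh() {
  local ssh_cmd=${SSH_CMD:-"ssh -o BatchMode=yes -o ConnectTimeout=5"}
  local node failed=0
  for node in "$@"; do
    # $ssh_cmd is deliberately unquoted so its options word-split
    if $ssh_cmd "root@$node" true 2>/dev/null; then
      echo "OK   $node"
    else
      echo "FAIL $node"
      failed=1
    fi
  done
  return $failed
}

# Usage (from bigserver1): check_ssh bigserver2 bigserver3 bigserver4
```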

5. Configure an NTP time server (all nodes)

See: CentOS 7 Hadoop 2.7.7 / HBase 1.4 installation guide

6. Install the JDK (all nodes)

# yum install epel-release   //this repo carries a lot of extra packages
# yum install java-1.8.0-openjdk java-1.8.0-openjdk-devel

Oracle JDK is recommended; I used OpenJDK only because my earlier standalone Hadoop install already ran on it.

Cloudera Enterprise Version      Supported Oracle JDK    Supported OpenJDK
5.3 - 5.15                       1.7, 1.8                none
5.16 and higher 5.x releases     1.7, 1.8                1.8
6.0                              1.8                     none
6.1                              1.8                     1.8
6.2                              1.8                     1.8
6.3                              1.8                     1.8, 11.0.3 or higher

7. CDH 6.3.1 component versions

Component            Version
Apache Avro 1.8.2
Apache Flume 1.9.0
Apache Hadoop 3.0.0
Apache HBase 2.1.4
HBase Indexer 1.5
Apache Hive 2.1.1
Hue 4.3.0
Apache Impala 3.2.0
Apache Kafka 2.2.1
Kite SDK 1.0.0
Apache Kudu 1.10.0
Apache Solr 7.4.0
Apache Oozie 5.1.0
Apache Parquet 1.9.0
Parquet-format 2.4.0
Apache Pig 0.17.0
Apache Sentry 2.1.0
Apache Spark 2.4.0
Apache Sqoop 1.4.7
Apache ZooKeeper 3.4.5

8. Configure the offline Cloudera repo and install cloudera-manager (bigserver1)

1. Install nginx and createrepo

# yum install nginx createrepo
# mkdir -p /var/www/html/cloudera-repos  //directory for the Cloudera RPMs
# vim /etc/nginx/conf.d/cloudera.conf  //nginx config
server
{
 listen 80;
 server_name bigserver1;
 root /var/www/html;
 autoindex on;
 autoindex_exact_size off;
 autoindex_localtime on;
 charset utf-8;
}

# systemctl restart nginx  //restart nginx

2. Download cloudera-manager

https://archive.cloudera.com/cm6/6.3.1/redhat7/yum/RPMS/x86_64/

allkeys.asc
cloudera-manager-agent-6.3.1-1466458.el7.x86_64.rpm
cloudera-manager-daemons-6.3.1-1466458.el7.x86_64.rpm
cloudera-manager-server-6.3.1-1466458.el7.x86_64.rpm
cloudera-manager-server-db-2-6.3.1-1466458.el7.x86_64.rpm
oracle-j2sdk1.8-1.8.0+update181-1.x86_64.rpm

oracle-j2sdk1.8-1.8.0 is downloaded mainly to keep the Java version uniform across nodes; it can then be installed directly with yum.

Don't forget allkeys.asc, or a later step will fail. (I didn't save the exact error message.)

3. Create the local offline Cloudera repo

# mv *.rpm /var/www/html/cloudera-repos   //move the rpm packages into the repo directory
# mv allkeys.asc /var/www/html/cloudera-repos
# cd /var/www/html/cloudera-repos && createrepo . //build the repo metadata; note the trailing dot
Spawning worker 0 with 2 pkgs
Spawning worker 1 with 2 pkgs
Workers Finished
Saving Primary metadata
Saving file lists metadata
Saving other metadata
Generating sqlite DBs
Sqlite DBs complete

# vim /etc/yum.repos.d/cloudera-manager.repo
[cloudera-manager]
name=Cloudera Manager 6.3.1
baseurl=http://bigserver1/cloudera-repos/
gpgcheck=0
enabled=1

There is no need to copy cloudera-manager.repo to the other nodes, nor to install cloudera-manager-agent on them: the web installer will copy and install everything automatically.

4. Install cloudera-manager

# yum clean all && yum makecache
# yum install cloudera-manager-daemons cloudera-manager-agent cloudera-manager-server

Note: if any CDH components were previously installed by hand, remove them first to guarantee a clean install; the later host inspection will then also pass without complaints. Remove any existing CDH component packages:

# rpm -qa |grep impala //list every installed impala package, then remove them

# rpm -e impala-2.12.0+cdh5.16.2+0-1.cdh5.16.2.p0.22.el7.x86_64 --nodeps
# rpm -e impala-udf-devel-2.12.0+cdh5.16.2+0-1.cdh5.16.2.p0.22.el7.x86_64 --nodeps
# rpm -e impala-catalog-2.12.0+cdh5.16.2+0-1.cdh5.16.2.p0.22.el7.x86_64 --nodeps
# rpm -e impala-state-store-2.12.0+cdh5.16.2+0-1.cdh5.16.2.p0.22.el7.x86_64 --nodeps
# rpm -e impala-shell-2.12.0+cdh5.16.2+0-1.cdh5.16.2.p0.22.el7.x86_64 --nodeps

9. Configure the offline parcel repo (bigserver1)

1. Download the parcels

https://archive.cloudera.com/cdh6/6.3.1/parcels/

CDH-6.3.1-1.cdh6.3.1.p0.1470567-el7.parcel
CDH-6.3.1-1.cdh6.3.1.p0.1470567-el7.parcel.sha1
manifest.json

CDH-6.3.1-1.cdh6.3.1.p0.1470567-el7.parcel.sha1 must be renamed to CDH-6.3.1-1.cdh6.3.1.p0.1470567-el7.parcel.sha.
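After renaming, it is worth checking that the parcel's SHA-1 actually matches the digest file before Cloudera Manager tries to distribute it (a sketch; `verify_parcel` is my own helper name):

```shell
#!/usr/bin/env bash
# verify_parcel: compare a file's SHA-1 against the digest stored in
# "<file>.sha" (the renamed .sha1 file that Cloudera Manager expects).
# $1: path to the .parcel file.
verify_parcel() {
  local expected actual
  expected=$(awk '{print $1}' "$1.sha")
  actual=$(sha1sum "$1" | awk '{print $1}')
  if [ "$expected" = "$actual" ]; then
    echo "OK: $1"
  else
    echo "MISMATCH: $1 (expected $expected, got $actual)" >&2
    return 1
  fi
}

# Usage: verify_parcel CDH-6.3.1-1.cdh6.3.1.p0.1470567-el7.parcel
```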

2. Build the parcel-repo offline library

# chown cloudera-scm:cloudera-scm -R /opt/cloudera/parcel-repo
# ll
total 2035080
-rw-r--r-- 1 cloudera-scm cloudera-scm 2083878000 Nov 25 12:34 CDH-6.3.1-1.cdh6.3.1.p0.1470567-el7.parcel
-rw-r--r-- 1 cloudera-scm cloudera-scm 40 Oct 11 16:45 CDH-6.3.1-1.cdh6.3.1.p0.1470567-el7.parcel.sha
-rw-r--r-- 1 cloudera-scm cloudera-scm 33887 Oct 11 16:44 manifest.json

Note: /opt/cloudera/parcel-repo is not created by hand; it exists as soon as cloudera-manager-server is installed. During the GUI install its contents are synced to the other nodes automatically.

10. Install MySQL 5.7

1. Install MySQL (bigserver1)

See: Hive MySQL installation guide

Cloudera's officially recommended my.cnf:

# vim /etc/my.cnf
[mysqld]
datadir=/var/lib/mysql
socket=/var/lib/mysql/mysql.sock
transaction-isolation = READ-COMMITTED
# Disabling symbolic-links is recommended to prevent assorted security risks;
# to do so, uncomment this line:
symbolic-links = 0

key_buffer_size = 32M
max_allowed_packet = 32M
thread_stack = 256K
thread_cache_size = 64
query_cache_limit = 8M
query_cache_size = 64M
query_cache_type = 1

max_connections = 550
#expire_logs_days = 10
#max_binlog_size = 100M

#log_bin should be on a disk with enough free space.
#Replace '/var/lib/mysql/mysql_binary_log' with an appropriate path for your
#system and chown the specified folder to the mysql user.
log_bin=/var/lib/mysql/mysql_binary_log

#In later versions of MySQL, if you enable the binary log and do not set
#a server_id, MySQL will not start. The server_id must be unique within
#the replicating group.
server_id=1

binlog_format = mixed

read_buffer_size = 2M
read_rnd_buffer_size = 16M
sort_buffer_size = 8M
join_buffer_size = 8M

# InnoDB settings
innodb_file_per_table = 1
innodb_flush_log_at_trx_commit = 2
innodb_log_buffer_size = 64M
innodb_buffer_pool_size = 4G
innodb_thread_concurrency = 8
innodb_flush_method = O_DIRECT
innodb_log_file_size = 512M

[mysqld_safe]
log-error=/var/log/mysqld.log
pid-file=/var/run/mysqld/mysqld.pid

sql_mode=STRICT_ALL_TABLES

2. Install mysql-connector-java (all nodes)

# wget https://cdn.mysql.com//archives/mysql-connector-java-5.1/mysql-connector-java-5.1.47.tar.gz
# tar zxvf mysql-connector-java-5.1.47.tar.gz
# cd mysql-connector-java-5.1.47
# cp mysql-connector-java-5.1.47-bin.jar /usr/share/java/mysql-connector-java.jar

When installing standalone Hadoop, the mysql-connector-java from yum worked fine, but with CDH 6 it no longer does: the yum version is too old (CentOS 7.4 ships mysql-connector-java-5.1.25-3). Starting hive.metastore later then fails with the error below:

SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
org.apache.hadoop.hive.metastore.HiveMetaException: Failed to retrieve schema tables from Hive Metastore DB,Not supported

3. Create the databases and grant privileges (bigserver1)

CREATE DATABASE scm DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
CREATE DATABASE amon DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
CREATE DATABASE rman DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
CREATE DATABASE hue DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
CREATE DATABASE metastore DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
CREATE DATABASE sentry DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
CREATE DATABASE nav DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
CREATE DATABASE navms DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
CREATE DATABASE oozie DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;

grant all privileges on *.* TO cdh6@'10.%' IDENTIFIED BY 'Cdh6_123';
flush privileges;

Component-to-database mapping:

Component                          Database
Cloudera Manager Server scm
Activity Monitor amon
Reports Manager rman
Hue hue
Hive Metastore Server metastore
Sentry Server sentry
Cloudera Navigator Audit Server nav
Cloudera Navigator Metadata Server navms
Oozie oozie
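The nine CREATE DATABASE statements all follow one pattern, so they can be generated from the table above instead of typed by hand (a sketch; `gen_cdh_databases` is a hypothetical helper):

```shell
#!/usr/bin/env bash
# gen_cdh_databases: emit one CREATE DATABASE statement per CDH service
# database, using the same utf8/utf8_general_ci settings as above.
gen_cdh_databases() {
  local db
  for db in scm amon rman hue metastore sentry nav navms oozie; do
    echo "CREATE DATABASE $db DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;"
  done
}

# Usage: gen_cdh_databases | mysql -u root -p
```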

4. Initialize the database (bigserver1)

# /opt/cloudera/cm/schema/scm_prepare_database.sh -h bigserver1 mysql scm cdh6 Cdh6_123
JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.191.b12-1.el7_6.x86_64
Verifying that we can write to /etc/cloudera-scm-server
Creating SCM configuration file in /etc/cloudera-scm-server
Executing: /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.191.b12-1.el7_6.x86_64/bin/java -cp /usr/share/java/mysql-connector-java.jar:/usr/share/java/oracle-connector-java.jar:/usr/share/java/postgresql-connector-java.jar:/opt/cloudera/cm/schema/../lib/* com.cloudera.enterprise.dbutil.DbCommandExecutor /etc/cloudera-scm-server/db.properties com.cloudera.cmf.db.
[ main] DbCommandExecutor INFO Successfully connected to database.
All done, your SCM database is configured correctly!

5. Start cloudera-scm-server (bigserver1)

# systemctl start cloudera-scm-server

# tail -f /var/log/cloudera-scm-server/cloudera-scm-server.log

# netstat -tpnl |grep 7180
tcp 0 0 0.0.0.0:7180 0.0.0.0:* LISTEN 28344/java

Note: cloudera-scm-server takes quite a long time to start. Tail the server log, and once port 7180 is listening, the server is up.
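Rather than polling netstat by hand, a small loop can wait until 7180 answers (a sketch using bash's built-in `/dev/tcp` redirection; `wait_for_port` is my own naming):

```shell
#!/usr/bin/env bash
# wait_for_port: block until host:port accepts TCP connections, or give
# up after a timeout. cloudera-scm-server can take several minutes.
# $1: host  $2: port  $3: timeout in seconds (default 600)
wait_for_port() {
  local host=$1 port=$2 timeout=${3:-600} waited=0
  # the subshell tries to open a TCP connection via bash's /dev/tcp
  until (exec 3<>"/dev/tcp/$host/$port") 2>/dev/null; do
    sleep 5
    waited=$((waited + 5))
    if [ "$waited" -ge "$timeout" ]; then
      echo "timed out waiting for $host:$port" >&2
      return 1
    fi
  done
  echo "$host:$port is up"
}

# Usage: wait_for_port bigserver1 7180
```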

11. Install the Hadoop cluster

1. Log in

http://bigserver1:7180/cmf/login — the default username and password are both admin

2. Walkthrough (screenshots)

1. Login page

2. Release notes

3. Accept the license terms

4. Choose the Cloudera edition; most people pick the free one

5. Cluster-installation welcome page

6. Set the cluster name

7. Specify the hosts in the cluster

8. Select the offline repository

9. Whether to install a JDK; we installed one earlier, so skip it

10. Upload the private key

11. Install the agent on each host

12.1 "Host health is bad"

Fix for bad host health (run on every node):

# cd /var/lib/cloudera-scm-agent/
# rm -f cm_guid
# service cloudera-scm-agent restart

12.2 Parcel installation in progress

13.1 Host inspector warnings

Fix (run on all nodes):

# echo never > /sys/kernel/mm/transparent_hugepage/defrag
# echo never > /sys/kernel/mm/transparent_hugepage/enabled

# echo vm.swappiness=10 >> /etc/sysctl.conf
# sysctl -p
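The two `echo never` commands above do not survive a reboot; one way to persist them is to append them to rc.local (a sketch; `persist_thp_settings` and the file argument are my own, and systemd only runs rc.local when it is executable):

```shell
#!/usr/bin/env bash
# persist_thp_settings: append the transparent-hugepage disabling lines
# to an rc.local-style file exactly once (safe to re-run), and make the
# file executable so systemd's rc-local service runs it at boot.
# $1: path to the boot script (normally /etc/rc.d/rc.local).
persist_thp_settings() {
  local rc=$1
  if ! grep -q transparent_hugepage "$rc" 2>/dev/null; then
    cat >> "$rc" <<'EOF'
echo never > /sys/kernel/mm/transparent_hugepage/defrag
echo never > /sys/kernel/mm/transparent_hugepage/enabled
EOF
  fi
  chmod +x "$rc"
}

# Usage: persist_thp_settings /etc/rc.d/rc.local
```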

13.2 Warnings cleared

14. Cluster configuration

15.1 Custom role assignment

15.2 Custom role assignment (continued)

15.3 Custom role assignment (continued)

Note here: which machine runs which service depends on your own situation.

16. Cluster database settings

17.1 Default path settings

17.2 Kudu path settings (required)

17.3 Kafka path settings (required)

Note here: do not change the Solr path, or it will throw errors.

18. Inspection passed

19. Installation complete

That completes the installation, but it is only the installation; posts on actual usage and the problems that come up will follow.



Please credit the source when reposting
Author: 海底苍鹰
Link: http://blog.51yip.com/hadoop/2249.html
