Pre-installation system requirements:
1. Disable swap:
Temporarily: swapoff -a
Permanently: echo 'vm.swappiness=0' | sudo tee -a /etc/sysctl.conf (a plain `sudo echo ... >> /etc/sysctl.conf` fails: the redirection runs as the unprivileged user)
Reboot.
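Note that vm.swappiness only tunes the kernel's preference; to keep swap off across reboots the swap mount itself has to go. A sketch (assumes a standard swap entry in /etc/fstab; check yours first):
sudo sysctl -p                                  # apply the sysctl change now
sudo sed -i.bak '/\sswap\s/s/^/#/' /etc/fstab   # comment out the swap line so it stays off after reboot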
2. Log in as root.
1. Set the hostname (CentOS 7) (on every node)
hostnamectl set-hostname hadoop106
Configure /etc/hosts (on the master node).
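For example (these IPs are assumptions extrapolated from the 192.168.17.106 address used later; substitute your own):
192.168.17.105 hadoop105
192.168.17.106 hadoop106
192.168.17.107 hadoop107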
2. Create the hadoop user and group (on every node)
groupadd hadoop
useradd hadoop -g hadoop
passwd hadoop
3. vim /etc/sudoers (on every node)
Mind the permissions: the file is not writable by default, so the first time run chmod 640 /etc/sudoers (or edit via visudo instead), then add:
hadoop ALL=(ALL) NOPASSWD: ALL
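To confirm the entry works, switch to the hadoop user and list its privileges:
su - hadoop
sudo -l    # should show (ALL) NOPASSWD: ALL without prompting for a password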
4. vim /etc/security/limits.conf
(this step can be done as root or via sudo from another user)
* soft nofile 32768
* soft nproc 65536
* hard nofile 1048576
* hard nproc unlimited
* hard memlock unlimited
* soft memlock unlimited
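These limits only apply to new sessions; after logging in again, a quick check:
ulimit -n   # open files; should print 32768 (the soft limit)
ulimit -u   # max user processes; should print 65536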
5. Install the JDK (as the hadoop user)
sudo rpm -ivh /opt/software/jdk-8u191-linux-x64.rpm
6. vim .bash_profile
JAVA_HOME=/usr/java/default
PATH=$PATH:$HOME/.local/bin:$HOME/bin:$JAVA_HOME/bin
export JAVA_HOME
export PATH
source .bash_profile
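A quick sanity check after sourcing the profile:
java -version     # should report 1.8.0_191
echo $JAVA_HOME   # should print /usr/java/default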
7. Some preliminary work: passwordless SSH (as the hadoop user)
ssh-keygen -t rsa
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
chmod 600 ~/.ssh/authorized_keys
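For a multi-node cluster the key also has to reach the other nodes; a sketch assuming the hostnames used in this guide (run as the hadoop user on every node):
for host in hadoop105 hadoop106 hadoop107; do
  ssh-copy-id hadoop@$host    # appends the local public key to the peer's authorized_keys
done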
8、sudo systemctl stop firewalld.service
sudo systemctl disable firewalld.service
vim /etc/selinux/config
Change SELINUX=enforcing to SELINUX=disabled
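The config file change only takes effect after a reboot; to stop enforcement immediately and verify:
sudo setenforce 0
getenforce   # Permissive now, Disabled after the reboot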
9. Install MySQL (as a non-root user); once installed and the root password changed, execute:
CREATE DATABASE scm DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
CREATE DATABASE amon DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
CREATE DATABASE rman DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
CREATE DATABASE hue DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
CREATE DATABASE metastore DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
CREATE DATABASE sentry DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
CREATE DATABASE nav DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
CREATE DATABASE navms DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
CREATE DATABASE oozie DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
GRANT ALL ON scm.* TO 'scm'@'%' IDENTIFIED BY 'scm';
GRANT ALL ON amon.* TO 'amon'@'%' IDENTIFIED BY 'amon';
GRANT ALL ON rman.* TO 'rman'@'%' IDENTIFIED BY 'rman';
GRANT ALL ON hue.* TO 'hue'@'%' IDENTIFIED BY 'hue';
GRANT ALL ON metastore.* TO 'hive'@'%' IDENTIFIED BY 'hive';
GRANT ALL ON sentry.* TO 'sentry'@'%' IDENTIFIED BY 'sentry';
GRANT ALL ON nav.* TO 'nav'@'%' IDENTIFIED BY 'nav';
GRANT ALL ON navms.* TO 'navms'@'%' IDENTIFIED BY 'navms';
GRANT ALL ON oozie.* TO 'oozie'@'%' IDENTIFIED BY 'oozie';
FLUSH PRIVILEGES;
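A quick check that one of the grants works (if '%' does not match local connections in your MySQL setup, test from another host):
mysql -h127.0.0.1 -uscm -pscm -e 'SHOW DATABASES;'   # should list scm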
The amon database backs the Activity Monitor service (the other names map to their services similarly, e.g. rman for Reports Manager, metastore for the Hive Metastore).
10. sudo mkdir /opt/cloudera-manager
(on every node)
Upload CDH-5.13.2-1.cdh5.13.2.p0.3-el7.parcel, CDH-5.13.2-1.cdh5.13.2.p0.3-el7.parcel.sha1, cloudera-manager-centos7-cm5.13.2_x86_64.tar.gz, and manifest.json.
11. sudo tar xzf cloudera-manager*.tar.gz -C /opt/cloudera-manager
12. Create the cloudera-scm system user (on every node); the userdel/groupdel lines just clear out any pre-existing account:
sudo userdel cloudera-scm
sudo groupdel cloudera-scm
sudo useradd --system --home=/opt/cloudera-manager/cm-5.13.2/run/cloudera-scm-server --no-create-home --shell=/bin/false --comment "Cloudera SCM User" cloudera-scm
13. Create directories (on every node)
sudo mkdir -p /var/lib/cloudera-scm-server
sudo chown cloudera-scm:cloudera-scm /var/lib/cloudera-scm-server
sudo mkdir /var/log/cloudera-scm-server
sudo chown cloudera-scm:cloudera-scm /var/log/cloudera-scm-server
14. cd /opt/cloudera-manager/cm-5.13.2/etc/cloudera-scm-agent
vim config.ini
server_host=hadoop105
# Port that the CM server is listening on.
server_port=7182
15. sudo cp mysql-connector-java-5.1.42.jar /opt/cloudera-manager/cm-5.13.2/share/cmf/lib/
Create the database for the CM installation platform:
sudo /opt/cloudera-manager/cm-5.13.2/share/cmf/schema/scm_prepare_database.sh mysql cm -hlocalhost -uroot -p --scm-host localhost scm scm scm
Copy the cm-5.13.2 directory out to all 3 nodes.
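A sketch using the passwordless SSH and NOPASSWD sudo set up earlier (hostnames are this guide's; assumes the extracted tree is readable by the hadoop user):
for host in hadoop106 hadoop107; do
  scp -r /opt/cloudera-manager/cm-5.13.2 $host:/tmp/
  ssh $host 'sudo mv /tmp/cm-5.13.2 /opt/cloudera-manager/'
done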
16. sudo mkdir -p /opt/cloudera/parcel-repo
sudo chown cloudera-scm:cloudera-scm /opt/cloudera/parcel-repo
Copy CDH-5.13.2-1.cdh5.13.2.p0.3-el7.parcel, CDH-5.13.2-1.cdh5.13.2.p0.3-el7.parcel.sha1, and manifest.json into /opt/cloudera/parcel-repo, renaming the .sha1 file to CDH-5.13.2-1.cdh5.13.2.p0.3-el7.parcel.sha:
sudo mv CDH-5.13.2-1.cdh5.13.2.p0.3-el7.parcel /opt/cloudera/parcel-repo/
sudo mv CDH-5.13.2-1.cdh5.13.2.p0.3-el7.parcel.sha1 /opt/cloudera/parcel-repo/CDH-5.13.2-1.cdh5.13.2.p0.3-el7.parcel.sha
sudo mv manifest.json /opt/cloudera/parcel-repo/
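Before distributing, it's worth checking the parcel against its checksum (the .sha file contains the expected SHA-1 hash):
cd /opt/cloudera/parcel-repo
sha1sum CDH-5.13.2-1.cdh5.13.2.p0.3-el7.parcel   # compare with:
cat CDH-5.13.2-1.cdh5.13.2.p0.3-el7.parcel.sha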
On every node:
sudo mkdir -p /opt/cloudera/parcels
sudo chown cloudera-scm:cloudera-scm /opt/cloudera/parcels
17. sudo chown -R cloudera-scm:cloudera-scm /opt/cloudera-manager
(on every node)
18. cd /opt/cloudera-manager/cm-5.13.2/etc/init.d
sudo ./cloudera-scm-server start
This reported an error: ./cloudera-scm-server: line 109: pstree: command not found
Fix: sudo rpm -ivh psmisc-22.20-15.el7.x86_64.rpm
sudo ./cloudera-scm-agent start
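To watch the server come up (the log path follows the tarball layout, so treat it as an assumption):
tail -f /opt/cloudera-manager/cm-5.13.2/log/cloudera-scm-server/cloudera-scm-server.log
sudo ss -lntp | grep 7180   # the web UI is ready once 7180 is listening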
19. Log in at http://192.168.17.106:7180/cmf/login
The next step takes a while; wait for it to complete.
Set swappiness and disable transparent huge pages (CM warns about both during host inspection):
echo 'vm.swappiness=10' | sudo tee -a /etc/sysctl.conf
sudo sh -c 'echo never > /sys/kernel/mm/transparent_hugepage/defrag'
sudo sh -c 'echo never > /sys/kernel/mm/transparent_hugepage/enabled'
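The THP settings do not survive a reboot; one common way to persist them is via /etc/rc.local (a sketch):
sudo tee -a /etc/rc.local <<'EOF'
echo never > /sys/kernel/mm/transparent_hugepage/defrag
echo never > /sys/kernel/mm/transparent_hugepage/enabled
EOF
sudo chmod +x /etc/rc.local   # CentOS 7 only runs rc.local if it is executable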
It's best to select Activity Monitor here (this is the service the amon database created above is for).
Installing Hive
CREATE DATABASE metastore DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
GRANT ALL ON metastore.* TO 'hiveuser'@'%' IDENTIFIED BY 'hiveuser';
Startup reported an error: Failed to load driver. Fix by copying the JDBC driver:
sudo cp mysql-connector-java-5.1.42.jar /opt/cloudera/parcels/CDH/lib/hive/lib/
HBase tuning notes: JVM heap adjusted to 64 GB;
hbase.regionserver.lease.period raised to 2 minutes;
hbase.rpc.timeout raised to match.
Configuring Spark 2.4.0 offline in CM
- Check for new parcels
- Distribute and wait until it shows Distributed
- Activate
- Add the service
Spark2 did not show up at this step; it took
sudo ./cloudera-scm-server restart
and sudo ./cloudera-scm-agent restart
before it appeared.
Installing Kafka via CM
1. Choose a version
2. I picked 3.1.0 for el7:
http://archive.cloudera.com/kafka/parcels/3.1.0/
3. Upload into /opt/cloudera/parcel-repo:
sudo mv /home/hadoop/software/KAFKA-3.1.0-1.3.1.0.p0.35-el7.parcel ./
sudo mv /home/hadoop/software/KAFKA-3.1.0-1.3.1.0.p0.35-el7.parcel.sha1 ./KAFKA-3.1.0-1.3.1.0.p0.35-el7.parcel.sha
sudo mv /home/hadoop/software/manifest.json ./ (if one already exists, rename the old one first; e.g. I renamed my CDH one to manifest.json.cdh)
Download KAFKA-1.2.0.jar (the Kafka CSD) from http://archive.cloudera.com/csds/kafka-1.2.0/, then:
sudo mkdir -p /opt/cloudera/csd
sudo chown cloudera-scm:cloudera-scm /opt/cloudera/csd
sudo mv /home/hadoop/software/KAFKA-1.2.0.jar /opt/cloudera/csd/
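CM only scans /opt/cloudera/csd on server startup, so restart it before adding the service (same as the Spark2 step above):
cd /opt/cloudera-manager/cm-5.13.2/etc/init.d
sudo ./cloudera-scm-server restart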
4. Activate the parcel
Open the Parcel page; Kafka is absent at first, so click Check for New Parcels, then Distribute and wait for completion.
This took a long time and failed to distribute several times; most likely the uploaded file was bad:
Src file /opt/cloudera/parcels/.flood/KAFKA-3.1.0-1.3.1.0.p0.35-el7.parcel/KAFKA-3.1.0-1.3.1.0.p0.35-el7.parcel does not exist
Once it shows Distributed, click Activate.
5. Edit the config file: vim server.properties (on every node)
cd /opt/cloudera/parcels/KAFKA/etc/kafka/conf.dist
Change the following parameters:
broker.id=0 (must be unique per node, counting up from 0)
num.partitions=3
num.recovery.threads.per.data.dir=10 (threads per data directory used to read the log files during recovery)
zookeeper.connect=hadoop106:2181
Then make the same edits on every node; see the sketch below.
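If you'd rather script the per-node edit, a sketch (the broker id must differ per host; the value here is illustrative):
cd /opt/cloudera/parcels/KAFKA/etc/kafka/conf.dist
sudo sed -i 's/^broker.id=.*/broker.id=1/' server.properties   # use 0, 1, 2, ... per node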
6. Add the service
Select Kafka (plus a Gateway role).
An aside on what MirrorMaker is for:
it handles cross-cluster Kafka replication and Kafka cluster mirroring, implemented mainly with Kafka's built-in MirrorMaker tool;
https://cloud.tencent.com/developer/article/1358933
Because manifest.json had not been added earlier, this step reported an error about the corresponding package not being found.
Setting up HA
1. Single-node HDFS installed successfully
2. Begin
3. Set the nameservice name
4. Choose the servers
5. Start
Starting the NameNode here reported an error: Journal Storage Directory /mnt/dfs/jn/uprofile-cluster not formatted
The JournalNode directories under /mnt/dfs/jn were all empty, so I ran sudo hdfs namenode -initializeSharedEdits, which itself failed with:
namenode.NameNode: No shared edits directory configured for namespace null namenode null
In the end the cause was that the firewall had not been disabled.
Also, once that is fixed, this kind of error can simply be skipped (continue in the wizard).
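A quick way to confirm the firewall diagnosis on each node (the hostname is an example):
sudo systemctl status firewalld                                         # should report inactive (dead)
timeout 3 bash -c '</dev/tcp/hadoop106/8485' && echo '8485 reachable'   # default JournalNode RPC port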