This post records the process of configuring Kerberos authentication for HDFS & YARN on a Linux virtual machine, kept for my own future study.
It covers only the Kerberos configuration for HDFS & YARN; installing the Hadoop cluster itself is covered in the Hadoop cluster setup post and is not repeated here.
Please read the whole post first, then configure by following it with your own understanding.
This post largely follows another blogger's configuration, adding paths and extra settings on top of it so it is more detailed for future reference; the reference links are at the end of the article.
I. Configure Kerberos authentication for HDFS
1. Run the following on all nodes (master, slave1, slave2)
Add the hdfs user and change the ownership of the Hadoop directory to hdfs
groupadd hdfs
useradd hdfs -g hdfs
cat /etc/passwd
chown -R hdfs:hdfs /usr/local/hadoop-3.3.4/
Do the same on slave1 and slave2 (create the user/group and change ownership); a sketch is shown below.
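A minimal sketch of the same steps on the worker nodes, assuming the same Hadoop install path as on master:
ssh slave1
groupadd hdfs
useradd hdfs -g hdfs
chown -R hdfs:hdfs /usr/local/hadoop-3.3.4/
exit
ssh slave2
groupadd hdfs
useradd hdfs -g hdfs
chown -R hdfs:hdfs /usr/local/hadoop-3.3.4/
exit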
2. Install autoconf on all nodes
yum install autoconf -y
3. Install gcc on all nodes
yum install gcc -y
4. Install jsvc on all nodes
tar -zxvf commons-daemon-1.2.2-src.tar.gz -C /usr/local
cd /usr/local/commons-daemon-1.2.2-src/src/native/unix
./support/buildconf.sh
./configure
make
# Baidu Netdisk link for commons-daemon-1.2.2-src.tar.gz
# Link: https://pan.baidu.com/s/15bb9UH_HIZk6CwnmnxNhog
# Extraction code: PiZZ
Check that the build succeeded:
cd /usr/local/commons-daemon-1.2.2-src/src/native/unix/
./jsvc -help
ln -s /usr/local/commons-daemon-1.2.2-src/src/native/unix/jsvc /usr/local/bin/jsvc
5. Modify the hadoop-env.sh configuration file
5.1 Go to the Hadoop configuration directory and edit the file
cd /usr/local/hadoop-3.3.4/etc/hadoop/
vi hadoop-env.sh
#add the following lines, then save and exit
export JSVC_HOME=/usr/local/commons-daemon-1.2.2-src/src/native/unix
export HDFS_DATANODE_SECURE_USER=hdfs
5.2 Distribute it to the other nodes
scp /usr/local/hadoop-3.3.4/etc/hadoop/hadoop-env.sh root@slave1:/usr/local/hadoop-3.3.4/etc/hadoop/hadoop-env.sh
scp /usr/local/hadoop-3.3.4/etc/hadoop/hadoop-env.sh root@slave2:/usr/local/hadoop-3.3.4/etc/hadoop/hadoop-env.sh
#alternatively, log in to slave1 and slave2 and make the same edit to hadoop-env.sh there, then save and exit
5.3 Add configuration to mapred-site.xml
cd /usr/local/hadoop-3.3.4/etc/hadoop/
vi mapred-site.xml
<!-- MapReduce JobHistory server IPC address -->
<property>
<name>mapreduce.jobhistory.address</name>
<value>master:10020</value>
</property>
<!-- MapReduce JobHistory server web UI address -->
<property>
<name>mapreduce.jobhistory.webapp.address</name>
<value>master:19888</value>
</property>
5.4 Apply the same change on slave1 and slave2
ssh slave1
cd /usr/local/hadoop-3.3.4/etc/hadoop/
vi mapred-site.xml
#add the configuration from 5.3, then save and exit
exit
ssh slave2
cd /usr/local/hadoop-3.3.4/etc/hadoop/
vi mapred-site.xml
#add the configuration from 5.3, then save and exit
exit
6. Create the HDFS principals
If the kadmin.local command cannot be found, locate it with find / -name kadmin; it is usually at /usr/bin/kadmin.
6.1 Run kadmin.local, then enter the following one by one (a non-interactive alternative is sketched after the list):
addprinc hdfs/master
addprinc hdfs/slave1
addprinc hdfs/slave2
addprinc http/master
addprinc http/slave1
addprinc http/slave2
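If you prefer not to create each principal interactively inside kadmin.local, a rough equivalent from a regular shell is sketched below; -randkey is an assumption on my part (any way of creating the keys works, since ktadd -norandkey below simply exports whatever keys exist):
kadmin.local -q "addprinc -randkey hdfs/master"
kadmin.local -q "addprinc -randkey hdfs/slave1"
kadmin.local -q "addprinc -randkey hdfs/slave2"
kadmin.local -q "addprinc -randkey http/master"
kadmin.local -q "addprinc -randkey http/slave1"
kadmin.local -q "addprinc -randkey http/slave2"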
Continue inside kadmin.local with:
ktadd -norandkey -k /etc/security/keytab/hdfs.keytab hdfs/master
ktadd -norandkey -k /etc/security/keytab/hdfs.keytab hdfs/slave1
ktadd -norandkey -k /etc/security/keytab/hdfs.keytab hdfs/slave2
ktadd -norandkey -k /etc/security/keytab/http.keytab http/master
ktadd -norandkey -k /etc/security/keytab/http.keytab http/slave1
ktadd -norandkey -k /etc/security/keytab/http.keytab http/slave2
6.2 Fixing an error: if the very first ktadd command fails, don't rush ahead; read this first.
The keytab directory must exist before ktadd can write to it. Check whether it is already there, and create it if not:
mkdir /etc/security/keytab
Once the ktadd commands above have completed:
#exit kadmin.local
exit
6.3 Merge the keytabs
#the keytabs were written to /etc/security/keytab above, so run ktutil from that directory
cd /etc/security/keytab
ktutil
rkt hdfs.keytab
rkt http.keytab
wkt hdfs.keytab
exit
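To confirm that the merged hdfs.keytab now contains both the hdfs and http entries, it can be listed with klist (a quick check, not part of the original post):
klist -kt /etc/security/keytab/hdfs.keytab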
7. Distribute the keytab files to the other nodes
scp hdfs.keytab http.keytab root@slave1:/usr/local/hadoop-3.3.4/etc/
scp hdfs.keytab http.keytab root@slave2:/usr/local/hadoop-3.3.4/etc/
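The hdfs-site.xml and yarn-site.xml below reference /usr/local/hadoop-3.3.4/etc/hdfs.keytab, so the merged keytab also has to end up in that directory on master itself; a sketch, using the same paths as above:
cp /etc/security/keytab/hdfs.keytab /etc/security/keytab/http.keytab /usr/local/hadoop-3.3.4/etc/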
8. Modify the HDFS configuration files
8.1 Modify core-site.xml
<property>
<name>hadoop.security.authentication</name>
<value>kerberos</value>
</property>
<property>
<name>hadoop.security.authorization</name>
<value>true</value>
</property>
8.2 Modify hdfs-site.xml
<!-- For explanations of the property names, see:
https://blog.csdn.net/zhanglong_4444/article/details/99471502 -->
<!-- When true, access tokens are checked when clients access the DataNodes. -->
<property>
<name>dfs.block.access.token.enable</name>
<value>true</value>
</property>
<!-- The NameNode service principal. This is typically set to nn/_HOST@REALM.TLD.
Each NameNode substitutes its own fully qualified hostname for _HOST at startup.
The _HOST placeholder allows the same configuration to be used on both NameNodes in an HA setup. -->
<property>
<name>dfs.namenode.kerberos.principal</name>
<value>hdfs/master@HADOOP.COM</value>
</property>
<property>
<name>dfs.namenode.keytab.file</name>
<value>/usr/local/hadoop-3.3.4/etc/hdfs.keytab</value>
</property>
<!-- SPNEGO principal for the NameNode web endpoints; it must match one of the http principals
created above, and the keytab property must point at a keytab file containing its keys. -->
<property>
<name>dfs.namenode.kerberos.internal.spnego.principal</name>
<value>http/master@HADOOP.COM</value>
</property>
<property>
<name>dfs.namenode.kerberos.internal.spnego.keytab</name>
<value>/usr/local/hadoop-3.3.4/etc/hdfs.keytab</value>
</property>
<property>
<name>dfs.web.authentication.kerberos.principal</name>
<value>hdfs/master@HADOOP.COM</value>
</property>
<property>
<name>dfs.web.authentication.kerberos.keytab</name>
<value>/usr/local/hadoop-3.3.4/etc/hdfs.keytab</value>
</property>
<property>
<name>dfs.datanode.kerberos.principal</name>
<value>hdfs/master@HADOOP.COM</value>
</property>
<property>
<name>dfs.datanode.keytab.file</name>
<value>/usr/local/hadoop-3.3.4/etc/hdfs.keytab</value>
</property>
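<!-- A secure DataNode started through jsvc must bind privileged ports (below 1024), hence 1004 and 1006. -->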
<property>
<name>dfs.datanode.address</name>
<value>0.0.0.0:1004</value>
</property>
<property>
<name>dfs.datanode.http.address</name>
<value>0.0.0.0:1006</value>
</property>
If there is a SecondaryNameNode, the following also needs to be added:
<property>
<name>dfs.secondary.namenode.keytab.file</name>
<value>/usr/local/hadoop-3.3.4/etc/hdfs.keytab</value>
</property>
<property>
<name>dfs.secondary.namenode.kerberos.principal</name>
<value>hdfs/master@HADOOP.COM</value>
</property>
8.3 Modify yarn-site.xml
<property>
<name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
<property>
<name>yarn.resourcemanager.address</name>
<value>master:8032</value>
</property>
<property>
<name>yarn.resourcemanager.scheduler.address</name>
<value>master:8030</value>
</property>
<property>
<name>yarn.resourcemanager.resource-tracker.address</name>
<value>master:8031</value>
</property>
<property>
<name>yarn.resourcemanager.admin.address</name>
<value>master:8033</value>
</property>
<property>
<name>yarn.resourcemanager.webapp.address</name>
<value>master:8088</value>
</property>
<!-- The properties above are additions taken from the referenced blogger; the properties below come from the original post's screenshots. -->
<property>
<name>yarn.resourcemanager.principal</name>
<value>hdfs/master@HADOOP.COM</value>
</property>
<property>
<name>yarn.resourcemanager.keytab</name>
<value>/usr/local/hadoop-3.3.4/etc/hdfs.keytab</value>
</property>
<property>
<name>yarn.nodemanager.keytab</name>
<value>/usr/local/hadoop-3.3.4/etc/hdfs.keytab</value>
</property>
<property>
<name>yarn.nodemanager.principal</name>
<value>hdfs/master@HADOOP.COM</value>
</property>
Distribute the configuration files to the other nodes:
cd /usr/local/hadoop-3.3.4/etc/hadoop
scp core-site.xml hdfs-site.xml yarn-site.xml root@slave1:/usr/local/hadoop-3.3.4/etc/hadoop/
scp core-site.xml hdfs-site.xml yarn-site.xml root@slave2:/usr/local/hadoop-3.3.4/etc/hadoop/
9. Start HDFS
Go to /usr/local/hadoop-3.3.4/sbin
cd /usr/local/hadoop-3.3.4/sbin
Run the following script to start HDFS:
start-dfs.sh
As the root user, run the following script to start the secure DataNodes:
./start-secure-dns.sh
Check the processes. Note that jps will not show the DataNode process here (it runs under jsvc), so use ps instead:
ps auxf |grep datanode
10. Verify
Without a Kerberos ticket, the first ls below should fail with an authentication error; after obtaining a ticket from the keytab, it should succeed:
hdfs dfs -ls /
kinit -kt /etc/security/keytab/hdfs.keytab hdfs/master
hdfs dfs -ls /
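As an extra check (not in the original post), the ticket obtained by kinit can be inspected, and destroyed again to reproduce the unauthenticated failure:
klist
#optionally drop the ticket again and watch hdfs dfs -ls / fail once more
kdestroy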
II. Configure Kerberos authentication for YARN
1. Configure yarn-site.xml
Go to /usr/local/hadoop-3.3.4/etc/hadoop
cd /usr/local/hadoop-3.3.4/etc/hadoop
Edit yarn-site.xml and add:
<property>
<name>yarn.nodemanager.container-executor.class</name>
<value>org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor</value>
</property>
<property>
<name>yarn.nodemanager.linux-container-executor.group</name>
<value>hadoop</value>
</property>
<property>
<name>yarn.nodemanager.linux-container-executor.path</name>
<value>/hdp/bin/container-executor</value>
</property>
<!-- yarn.nodemanager.linux-container-executor.path points to the container-executor binary;
container-executor is an executable that reads its own configuration file (container-executor.cfg).
yarn.nodemanager.linux-container-executor.group is the group of the user that starts the NodeManager. -->
<property>
<name>yarn.nodemanager.local-dirs</name>
<value>/hdp/yarn/local/nm-local-dir</value>
</property>
<property>
<name>yarn.nodemanager.log-dirs</name>
<value>/hdp/yarn/log/userlogs</value>
</property>
2. Confirm the configuration path used by container-executor
Go to /usr/local/hadoop-3.3.4/bin and check which etc path is compiled into the binary (it resolves container-executor.cfg relative to its own location, which is why both are relocated under /hdp below):
cd /usr/local/hadoop-3.3.4/bin
strings container-executor |grep etc
3. Create the directories and copy the executable and its configuration file into place
mkdir -p /hdp/bin
mkdir -p /hdp/etc/hadoop
scp /usr/local/hadoop-3.3.4/bin/container-executor /hdp/bin/
scp /usr/local/hadoop-3.3.4/etc/hadoop/container-executor.cfg /hdp/etc/hadoop/
#plain cp is fine here; scp is not required
Go to /hdp/etc/hadoop/
cd /hdp/etc/hadoop/
Edit container-executor.cfg so that it contains the following:
yarn.nodemanager.linux-container-executor.group=hadoop
banned.users=hdfs,yarn,mapred,bin
min.user.id=500
allowed.system.users=root
yarn.nodemanager.local-dirs=/hdp/yarn/local/nm-local-dir
yarn.nodemanager.log-dirs=/hdp/yarn/log/userlogs
4. Change the owner, group, and permissions of the executable
#the binary must be owned by root:hadoop for the 6050 mode to take effect
#(if the hadoop group does not exist yet, create it first; see error 4 below)
chown root:hadoop /hdp/bin/container-executor
chmod 6050 /hdp/bin/container-executor
ll /hdp/bin/container-executor
5. Run the following checks; if --checksetup returns without errors, the container-executor setup is complete:
hadoop checknative
/hdp/bin/container-executor --checksetup
6. Copy the /hdp directory to the other nodes; the same owner, group, and permissions need to be set there as well (a sketch for the binary follows the cfg copy below).
ssh slave1
mkdir -p /hdp/etc/hadoop
exit
ssh slave2
mkdir -p /hdp/etc/hadoop
exit
scp /hdp/etc/hadoop/container-executor.cfg root@slave1:/hdp/etc/hadoop/
scp /hdp/etc/hadoop/container-executor.cfg root@slave2:/hdp/etc/hadoop/
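slave1 and slave2 also need the container-executor binary itself if they run a NodeManager, with the same owner and permissions as on master. A sketch, assuming the hadoop group already exists there (create it with groupadd hadoop if not):
ssh slave1 "mkdir -p /hdp/bin"
ssh slave2 "mkdir -p /hdp/bin"
scp /hdp/bin/container-executor root@slave1:/hdp/bin/
scp /hdp/bin/container-executor root@slave2:/hdp/bin/
ssh slave1 "chown root:hadoop /hdp/bin/container-executor && chmod 6050 /hdp/bin/container-executor"
ssh slave2 "chown root:hadoop /hdp/bin/container-executor && chmod 6050 /hdp/bin/container-executor"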
7. Start YARN
Go to /usr/local/hadoop-3.3.4/sbin
cd /usr/local/hadoop-3.3.4/sbin
#start YARN
./start-yarn.sh
8. Verify that YARN with Kerberos works; the example job just needs to run successfully.
Go to /usr/local/hadoop-3.3.4
cd /usr/local/hadoop-3.3.4
#wordcount needs an input and an output directory on HDFS; see the fuller sketch below
./bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-3.3.4.jar wordcount <hdfs-input-dir> <hdfs-output-dir>
hdfs dfs -ls
#if the job and the listing run without errors, everything is fine
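A fuller sketch of the smoke test with concrete but hypothetical HDFS paths /tmp/wcin and /tmp/wcout (any input and output directories work), run from /usr/local/hadoop-3.3.4:
kinit -kt /usr/local/hadoop-3.3.4/etc/hdfs.keytab hdfs/master
hdfs dfs -mkdir -p /tmp/wcin
hdfs dfs -put /usr/local/hadoop-3.3.4/etc/hadoop/core-site.xml /tmp/wcin/
./bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-3.3.4.jar wordcount /tmp/wcin /tmp/wcout
hdfs dfs -cat /tmp/wcout/part-r-00000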
This completes the configuration.
Below are the errors encountered along the way, for reference.
III. Troubleshooting
Error 1:
ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: Exception in secureMain
org.apache.hadoop.security.KerberosAuthException: failure to login: for principal: hdfs/master@HADOOP.COM from keytab /usr/local/hadoop-3.3.4/etc/hdfs.keytab javax.security.auth.login.LoginException: Unable to obtain password from user
#run on master, slave1, and slave2
chmod 777 /usr/local/hadoop-3.3.4/etc/hdfs.keytab
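chmod 777 does make the error go away, but it leaves the keytab world-readable. A more restrictive sketch that should also satisfy the daemons, assuming they run as root/hdfs (adjust the owner and group to your setup):
chown hdfs:hadoop /usr/local/hadoop-3.3.4/etc/hdfs.keytab
chmod 440 /usr/local/hadoop-3.3.4/etc/hdfs.keytab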
Error 2:
libcrypto.so.1.1 is missing; see:
https://www.jianshu.com/p/4f15ae32c5c8
Error 3: a ticket-expiry/renewal type of error; see:
https://www.ibm.com/docs/en/db2-big-sql/6.0?topic=security-kdc-cant-fulfill-requested-option-while-renewing-credentials-errors-when-running-db2-big-sql-statements-kerberized-cluster
kadmin.local -q "modprinc -maxrenewlife max_renewable_life_value krbtgt/HADOOP.COM"
modprinc -maxrenewlife "1 week" +allow_renewable hdfs/master@HADOOP.COM
modprinc -maxrenewlife "1 week" +allow_renewable hive/master@HADOOP.COM
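The new maximum renewable lifetime can be verified with getprinc, a standard kadmin query:
kadmin.local -q "getprinc krbtgt/HADOOP.COM"
kadmin.local -q "getprinc hdfs/master"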
Error 4: running chown root:hadoop container-executor fails with "invalid group"
Add the hadoop group:
groupadd hadoop
chown root:hadoop container-executor
Error 5:
ERROR [main] service.CompositeService: Error starting services HiveServer2
java.lang.RuntimeException: Failed to init thrift server
Solution: https://blog.csdn.net/wangshuminjava/article/details/82462675
Error 6:
2023-04-27 00:19:52,119 WARN security.UserGroupInformation: Exception encountered while running the renewal command for hive/master@HADOOP.COM. (TGT end time:1682500979000, renewalFailures: 0, renewalFailuresTotal: 1)
ExitCodeException exitCode=1: kinit: Ticket expired while renewing credentials
This requires regenerating the keytab and copying it back over the old one in the corresponding locations.
Error 7:
ERROR [HiveServer2-Handler-Pool: Thread-118] server.TThreadPoolServer: Error occurred during processing of message.
java.lang.RuntimeException: org.apache.thrift.transport.TTransportException: Peer indicated failure: GSS initiate failed
Unresolved; it appeared while connecting and debugging from IDEA. Re-authenticating with kinit -kt should probably fix it (a sketch follows).
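A sketch of that re-authentication, reusing the keytab path from part I (substitute whatever principal the client actually uses):
kinit -kt /usr/local/hadoop-3.3.4/etc/hdfs.keytab hdfs/master@HADOOP.COM
klist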
Error 8:
Browse Directory
Failed to obtain user group information: java.io.IOException: Security enabled but user not authenticated by filter
This error appears when opening master:9870 in the browser. It remains unresolved and I could not find an answer online; it is probably a configuration issue somewhere.
Configuration not present in the original referenced post
Add the following to container-executor.cfg, create the directories, and set ownership and permissions:
yarn.nodemanager.local-dirs=/hdp/yarn/local/nm-local-dir
yarn.nodemanager.log-dirs=/hdp/yarn/log/userlogs
mkdir -p /hdp/yarn/local/nm-local-dir
mkdir -p /hdp/yarn/log/userlogs
chown yarn:hadoop /hdp/yarn/local/nm-local-dir
chown yarn:hadoop /hdp/yarn/log/userlogs
chmod 755 /hdp/yarn/local/nm-local-dir
chmod 755 /hdp/yarn/log/userlogs
Postscript
1. Italics:
Italics mark points the reader should pay attention to; these include issues that remain unresolved.
2. Bold:
Used to emphasize important notes.
3. Further reading
A blogger's short summary: https://www.cnblogs.com/chwilliam85/p/9679845.html
Official documentation: https://web.mit.edu/kerberos/krb5-latest/doc/admin/database.html?highlight=addprinc#adding-modifying-and-deleting-principals