Enabling Kerberos on Cloudera Hadoop
Environment
- OS: CentOS 7
- JDK: 1.7
- CDH version: 5.8.4
Procedure
0. Add the KDC server to the /etc/hosts file on every cluster node.
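For example, a hosts entry of the following form would do (the IP address and hostname below are placeholders, not values from this document):
# /etc/hosts on every cluster node (example values only)
192.168.1.10    kdc01.hadoop.com    kdc01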
1. Install the Kerberos client on all cluster nodes:
$ sudo yum install krb5-workstation krb5-libs
2. Make sure the kdc.conf file contains the following entries (a sample layout is sketched after this list):
- max_life = 1d
- max_renewable_life = 7d
- kdc_tcp_ports = 88
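A minimal sketch of where these settings live in kdc.conf on the KDC host; the realm name follows this document, and any line not listed in step 2 is an illustrative default that should match your own KDC setup:
[kdcdefaults]
 kdc_ports = 88
 kdc_tcp_ports = 88

[realms]
 HADOOP.COM = {
  max_life = 1d
  max_renewable_life = 7d
  acl_file = /var/kerberos/krb5kdc/kadm5.acl
  supported_enctypes = aes128-cts-hmac-sha1-96:normal
 }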
3. If YARN ResourceManager HA is enabled on your non-secure cluster, clear the ResourceManager StateStore znode in YARN before enabling Kerberos:
- Stop the YARN service.
- Go to the YARN service and select Actions > Format State Store.
4. Install openldap-clients on the Cloudera Manager Server host:
$ sudo yum install openldap-clients
5. Create the cloudera-scm/admin principal:
$ kadmin -p admin/admin
addprinc -pw <Password> cloudera-scm/admin@HADOOP.COM
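As an optional sanity check (not part of the original procedure), you can verify the new administrator principal by obtaining a ticket for it:
$ kinit cloudera-scm/admin@HADOOP.COM
$ klist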
6. Click Administration -> Security -> Enable Kerberos.
7. The four checklist items have all been completed by the steps above. Check off each item and select “Continue.”
8. The Kerberos Wizard needs to know the details of what was configured above. Fill in the entries as follows:
KDC Server Host: <KDC host>
Kerberos Security Realm: HADOOP.COM
Kerberos Encryption Types: aes128-cts-hmac-sha1-96
Maximum Renewable Life for Principals: 7 days
Click “Continue.”
9. When asked “Do you want Cloudera Manager to manage the krb5.conf files in your cluster?”, check “Yes.”
10. In “Advanced Configuration Snippet (Safety Valve) for the Default Realm in krb5.conf”, add kdc = <slave kdc host>, and then select “Continue.”
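With this safety valve in place, the [realms] section of the krb5.conf that Cloudera Manager generates should contain both KDCs, roughly as sketched below (admin_server is shown only for completeness and will be whatever the wizard writes):
[realms]
 HADOOP.COM = {
  kdc = <KDC host>
  admin_server = <KDC host>
  kdc = <slave kdc host>
 }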
11. The Kerberos Wizard is going to create Kerberos principals for the different services in the cluster. To do that it needs a Kerberos Administrator ID.
- The ID created is:
cloudera-scm/admin@HADOOP.COM
- Enter this ID and its password on the wizard screen. Recall the password is:
<cloudera-scm/admin password>
12. You are now ready to let the Kerberos Wizard do its work, so you can safely select “I’m ready to restart the cluster now” and then click “Continue.”
13. Adjust the following configuration items:
- On each NodeManager node, run:
sudo rm -rf /hdfs*/yarn/nm/usercache/*
- YARN -> Configuration -> min.user.id: change 1000 to 500.
- Remove yarn and hdfs from YARN -> Configuration -> banned.users.
- Add the following to HDFS -> Configuration -> Cluster-wide Advanced Configuration Snippet (Safety Valve) for core-site.xml:
hadoop.proxyuser.hive.hosts=*,hadoop.proxyuser.hive.groups=*
Troubleshooting:
Problem:
Requested user hdfs is not whitelisted and has id 598,which is below the minimum allowed 1000
Solution: change min.user.id from 1000 to 500.
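To confirm the UID that triggers this error, check it on a NodeManager host; the 598 shown in the message above corresponds to:
$ id -u hdfs
598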
Problem:
main : run as user is dengsc
main : requested yarn user is dengsc
Can't create directory /hdfs01/yarn/nm/usercache/dengsc/appcache/application_1497517106808_0001 - Permission denied
Solution: on each NodeManager node, run sudo rm -rf /hdfs*/yarn/nm/usercache/* (before Kerberos was enabled these directories were owned by yarn:yarn; after enabling it they become dengsc:yarn, so the old ownership is incompatible).
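Before deleting, the ownership mismatch can be seen directly (paths follow the error above):
$ ls -ld /hdfs*/yarn/nm/usercache/*   # directories created before Kerberos still show yarn:yarn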
Problem:
Error: Failed to open new session: java.lang.RuntimeException: java.lang.RuntimeException: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.authorize.AuthorizationException): User: hive is not allowed to impersonate hive
If the data in your system is not owned by the Hive user (i.e., the user that the Hive metastore runs as), then Hive will need permission to run as the user who owns the data in order to perform compactions. If you have already set up HiveServer2 to impersonate users, then the only additional work to do is assure that Hive has the right to impersonate users from the host running the Hive metastore. This is done by adding the hostname to hadoop.proxyuser.hive.hosts in Hadoop's core-site.xml file. If you have not already done this, then you will need to configure Hive to act as a proxy user. This requires you to set up keytabs for the user running the Hive metastore and add hadoop.proxyuser.hive.hosts and hadoop.proxyuser.hive.groups to Hadoop's core-site.xml file.
References: https://community.hortonworks.com/questions/34468/hive-impersonation-not-working-after-hdp-upgrade-t.html
https://cwiki.apache.org/confluence/display/Hive/Hive+Transactions#HiveTransactions-ConfigurationValuestoSetforCompaction
Solution: add the following parameters to /etc/hadoop/conf/core-site.xml:
hadoop.proxyuser.hive.hosts=*,hadoop.proxyuser.hive.groups=*
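In XML form (the same two properties as they would appear in core-site.xml, or in the CM safety valve from step 13):
<property>
  <name>hadoop.proxyuser.hive.hosts</name>
  <value>*</value>
</property>
<property>
  <name>hadoop.proxyuser.hive.groups</name>
  <value>*</value>
</property>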
Problem:
YARN Container Usage Aggregation
Failed to run MapReduce job to aggregate YARN container usage metrics.
Solution: remove yarn and hdfs from YARN -> Configuration -> banned.users.
Problem:
Using DistCp between a secure (Kerberos-enabled) Hadoop cluster and a non-secure cluster
Requirement: there are two clusters whose nodes can reach each other over the network, and we want to migrate files with distcp; however, one cluster is non-secure while the other is secured with Kerberos authentication. How should the copy be run?
Prerequisite: both clusters are configured with NameNode HA, so first find the active NameNode and its IP address with the following command:
hdfs haadmin -getServiceState nn1 (or nn2, i.e., the NameNode ID)
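For example, with NameNode IDs nn1 and nn2 (the IDs here are placeholders; check your dfs.ha.namenodes.* setting), the output identifies the active node:
$ hdfs haadmin -getServiceState nn1
active
$ hdfs haadmin -getServiceState nn2
standby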
Solution:
hadoop distcp -D ipc.client.fallback-to-simple-auth-allowed=true -D dfs.checksum.type=CRC32 webhdfs://namenodeIP:50070/data/ /data
This command falls back from secure to simple authentication and transfers the data over webhdfs; however, webhdfs uses a RESTful mechanism, so there is a risk of I/O blocking.
hadoop distcp -D ipc.client.fallback-to-simple-auth-allowed=true webhdfs://namenodeIP:50070/flume/data/ /data/
For large file transfers, removing the checksum option -D dfs.checksum.type=CRC32 makes the copy work normally.
Further thoughts
- After Kerberos configuration is changed in the CM UI, the entire cluster must be stopped before the Kerberos files can be distributed. -- All cluster services, including the CM monitors, must be stopped before the krb5.conf file can be distributed.
- The UI does not support configuring a highly available KDC? -- A standby KDC can be added through the advanced configuration (safety valve) page; a cluster restart is required.
- Kerberos user group membership? -- The original users and ACL configuration are inherited.
- Could this cause component permission problems?
- How can each component be migrated smoothly?