/usr/sbin/kadmin.local hangs when executed

Problem description

Running /usr/sbin/kadmin.local on the machine hangs, and restarting the kadmin and KDC services through systemctl also fails because the start operation times out.

[root@master_10.0.0.58 /data/emr/krb5]$ journalctl -xe -u kadmin
-- Unit kadmin.service has begun starting up.
Jul 18 11:25:59 10.0.0.58 systemd[1]: kadmin.service start operation timed out. Terminating.
Jul 18 11:25:59 10.0.0.58 systemd[1]: Failed to start Kerberos 5 Password-changing and Administration.
-- Subject: Unit kadmin.service has failed

Resolution

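journalctl -xe -u kadmin gave no further detail about the failure. Because Kerberos is integrated with OpenLDAP in this setup, I also looked at the OpenLDAP log and found many errors like these: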

Jul 18 11:31:21 10 slapd[11874]: daemon: accept(8) failed errno=24 (Too many open files)
Jul 18 11:31:21 10 slapd[11874]: daemon: accept(7) failed errno=24 (Too many open files)
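
To get a sense of how often slapd is hitting this error, the matching lines can be counted. This is a minimal sketch: count_emfile is a hypothetical helper name, and the log path is left as an argument because it depends on how slapd logging is configured.

```shell
#!/usr/bin/env bash
# count_emfile is a hypothetical helper, not part of OpenLDAP.
# It counts log lines where accept() failed with errno 24 (EMFILE).
count_emfile() {
    local logfile="$1"
    # -c prints the number of matching lines; -F treats the pattern as a fixed string.
    grep -cF 'failed errno=24 (Too many open files)' "$logfile"
}

# Example (the path is a placeholder for your slapd log file):
#   count_emfile /path/to/slapd.log
```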

First I used ps to find the PID of the OpenLDAP process (slapd), then checked how many files it is allowed to open:

[root@master_10.0.0.58 /data/emr/openldap/logs]$ cat /proc/11874/limits 
Limit                     Soft Limit           Hard Limit           Units     
Max cpu time              unlimited            unlimited            seconds   
Max file size             unlimited            unlimited            bytes     
Max data size             unlimited            unlimited            bytes     
Max stack size            8388608              unlimited            bytes     
Max core file size        0                    unlimited            bytes     
Max resident set          unlimited            unlimited            bytes     
Max processes             127308               127308               processes 
Max open files            1024                 4096                 files     
Max locked memory         65536                65536                bytes     
Max address space         unlimited            unlimited            bytes     
Max file locks            unlimited            unlimited            locks     
Max pending signals       127308               127308               signals   
Max msgqueue size         819200               819200               bytes     
Max nice priority         0                    0                    
Max realtime priority     0                    0                    
Max realtime timeout      unlimited            unlimited            us 

Then I checked how many files the process already had open:

[root@master_10.0.0.58 /data/emr/openldap/logs]$ lsof -p 11874 | wc -l
1073
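
Note that lsof -p also prints a header line and rows that are not file descriptors (cwd, txt, memory-mapped files), so its line count runs somewhat higher than the number of descriptors actually in use. A stricter comparison can be read straight from /proc. The sketch below assumes a Linux /proc filesystem and enough privileges to read the target process's fd directory; fd_usage is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# fd_usage is a hypothetical helper: it prints the number of open file
# descriptors of a process next to its soft "Max open files" limit.
fd_usage() {
    local pid="$1"
    local open soft
    # Each entry in /proc/<pid>/fd corresponds to exactly one open descriptor.
    open=$(ls "/proc/$pid/fd" | wc -l)
    # In the "Max open files" row, field 4 is the soft limit and field 5 the hard limit.
    soft=$(awk '/^Max open files/ {print $4}' "/proc/$pid/limits")
    echo "pid=$pid open=$open soft_limit=$soft"
}

# Example, using the slapd PID from this article (run as root):
#   fd_usage 11874
```

When the open count sits at or just below the soft limit, new accept() calls fail with errno 24, which matches the slapd log above.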

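Clearly the process has hit the Max open files soft limit of 1024, so the number of files OpenLDAP may open has to be raised. Adding LDAP_NOFILE=5000 to the environment file (the EnvironmentFile referenced in /usr/lib/systemd/system/slapd.service) had no effect. Because systemd also sets the file limit for the services it starts, the limit has to be configured in the unit itself: add the following under [Service] in /usr/lib/systemd/system/slapd.service: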

LimitNOFILE=5000

Then run:

systemctl daemon-reload
systemctl restart slapd
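
Editing the unit file under /usr/lib/systemd/system works, but a package update can replace that file. A drop-in under /etc/systemd/system/slapd.service.d/ is an alternative that survives updates. The following is only a sketch: write_nofile_dropin, the limits.conf file name, and the DROPIN_ROOT variable (used for dry runs outside /etc) are assumptions made for illustration.

```shell
#!/usr/bin/env bash
# write_nofile_dropin is a hypothetical helper that writes a systemd drop-in
# setting LimitNOFILE for a service, leaving the packaged unit file untouched.
# DROPIN_ROOT is an illustrative override; systemd itself reads /etc/systemd/system.
write_nofile_dropin() {
    local unit="$1" limit="$2"
    local dir="${DROPIN_ROOT:-/etc/systemd/system}/${unit}.service.d"
    mkdir -p "$dir"
    # A drop-in only needs the section header and the directives being overridden.
    printf '[Service]\nLimitNOFILE=%s\n' "$limit" > "$dir/limits.conf"
    echo "$dir/limits.conf"
}

# Example (as root), followed by the same reload and restart as above:
#   write_nofile_dropin slapd 5000
#   systemctl daemon-reload
#   systemctl restart slapd
```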

Checking the limits of the OpenLDAP process again shows that the new setting has taken effect, and kadmin.local runs normally again:

[root@master_10.0.0.58 /data/emr/openldap/logs]$ cat /proc/557799/limits 
Limit                     Soft Limit           Hard Limit           Units     
Max cpu time              unlimited            unlimited            seconds   
Max file size             unlimited            unlimited            bytes     
Max data size             unlimited            unlimited            bytes     
Max stack size            8388608              unlimited            bytes     
Max core file size        0                    unlimited            bytes     
Max resident set          unlimited            unlimited            bytes     
Max processes             127308               127308               processes 
Max open files            5000                 5000                 files     
Max locked memory         65536                65536                bytes     
Max address space         unlimited            unlimited            bytes     
Max file locks            unlimited            unlimited            locks     
Max pending signals       127308               127308               signals   
Max msgqueue size         819200               819200               bytes     
Max nice priority         0                    0                    
Max realtime priority     0                    0                    
Max realtime timeout      unlimited            unlimited            us 