Prometheus Operator Study Notes (Part 2): The Service Endpoints Reset Problem

Document Notes

Test environment: Kubernetes v1.10.9
Network CNI: Flannel
Storage CSI: NFS dynamic StorageClass
DNS: CoreDNS

Background

After deploying Prometheus Operator, Prometheus began firing alerts for kube-scheduler and kube-controller-manager:
KubeSchedulerDown

alert: KubeSchedulerDown
expr: absent(up{job="kube-scheduler"}
  == 1)
for: 15m
labels:
  severity: critical
annotations:
  message: KubeScheduler has disappeared from Prometheus target discovery.
  runbook_url: https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubeschedulerdown

KubeControllerManagerDown

alert: KubeControllerManagerDown
expr: absent(up{job="kube-controller-manager"}
  == 1)
for: 15m
labels:
  severity: critical
annotations:
  message: KubeControllerManager has disappeared from Prometheus target discovery.
  runbook_url: https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubecontrollermanagerdown

Investigation showed that the alerts were caused by the Endpoints of the kube-scheduler and kube-controller-manager Services being reset to none.
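For reference, the Endpoints objects in question are created by hand (they are the "apply -f endpoints" manifests mentioned below), because kube-scheduler and kube-controller-manager do not run as Pods in this cluster. The manifest below is a minimal sketch of such an out-of-band Endpoints object for kube-scheduler; the namespace, labels, port and node IP are assumptions for illustration, not the exact file used here.

# Sketch of a hand-maintained Endpoints object for kube-scheduler.
# The namespace, labels, port and node IP are illustrative assumptions.
apiVersion: v1
kind: Endpoints
metadata:
  name: kube-scheduler
  namespace: kube-system
  labels:
    k8s-app: kube-scheduler
subsets:
- addresses:
  - ip: 192.168.1.10          # master node IP where kube-scheduler listens (assumed)
  ports:
  - name: http-metrics
    port: 10251               # kube-scheduler metrics port (assumed)

When this object is "reset to none", its subsets become empty, the kube-scheduler job disappears from Prometheus target discovery, and the absent(up{job="kube-scheduler"} == 1) expression above starts firing.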

What followed was a long journey of root-cause hunting that kept me stuck for a week:

  • The startup parameters of kube-scheduler and kube-controller-manager
  • Source code analysis of the Kubernetes Endpoints controller
  • The official definitions of Service and Endpoints in the documentation

kube-controller-manager Startup Parameters

Following the problem analysis in a post on the XuXinkun Blog, I checked the Kubernetes logs under /var/log/syslog on the cluster and found the same NotReady output. kube-controller-manager marks a node unhealthy when its heartbeat has not been reported for 40 seconds (the default), so an occasional heartbeat timeout seemed like a plausible cause. Convinced I had found the root cause, I immediately followed that post's fix and adjusted the kube-controller-manager.service startup parameter to --node-monitor-grace-period=60s.

--node-monitor-grace-period duration     Default: 40s
Amount of time which we allow running Node to be unresponsive before marking it unhealthy. Must be N times more than kubelet's nodeStatusUpdateFrequency, where N means number of retries allowed for kubelet to post node status.
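In this cluster the component runs as a systemd service, so the flag simply went onto the ExecStart line of its unit file. Purely as a sketch, on clusters that run kube-controller-manager as a static Pod the same change would look roughly like the fragment below; the file path, image and surrounding fields are assumptions, only the flag itself comes from the documentation above.

# Hypothetical static-Pod manifest fragment,
# e.g. /etc/kubernetes/manifests/kube-controller-manager.yaml (path assumed).
apiVersion: v1
kind: Pod
metadata:
  name: kube-controller-manager
  namespace: kube-system
spec:
  containers:
  - name: kube-controller-manager
    image: k8s.gcr.io/kube-controller-manager-amd64:v1.10.9   # assumed image
    command:
    - kube-controller-manager
    - --node-monitor-grace-period=60s   # raised from the 40s default
    # ...all other existing flags stay unchanged...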

After re-applying the Endpoints manifest (apply -f) and observing for a while, the Endpoints were still being reset to none. The problem persisted!

At least I now had a plausible cause, so I kept digging. Going back through the Kubernetes logs under /var/log/syslog and filtering out every useful entry, I noticed that the reported Timeout durations usually stayed in the range of 7-9 minutes, so I simply raised the flag to --node-monitor-grace-period=600s, which should rule out node heartbeat timeouts entirely.

After re-applying the Endpoints manifest and observing again, the Endpoints were still being reset to none. The problem persisted, and I was starting to question everything.

Meanwhile I kept googling and came across the post "修复 Service Endpoint 更新的延迟" ("Fixing Service Endpoint update latency"), which gave me a new lead: it described what looked like an Endpoints update latency problem. It also helped me understand how Endpoints updates work, and I raised the cluster's kube-controller-manager startup parameters --kube-api-qps and --kube-api-burst to --kube-api-qps=300, --kube-api-burst=325, plus --concurrent-endpoint-syncs=30.

--concurrent-endpoint-syncs int32     Default: 5
The number of endpoint syncing operations that will be done concurrently. Larger number = faster endpoint updating, but more CPU (and network) load
--kube-api-qps float32     Default: 20
QPS to use while talking with kubernetes apiserver.
--kube-api-burst int32     Default: 30
Burst to use while talking with kubernetes apiserver.
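Again only as a sketch (same hypothetical manifest fragment as above; on a systemd-managed binary the flags would be appended to ExecStart instead), the three values chosen here would be passed like this:

    command:
    - kube-controller-manager
    - --kube-api-qps=300                # default 20
    - --kube-api-burst=325              # default 30
    - --concurrent-endpoint-syncs=30    # default 5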

After re-applying the Endpoints manifest once more, the Endpoints were still being reset to none. The problem persisted, and I had lost the trail again.

Kubernetes Endpoints Controller Source Code Analysis

The core logic of endpoints_controller.go is syncService:

func (e *EndpointController) syncService(key string) error {
    startTime := time.Now()
    defer func() {
        klog.V(4).Infof("Finished syncing service %q endpoints. (%v)", key, time.Since(startTime))
    }()

    namespace, name, err := cache.SplitMetaNamespaceKey(key)
    if err != nil {
        return err
    }
    service, err := e.serviceLister.Services(namespace).Get(name)
    if err != nil {
        // Delete the corresponding endpoint, as the service has been deleted.
        // TODO: Please note that this will delete an endpoint when a
        // service is deleted. However, if we're down at the time when
        // the service is deleted, we will miss that deletion, so this
        // doesn't completely solve the problem. See #6877.
        err = e.client.CoreV1().Endpoints(namespace).Delete(name, nil)
        if err != nil && !errors.IsNotFound(err) {
            return err
        }
        return nil
    }

    if service.Spec.Selector == nil {
        // services without a selector receive no endpoints from this controller;
        // these services will receive the endpoints that are created out-of-band via the REST API.
        return nil
    }

    klog.V(5).Infof("About to update endpoints for service %q", key)
    pods, err := e.podLister.Pods(service.Namespace).List(labels.Set(service.Spec.Selector).AsSelectorPreValidated())
    if err != nil {
        // Since we're getting stuff from a local cache, it is
        // basically impossible to get this error.
        return err
    }

    // If the user specified the older (deprecated) annotation, we have to respect it.
    tolerateUnreadyEndpoints := service.Spec.PublishNotReadyAddresses
    if v, ok := service.Annotations[TolerateUnreadyEndpointsAnnotation]; ok {
        b, err := strconv.ParseBool(v)
        if err == nil {
            tolerateUnreadyEndpoints = b
        } else {
            utilruntime.HandleError(fmt.Errorf("Failed to parse annotation %v: %v", TolerateUnreadyEndpointsAnnotation, err))
        }
    }

    subsets := []v1.EndpointSubset{}
    var totalReadyEps int = 0
    var totalNotReadyEps int = 0

    for _, pod := range pods {
        if len(pod.Status.PodIP) == 0 {
            klog.V(5).Infof("Failed to find an IP for pod %s/%s", pod.Namespace, pod.Name)
            continue
        }
        if !tolerateUnreadyEndpoints && pod.DeletionTimestamp != nil {
            klog.V(5).Infof("Pod is being deleted %s/%s", pod.Namespace, pod.Name)
            continue
        }

        epa := *podToEndpointAddress(pod)

        hostname := pod.Spec.Hostname
        if len(hostname) > 0 && pod.Spec.Subdomain == service.Name && service.Namespace == pod.Namespace {
            epa.Hostname = hostname
        }

        // Allow headless service not to have ports.
        if len(service.Spec.Ports) == 0 {
            if service.Spec.ClusterIP == api.ClusterIPNone {
                subsets, totalReadyEps, totalNotReadyEps = addEndpointSubset(subsets, pod, epa, nil, tolerateUnreadyEndpoints)
                // No need to repack subsets for headless service without ports.
            }
        } else {
            for i := range service.Spec.Ports {
                servicePort := &service.Spec.Ports[i]

                portName := servicePort.Name
                portProto := servicePort.Protocol
                portNum, err := podutil.FindPort(pod, servicePort)
                if err != nil {
                    klog.V(4).Infof("Failed to find port for service %s/%s: %v", service.Namespace, service.Name, err)
                    continue
                }

                var readyEps, notReadyEps int
                epp := &v1.EndpointPort{Name: portName, Port: int32(portNum), Protocol: portProto}
                subsets, readyEps, notReadyEps = addEndpointSubset(subsets, pod, epa, epp, tolerateUnreadyEndpoints)
                totalReadyEps = totalReadyEps + readyEps
                totalNotReadyEps = totalNotReadyEps + notReadyEps
            }
        }
    }
    subsets = endpoints.RepackSubsets(subsets)

    // See if there's actually an update here.
    currentEndpoints, err := e.endpointsLister.Endpoints(service.Namespace).Get(service.Name)
    if err != nil {
        if errors.IsNotFound(err) {
            currentEndpoints = &v1.Endpoints{
                ObjectMeta: metav1.ObjectMeta{
                    Name:   service.Name,
                    Labels: service.Labels,
                },
            }
        } else {
            return err
        }
    }

    createEndpoints := len(currentEndpoints.ResourceVersion) == 0

    if !createEndpoints &&
        apiequality.Semantic.DeepEqual(currentEndpoints.Subsets, subsets) &&
        apiequality.Semantic.DeepEqual(currentEndpoints.Labels, service.Labels) {
        klog.V(5).Infof("endpoints are equal for %s/%s, skipping update", service.Namespace, service.Name)
        return nil
    }
    newEndpoints := currentEndpoints.DeepCopy()
    newEndpoints.Subsets = subsets
    newEndpoints.Labels = service.Labels
    if newEndpoints.Annotations == nil {
        newEndpoints.Annotations = make(map[string]string)
    }

    klog.V(4).Infof("Update endpoints for %v/%v, ready: %d not ready: %d", service.Namespace, service.Name, totalReadyEps, totalNotReadyEps)
    if createEndpoints {
        // No previous endpoints, create them
        _, err = e.client.CoreV1().Endpoints(service.Namespace).Create(newEndpoints)
    } else {
        // Pre-existing
        _, err = e.client.CoreV1().Endpoints(service.Namespace).Update(newEndpoints)
    }
    if err != nil {
        if createEndpoints && errors.IsForbidden(err) {
            // A request is forbidden primarily for two reasons:
            // 1. namespace is terminating, endpoint creation is not allowed by default.
            // 2. policy is misconfigured, in which case no service would function anywhere.
            // Given the frequency of 1, we log at a lower level.
            klog.V(5).Infof("Forbidden from creating endpoints: %v", err)
        }
        return err
    }
    return nil
}

The Service Add/Update/Delete event handlers simply enqueue the Service key, and a worker later processes it in syncService. The whole syncService logic is built on matching Pods against the Service's label selector and packing the matched Pods into the Endpoints subsets, so Services without a label selector are skipped up front. As described in the previous post, the binary components monitored here (kube-scheduler, kube-controller-manager, and later the etcd cluster) are not deployed as Pods inside the cluster:

    if service.Spec.Selector == nil {
        // services without a selector receive no endpoints from this controller;
        // these services will receive the endpoints that are created out-of-band via the REST API.
        return nil
    }

That comment suddenly rang a bell. I immediately checked my Service YAML files, and the root cause was finally in sight:
because these components run in the cluster as plain binaries rather than Pods, their Service manifests must not define a selector to match Pod labels.

To be sure, I went back to the official Service documentation:

(Screenshot of the official Kubernetes documentation on Services without selectors.)

So the ServiceMonitor -> Service -> Endpoints (Pod) service-discovery chain, which is tied together by label selectors, had to be adjusted: the selector field is removed from every Service that fronts a non-Pod component, and its Endpoints object is maintained out-of-band, as sketched below.
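The sketch below shows what the adjusted Service might look like (names, namespace and port are assumptions consistent with the Endpoints sketch earlier): it keeps its labels and ports so the ServiceMonitor can still select it, but defines no spec.selector, so syncService returns early and never overwrites the hand-maintained Endpoints object of the same name.

# Sketch of the adjusted Service for kube-scheduler: no spec.selector,
# so the Endpoints controller skips it and leaves the out-of-band
# Endpoints object (same name, same namespace) untouched.
apiVersion: v1
kind: Service
metadata:
  name: kube-scheduler
  namespace: kube-system
  labels:
    k8s-app: kube-scheduler        # matched by the ServiceMonitor's selector
spec:
  clusterIP: None
  ports:
  - name: http-metrics
    port: 10251
    targetPort: 10251

Prometheus then discovers the scrape target through the ServiceMonitor -> Service -> Endpoints chain, with the endpoint addresses supplied by the hand-written Endpoints manifest rather than by the controller.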

After re-applying the adjusted manifests and observing, the Endpoints stayed intact. Problem solved!
