I'm LEE (Lao Li), a tech veteran who has been knocking around the IT industry for 16 years.
Background
Our Knative application management and release platform recently went live. With the tooling platform in place, monitoring and alerting is the next critical piece; once that exists, application-level alerting follows naturally.
Working through the monitoring and alerting documentation for the official Knative Serving module, I found the official solution extremely cumbersome. Perhaps our starting points simply differ: the community leans toward building a brand-new monitoring system, but with Kubernetes as widespread as it is today, is there really a cluster out there that doesn't already run Prometheus/Thanos? I was convinced there had to be a simpler way to solve the monitoring problem rather than solving it with a complicated one.
As a side note, the Grafana dashboards that Knative provides are also not very usable; they don't really match real-world needs.
Prerequisites
Tip: we replaced Thanos with VictoriaMetrics, because Thanos simply did not hold up under our query and write volumes, so VictoriaMetrics is what we ended up with.
These are the versions on our platform:
- Kubernetes: 1.23
- Istio: 1.13
- Knative: 1.5
- Grafana: 8.3.3
- VictoriaMetrics: 1.79
Implementation
Since we have decided to monitor the Knative Serving control plane our own way, the official Knative documentation offers little to go on here.
Monitoring the control plane
A basic Knative Serving installation consists of the following components (as listed by kubectl get pods -n knative-serving):
NAME READY STATUS RESTARTS AGE
activator-58b96bdb7d-nf6hf 1/1 Running 0 30d
autoscaler-75c4975cd8-bg2nt 1/1 Running 0 30d
controller-66475c8469-d5w2h 1/1 Running 0 30d
domain-mapping-68768c5ddc-999ng 1/1 Running 0 30d
domainmapping-webhook-d4bbcb544-bjtfz 1/1 Running 0 30d
net-istio-controller-689d984c59-4vtdx 1/1 Running 0 27d
net-istio-webhook-74f9465d86-jtj72 1/1 Running 0 27d
webhook-996d56c7-ms6js 1/1 Running 0 30d
We can then tailor a suitable metrics scraping scheme for these components. Before scraping anything, though, let's take a quick look at the configuration inside each Deployment.
Taking activator as the example:
apiVersion: apps/v1
kind: Deployment
metadata:
annotations:
deployment.kubernetes.io/revision: "1"
labels:
app.kubernetes.io/component: activator
app.kubernetes.io/name: knative-serving
app.kubernetes.io/version: 1.5.0
name: activator
namespace: knative-serving
spec:
progressDeadlineSeconds: 600
replicas: 1
revisionHistoryLimit: 10
selector:
matchLabels:
app: activator
role: activator
strategy:
rollingUpdate:
maxSurge: 25%
maxUnavailable: 25%
type: RollingUpdate
template:
metadata:
annotations:
cluster-autoscaler.kubernetes.io/safe-to-evict: "false"
creationTimestamp: null
labels:
app: activator
app.kubernetes.io/component: activator
app.kubernetes.io/name: knative-serving
app.kubernetes.io/version: 1.5.0
role: activator
spec:
containers:
- env:
- name: GOGC
value: "500"
- name: POD_NAME
valueFrom:
fieldRef:
apiVersion: v1
fieldPath: metadata.name
- name: POD_IP
valueFrom:
fieldRef:
apiVersion: v1
fieldPath: status.podIP
- name: SYSTEM_NAMESPACE
valueFrom:
fieldRef:
apiVersion: v1
fieldPath: metadata.namespace
- name: CONFIG_LOGGING_NAME
value: config-logging
- name: CONFIG_OBSERVABILITY_NAME
value: config-observability
- name: METRICS_DOMAIN
value: knative.dev/internal/serving
image: knative-serving/activator:1.5.0
imagePullPolicy: IfNotPresent
livenessProbe:
failureThreshold: 12
httpGet:
httpHeaders:
- name: k-kubelet-probe
value: activator
path: /
port: 8012
scheme: HTTP
initialDelaySeconds: 15
periodSeconds: 10
successThreshold: 1
timeoutSeconds: 1
name: activator
ports:
- containerPort: 9090
name: metrics ## here it is: port 9090 serves the metrics endpoint
protocol: TCP
- containerPort: 8008
name: profiling
protocol: TCP
- containerPort: 8012
name: http1
protocol: TCP
- containerPort: 8013
name: h2c
protocol: TCP
readinessProbe:
failureThreshold: 5
httpGet:
httpHeaders:
- name: k-kubelet-probe
value: activator
path: /
port: 8012
scheme: HTTP
periodSeconds: 5
successThreshold: 1
timeoutSeconds: 1
resources:
limits:
cpu: "1"
memory: 600Mi
requests:
cpu: 300m
memory: 60Mi
securityContext:
allowPrivilegeEscalation: false
capabilities:
drop:
- all
readOnlyRootFilesystem: true
runAsNonRoot: true
terminationMessagePath: /dev/termination-log
terminationMessagePolicy: File
dnsPolicy: ClusterFirst
restartPolicy: Always
schedulerName: default-scheduler
securityContext: {}
serviceAccount: controller
serviceAccountName: controller
terminationGracePeriodSeconds: 600
Reading the activator Deployment shows that port 9090 (named metrics; we will use that name later) is where the component exposes its metrics. We then went through the metrics endpoints of the other Knative Serving components and put together the following table:
Component | Port | Port name | Description |
---|---|---|---|
activator | 9090 | metrics | Request buffer and a key Knative traffic-forwarding component; it holds HTTP requests while an application scales from 0->1 or back from 1->0. |
autoscaler | 9090 | metrics | Scaling controller, the key component that drives application Pod replica counts; it decides how many Pods to run based on data reported by queue-proxy and the activator. |
controller | 9090 | metrics | The Knative controller; it reconciles all public Knative objects and the autoscaling CRDs. When a user applies a Knative Service to the Kubernetes API, it creates the Configuration and Route. |
webhook | 9090 | metrics | Admission webhook, the key link between the Knative control plane and Kubernetes; it intercepts Kubernetes API calls and all CRD inserts and updates, sets defaults, rejects inconsistent or invalid objects, and validates and mutates Kubernetes API calls. |
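Before writing the scrape jobs, it is worth spot-checking that one of these ports really serves Prometheus-format metrics. A minimal check from a workstation with cluster access, using the illustrative activator pod name from the listing above (yours will differ):

# In one terminal: forward the activator's metrics port to localhost (pod name is illustrative)
kubectl -n knative-serving port-forward pod/activator-58b96bdb7d-nf6hf 9090:9090

# In another terminal: the endpoint should answer with Prometheus exposition-format metrics
curl -s http://127.0.0.1:9090/metrics | head -n 20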
Of the components listed earlier, these four modules are the ones with a real, material impact on the workloads, so they are the ones we scrape. That makes the scrape jobs easy to write; here is an example for the VictoriaMetrics stack:
apiVersion: operator.victoriametrics.com/v1beta1
kind: VMPodScrape
metadata:
  name: controller-monitor
  namespace: knative-serving
spec:
  namespaceSelector:
    matchNames:
    - knative-serving
  podMetricsEndpoints:
  - path: /metrics
    scheme: http
    targetPort: metrics # the name of the 9090 listening port mentioned above
  selector:
    matchLabels:
      app: controller
---
apiVersion: operator.victoriametrics.com/v1beta1
kind: VMPodScrape
metadata:
  name: autoscaler-monitor
  namespace: knative-serving
spec:
  namespaceSelector:
    matchNames:
    - knative-serving
  podMetricsEndpoints:
  - path: /metrics
    scheme: http
    targetPort: metrics # the name of the 9090 listening port mentioned above
  selector:
    matchLabels:
      app: autoscaler
---
apiVersion: operator.victoriametrics.com/v1beta1
kind: VMPodScrape
metadata:
  name: activator-monitor
  namespace: knative-serving
spec:
  namespaceSelector:
    matchNames:
    - knative-serving
  podMetricsEndpoints:
  - path: /metrics
    scheme: http
    targetPort: metrics # the name of the 9090 listening port mentioned above
  selector:
    matchLabels:
      app: activator
---
apiVersion: operator.victoriametrics.com/v1beta1
kind: VMPodScrape
metadata:
  name: webhook-monitor
  namespace: knative-serving
spec:
  namespaceSelector:
    matchNames:
    - knative-serving
  podMetricsEndpoints:
  - path: /metrics
    scheme: http
    targetPort: metrics # the name of the 9090 listening port mentioned above
  selector:
    matchLabels:
      app: webhook
With these four PodScrape jobs in place, the control-plane Pods' metrics are collected into VictoriaMetrics automatically, ready for Grafana dashboards later.
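Since alerting is the end goal, the same operator can also carry the alert rules. Below is a minimal sketch of a VMRule that fires when a control-plane target stops being scraped; the label set on up (namespace and pod here) depends on how vmagent relabels your targets, so adjust it to whatever your series actually carry:

apiVersion: operator.victoriametrics.com/v1beta1
kind: VMRule
metadata:
  name: knative-control-plane-alerts
  namespace: knative-serving
spec:
  groups:
  - name: knative-control-plane
    rules:
    - alert: KnativeControlPlaneTargetDown
      # "up" is generated per scrape target; label names depend on your relabeling
      expr: up{namespace="knative-serving"} == 0
      for: 5m
      labels:
        severity: critical
      annotations:
        summary: "Knative control-plane target {{ $labels.pod }} has not been scraped for 5 minutes"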
Monitoring deployed applications
Following the same recipe, before scraping a deployed application's metrics we again take a quick look at the configuration, this time the application Pod itself. Taking test-app-18 as the example:
apiVersion: v1
kind: Pod
metadata:
annotations:
autoscaling.knative.dev/class: kpa.autoscaling.knative.dev
autoscaling.knative.dev/initial-scale: "1"
autoscaling.knative.dev/max-scale: "6"
autoscaling.knative.dev/metric: rps
autoscaling.knative.dev/min-scale: "1"
autoscaling.knative.dev/target: "60"
kubernetes.io/limit-ranger: "LimitRanger plugin set: ephemeral-storage request
for container app; ephemeral-storage limit for container app; ephemeral-storage
request for container queue-proxy; ephemeral-storage limit for container queue-proxy"
serving.knative.dev/creator: system:serviceaccount:default:oms-admin
creationTimestamp: "2022-08-04T07:05:03Z"
generateName: test-app-18-ac403-deployment-988b7b66f-
labels:
k_type: knative # important: this label is how we tell a Knative application Pod apart from an ordinary Pod
app: test-app-18
app_id: test-app-18
pod-template-hash: 988b7b66f
service.istio.io/canonical-name: test-app-18
service.istio.io/canonical-revision: test-app-18-ac403
serving.knative.dev/configuration: test-app-18
serving.knative.dev/configurationGeneration: "4"
serving.knative.dev/configurationUID: d896cd40-ce9c-4027-9229-4af9f2aa5630
serving.knative.dev/revision: test-app-18-ac403
serving.knative.dev/revisionUID: 1b3dc38f-5aed-4252-a07b-aefc32f7f9f9
serving.knative.dev/service: test-app-18
serving.knative.dev/serviceUID: 0af741d0-a74f-44dd-ab6e-458a5d3743a2
name: test-app-18-ac403-deployment-988b7b66f-tlw27
namespace: knative-apps
ownerReferences:
- apiVersion: apps/v1
blockOwnerDeletion: true
controller: true
kind: ReplicaSet
name: test-app-18-ac403-deployment-988b7b66f
uid: 429f5cc4-20e6-4f85-a985-4da1de578844
resourceVersion: "755594194"
uid: c632adba-5c66-4c0b-ac31-67c8c231b591
spec:
containers:
- env:
- name: PORT
value: "8080"
- name: K_REVISION
value: test-app-18-ac403
- name: K_CONFIGURATION
value: test-app-18
- name: K_SERVICE
value: test-app-18
image: knative-apps/fn_test-app-18_qa@sha256:e86ed5117e91b4d11f9e169526d734981deb31c99744d65cb6a6debf9262d97f
imagePullPolicy: IfNotPresent
lifecycle:
preStop:
httpGet:
path: /wait-for-drain
port: 8022
scheme: HTTP
livenessProbe:
failureThreshold: 3
httpGet:
httpHeaders:
- name: K-Kubelet-Probe
value: queue
path: /ping
port: 8080
scheme: HTTP
periodSeconds: 10
successThreshold: 1
timeoutSeconds: 1
name: app
ports:
- containerPort: 8080
name: user-port
protocol: TCP
resources:
limits:
cpu: "2"
ephemeral-storage: 7Gi
memory: 4Gi
requests:
cpu: 200m
ephemeral-storage: 256Mi
memory: 409Mi
terminationMessagePath: /dev/termination-log
terminationMessagePolicy: FallbackToLogsOnError
volumeMounts:
- mountPath: /var/run/secrets/kubernetes.io/serviceaccount
name: kube-api-access-jp8zk
readOnly: true
- env:
- name: SERVING_NAMESPACE
value: knative-apps
- name: SERVING_SERVICE
value: test-app-18
- name: SERVING_CONFIGURATION
value: test-app-18
- name: SERVING_REVISION
value: test-app-18-ac403
- name: QUEUE_SERVING_PORT
value: "8012"
- name: QUEUE_SERVING_TLS_PORT
value: "8112"
- name: CONTAINER_CONCURRENCY
value: "0"
- name: REVISION_TIMEOUT_SECONDS
value: "10"
- name: SERVING_POD
valueFrom:
fieldRef:
apiVersion: v1
fieldPath: metadata.name
- name: SERVING_POD_IP
valueFrom:
fieldRef:
apiVersion: v1
fieldPath: status.podIP
- name: SERVING_LOGGING_CONFIG
- name: SERVING_LOGGING_LEVEL
- name: SERVING_REQUEST_LOG_TEMPLATE
value: '{"httpRequest": {"requestMethod": "{{.Request.Method}}", "requestUrl":
"{{js .Request.RequestURI}}", "requestSize": "{{.Request.ContentLength}}",
"status": {{.Response.Code}}, "responseSize": "{{.Response.Size}}", "userAgent":
"{{js .Request.UserAgent}}", "remoteIp": "{{js .Request.RemoteAddr}}", "serverIp":
"{{.Revision.PodIP}}", "referer": "{{js .Request.Referer}}", "latency": "{{.Response.Latency}}s",
"protocol": "{{.Request.Proto}}"}, "traceId": "{{index .Request.Header "X-B3-Traceid"}}"}'
- name: SERVING_ENABLE_REQUEST_LOG
value: "false"
- name: SERVING_REQUEST_METRICS_BACKEND
value: prometheus
- name: TRACING_CONFIG_BACKEND
value: none
- name: TRACING_CONFIG_ZIPKIN_ENDPOINT
- name: TRACING_CONFIG_DEBUG
value: "false"
- name: TRACING_CONFIG_SAMPLE_RATE
value: "0.1"
- name: USER_PORT
value: "8080"
- name: SYSTEM_NAMESPACE
value: knative-serving
- name: METRICS_DOMAIN
value: knative.dev/internal/serving
- name: SERVING_READINESS_PROBE
value: '{"httpGet":{"path":"/ping","port":8080,"host":"127.0.0.1","scheme":"HTTP","httpHeaders":[{"name":"K-Kubelet-Probe","value":"queue"}]},"successThreshold":1}'
- name: ENABLE_PROFILING
value: "false"
- name: SERVING_ENABLE_PROBE_REQUEST_LOG
value: "false"
- name: METRICS_COLLECTOR_ADDRESS
- name: CONCURRENCY_STATE_ENDPOINT
- name: CONCURRENCY_STATE_TOKEN_PATH
value: /var/run/secrets/tokens/state-token
- name: HOST_IP
valueFrom:
fieldRef:
apiVersion: v1
fieldPath: status.hostIP
- name: ENABLE_HTTP2_AUTO_DETECTION
value: "false"
image: knative-serving/queue:1.5.0
imagePullPolicy: IfNotPresent
name: queue-proxy
ports:
- containerPort: 8022
name: http-queueadm
protocol: TCP
- containerPort: 9090
name: http-autometric
protocol: TCP
- containerPort: 9091
name: http-usermetric # here it is: port 9091 serves the metrics endpoint; since all application traffic passes through the queue-proxy, this is the best place to measure it
protocol: TCP
- containerPort: 8012
name: queue-port
protocol: TCP
- containerPort: 8112
name: https-port
protocol: TCP
readinessProbe:
failureThreshold: 3
httpGet:
httpHeaders:
- name: K-Network-Probe
value: queue
path: /
port: 8012
scheme: HTTP
periodSeconds: 10
successThreshold: 1
timeoutSeconds: 1
resources:
limits:
ephemeral-storage: 7Gi
requests:
cpu: 25m
ephemeral-storage: 256Mi
securityContext:
allowPrivilegeEscalation: false
capabilities:
drop:
- all
readOnlyRootFilesystem: true
runAsNonRoot: true
terminationMessagePath: /dev/termination-log
terminationMessagePolicy: File
volumeMounts:
- mountPath: /var/run/secrets/kubernetes.io/serviceaccount
name: kube-api-access-jp8zk
readOnly: true
dnsPolicy: ClusterFirst
enableServiceLinks: false
imagePullSecrets:
- name: key.key
nodeName: 10.11.96.79
preemptionPolicy: PreemptLowerPriority
priority: 0
restartPolicy: Always
schedulerName: default-scheduler
securityContext: {}
serviceAccount: default
serviceAccountName: default
terminationGracePeriodSeconds: 10
tolerations:
- effect: NoExecute
key: node.kubernetes.io/not-ready
operator: Exists
tolerationSeconds: 120
- effect: NoExecute
key: node.kubernetes.io/unreachable
operator: Exists
tolerationSeconds: 120
volumes:
- name: kube-api-access-jp8zk
projected:
defaultMode: 420
sources:
- serviceAccountToken:
expirationSeconds: 3607
path: token
- configMap:
items:
- key: ca.crt
path: ca.crt
name: kube-root-ca.crt
- downwardAPI:
items:
- fieldRef:
apiVersion: v1
fieldPath: metadata.namespace
path: namespace
Reading the test-app-18 Pod spec shows that port 9091 (named http-usermetric; we will use that name shortly) is where the application metrics are exposed. Along the same lines I built a generic scrape job that can capture the traffic metrics of Knative application Pods in any namespace (the challenge being that application namespaces are not fixed, so the job has to cover them all).
apiVersion: operator.victoriametrics.com/v1beta1
kind: VMPodScrape
metadata:
  name: custom-apps-monitor
  namespace: knative-serving
spec:
  namespaceSelector:
    any: true # match Pods in any namespace
  podMetricsEndpoints:
  - path: /metrics
    scheme: http
    targetPort: http-usermetric
  selector:
    matchLabels:
      k_type: knative # the label that marks a Pod as a Knative application
With this generic PodScrape job, application Pod metrics are collected into VictoriaMetrics automatically, again ready for Grafana dashboards.
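If application metrics ever fail to show up, a quick way to confirm that queue-proxy is actually exposing them is to hit port 9091 directly. A sketch, again using the illustrative pod name from the listing above:

# In one terminal: forward queue-proxy's user-metrics port of one application Pod (pod name is illustrative)
kubectl -n knative-apps port-forward pod/test-app-18-ac403-deployment-988b7b66f-tlw27 9091:9091

# In another terminal: look for the per-request counters
# (named revision_request_count / revision_app_request_count in our Knative version)
curl -s http://127.0.0.1:9091/metrics | grep -i request | head -n 20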
Final result
After wiring the data into Grafana, I skipped the Knative community dashboard templates as well, since many of their panels turned out not to be that useful. In the end I built a custom dashboard with panels that are actually meaningful for us.
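For reference, the custom panels boil down to a handful of MetricsQL/PromQL queries over the queue-proxy metrics. The metric and label names below (revision_app_request_count, revision_request_latencies_bucket, namespace_name, revision_name, response_code_class) are what our Knative version exposes; treat them as assumptions and check them against your own /metrics output before copying:

# Request rate per revision (requests per second over the last 5 minutes)
sum(rate(revision_app_request_count{namespace_name="knative-apps"}[5m])) by (revision_name)

# Share of 5xx responses per revision
sum(rate(revision_app_request_count{response_code_class="5xx"}[5m])) by (revision_name)
  / sum(rate(revision_app_request_count[5m])) by (revision_name)

# Approximate p95 request latency per revision
histogram_quantile(0.95, sum(rate(revision_request_latencies_bucket[5m])) by (revision_name, le))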