Overview
The metrics collected here include those exposed by the Cilium Operator, the Cilium agent itself, and Hubble.
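Installation
Cilium is deployed with Helm. Many values were changed to suit the Staging cluster environment, mainly taint tolerations, the NodeSelector used for scheduling, and the SecurityContext. The metrics-related parameters also need to be enabled. The differences between the modified values file and the default one are shown below.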
# diff values.yaml vip-proxy-metrics.yaml
150c150
< useDigest: true
---
> useDigest: false
163a164
> kubernetes.io/hostname: 10.189.212.125
168,172c169,172
< - operator: Exists
< # - key: "key"
< # operator: "Equal|Exists"
< # value: "value"
< # effect: "NoSchedule|PreferNoSchedule|NoExecute(1.6 only)"
---
> - operator: Equal
> key: "key"
> value: "cilium"
> effect: "NoExecute"
188c188,190
< extraEnv: []
---
> extraEnv:
> - name: KUBERNETES_SERVICE_HOST
> value: hh-k8s-noah-sc-staging001-master.api.vip.com
215c217
< podSecurityContext:
---
> podSecurityContext: {}
237c239
< privileged: false
---
> privileged: true
538c540
< chainingMode: none
---
> chainingMode: generic-veth
559c561
< customConf: false
---
> customConf: true
577c579
< # configMap: cni-configuration
---
> configMap: cni-configuration
940c942
< useDigest: true
---
> useDigest: false
952c954,958
< tolerations: []
---
> tolerations:
> - operator: Equal
> key: "key"
> value: "cilium"
> effect: "NoExecute"
982,988c988,994
< # enabled:
< # - dns:query;ignoreAAAA
< # - drop
< # - tcp
< # - flow
< # - icmp
< # - http
---
> enabled:
> - dns:query;ignoreAAAA
> - drop
> - tcp
> - flow
> - icmp
> - http
994d999
< enabled: ~
1059c1064
< enabled: true
---
> enabled: false
1105c1110
< enabled: false
---
> enabled: true
1117c1122
< useDigest: true
---
> useDigest: false
1144a1150
> kubernetes.io/hostname: 10.189.212.125
1148c1154,1158
< tolerations: []
---
> tolerations:
> - operator: Equal
> key: "key"
> value: "cilium"
> effect: "NoExecute"
1151c1161,1163
< extraEnv: []
---
> extraEnv:
> - name: KUBERNETES_SERVICE_HOST
> value: hh-k8s-noah-sc-staging001-master.api.vip.com
1257c1269
< enabled: false
---
> enabled: true
1293c1305
< enabled: false
---
> enabled: true
1338c1350
< useDigest: true
---
> useDigest: false
1342c1354,1356
< securityContext: {}
---
> securityContext:
> privileged: true
>
1345c1359,1361
< extraEnv: []
---
> extraEnv:
> - name: KUBERNETES_SERVICE_HOST
> value: hh-k8s-noah-sc-staging001-master.api.vip.com
1369c1385
< useDigest: true
---
> useDigest: false
1373c1389,1391
< securityContext: {}
---
> securityContext:
> privileged: true
>
1376c1394,1396
< extraEnv: []
---
> extraEnv:
> - name: KUBERNETES_SERVICE_HOST
> value: hh-k8s-noah-sc-staging001-master.api.vip.com
1395c1415
< enabled: true
---
> enabled: false
1429a1450
> kubernetes.io/hostname: 10.189.212.125
1433c1454,1458
< tolerations: []
---
> tolerations:
> - operator: Equal
> key: "key"
> value: "cilium"
> effect: "NoExecute"
1585c1610
< # kubeProxyReplacement: "true"
---
> kubeProxyReplacement: "true"
1625c1650
< enableIPv4Masquerade: true
---
> enableIPv4Masquerade: false
1773c1798
< enabled: false
---
> enabled: true
1855c1880
< useDigest: true
---
> useDigest: false
1864c1889,1891
< extraEnv: []
---
> extraEnv:
> - name: KUBERNETES_SERVICE_HOST
> value: hh-k8s-noah-sc-staging001-master.api.vip.com
1970a1998
> kubernetes.io/hostname: 10.189.212.125
1975,1979c2003,2006
< - operator: Exists
< # - key: "key"
< # operator: "Equal|Exists"
< # value: "value"
< # effect: "NoSchedule|PreferNoSchedule|NoExecute(1.6 only)"
---
> - operator: Equal
> key: "key"
> value: "cilium"
> effect: "NoExecute"
2120c2147
< routingMode: ""
---
> routingMode: "native"
2146c2173
< useDigest: true
---
> useDigest: false
2164,2168c2191,2194
< - operator: Exists
< # - key: "key"
< # operator: "Equal|Exists"
< # value: "value"
< # effect: "NoSchedule|PreferNoSchedule|NoExecute(1.6 only)"
---
> - operator: Equal
> key: "key"
> value: "cilium"
> effect: "NoExecute"
2179a2206
> kubernetes.io/hostname: 10.189.212.125
2258c2285
< useDigest: true
---
> useDigest: false
2263c2290
< replicas: 2
---
> replicas: 1
2297a2325
> kubernetes.io/hostname: 10.189.212.125
2302,2306c2330,2333
< - operator: Exists
< # - key: "key"
< # operator: "Equal|Exists"
< # value: "value"
< # effect: "NoSchedule|PreferNoSchedule|NoExecute(1.6 only)"
---
> - operator: Equal
> key: "key"
> value: "cilium"
> effect: "NoExecute"
2312c2339,2341
< extraEnv: []
---
> extraEnv:
> - name: KUBERNETES_SERVICE_HOST
> value: hh-k8s-noah-sc-staging001-master.api.vip.com
2389c2418
< enabled: false
---
> enabled: true
2434c2463
< restart: true
---
> restart: false
2458c2487,2489
< extraEnv: []
---
> extraEnv:
> - name: KUBERNETES_SERVICE_HOST
> value: hh-k8s-noah-sc-staging001-master.api.vip.com
2472a2504
> kubernetes.io/hostname: 10.189.212.125
2477,2481c2509,2512
< - operator: Exists
< # - key: "key"
< # operator: "Equal|Exists"
< # value: "value"
< # effect: "NoSchedule|PreferNoSchedule|NoExecute(1.6 only)"
---
> - operator: Equal
> key: "key"
> value: "cilium"
> effect: "NoExecute"
2498c2529
< privileged: false
---
> privileged: true
2539c2570
< useDigest: true
---
> useDigest: false
2550c2581,2583
< extraEnv: []
---
> extraEnv:
> - name: KUBERNETES_SERVICE_HOST
> value: hh-k8s-noah-sc-staging001-master.api.vip.com
2570a2604
> kubernetes.io/hostname: 10.189.212.125
2586,2589c2620,2623
< # - key: "key"
< # operator: "Equal|Exists"
< # value: "value"
< # effect: "NoSchedule|PreferNoSchedule|NoExecute(1.6 only)"
---
> - operator: Equal
> key: "key"
> value: "cilium"
> effect: "NoExecute"
2689c2723
< useDigest: true
---
> useDigest: false
2699c2733
< useDigest: true
---
> useDigest: false
2736c2770
< useDigest: true
---
> useDigest: false
2743c2777,2779
< extraEnv: []
---
> extraEnv:
> - name: KUBERNETES_SERVICE_HOST
> value: hh-k8s-noah-sc-staging001-master.api.vip.com
2798c2834,2836
< extraEnv: []
---
> extraEnv:
> - name: KUBERNETES_SERVICE_HOST
> value: hh-k8s-noah-sc-staging001-master.api.vip.com
2864a2903
> kubernetes.io/hostname: 10.189.212.125
2868c2907,2911
< tolerations: []
---
> tolerations:
> - operator: Equal
> key: "key"
> value: "cilium"
> effect: "NoExecute"
3138c3181,3185
< tolerations: []
---
> tolerations:
> - operator: Equal
> key: "key"
> value: "cilium"
> effect: "NoExecute"
3167c3214,3218
< tolerations: []
---
> tolerations:
> - operator: Equal
> key: "key"
> value: "cilium"
> effect: "NoExecute"
Grafana and Prometheus also need to be deployed to verify that the metrics are actually being collected.
k apply -f https://raw.githubusercontent.com/cilium/cilium/HEAD/examples/kubernetes/addons/prometheus/monitoring-example.yaml
The resulting deployment looks like this.
# k get pods -n kube-system -o wide
NAME                               READY   STATUS    RESTARTS   AGE     IP               NODE
cilium-c987r                       1/1     Running   0          3h22m   10.189.212.125   10.189.212.125
cilium-operator-7df8cb69b8-2h4gm   1/1     Running   0          3h22m   10.189.212.125   10.189.212.125
hubble-ui-7b4bcf6bcf-d4fpb         2/2     Running   0          3h22m   10.189.82.106    10.189.212.125
# k get po -n cilium-monitoring -o wide
NAME                          READY   STATUS    RESTARTS   AGE    IP             NODE
grafana-7457fdc76-xhg8l       1/1     Running   0          3h7m   10.189.83.14   10.189.212.125
prometheus-547b7d9856-zl8lp   1/1     Running   0          3h7m   10.189.83.12   10.189.212.125
Viewing the Dashboard
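Metric Labels
Without suitable labels, a metric cannot express its meaning precisely. A large number of labels, however, increases storage requirements, so labels should be designed according to actual needs.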
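The storage trade-off can be made concrete with a rough estimate. The sketch below is only an illustration: it treats the number of time series for one metric as bounded by the product of the distinct values of each label. The label names are taken from the Hubble flow metric, but the counts are hypothetical and not measured from the Staging cluster.

```python
from math import prod


def estimate_series(label_cardinalities):
    """Upper bound on the time series of one metric.

    Each distinct combination of label values is stored as its own
    series, so the bound is the product of the per-label counts.
    """
    return prod(label_cardinalities.values())


# Hypothetical distinct-value counts, for illustration only.
base = {"protocol": 3, "type": 4, "verdict": 3}
# Adding a per-destination label multiplies the bound by the
# number of distinct destinations seen.
with_destination = {**base, "destination": 500}

print(estimate_series(base))
print(estimate_series(with_destination))
```

In practice the real series count is usually below this bound, because not every combination of label values occurs, but the multiplication shows why a high-cardinality label such as a destination IP deserves care.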
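In the default installation, Hubble metrics are configured mainly in the section below. Besides enabling metrics such as dns, drop, and tcp, attaching traffic context to a metric requires some extra options; see the Hubble Metrics documentation for details.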
hubble:
  enabled: true
  metrics:
    enabled:
    - dns:query;ignoreAAAA
    - drop
    - tcp
    - flow:destinationContext=dns|ip
    - icmp
    - http
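With the configuration above, flow:destinationContext=dns|ip adds destination context to the flow metric: the domain name is used when one is available, otherwise the IP. The resulting metrics look like this.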
hubble_flows_processed_total{destination="10.189.94.59",protocol="TCP",subtype="to-stack",type="Trace",verdict="FORWARDED"} 159
hubble_flows_processed_total{destination="10.190.135.235",protocol="TCP",subtype="to-stack",type="Trace",verdict="FORWARDED"} 159
hubble_flows_processed_total{destination="10.190.56.61",protocol="TCP",subtype="to-stack",type="Trace",verdict="FORWARDED"} 1
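To check what the extra label gives you, the samples above can be parsed and summed per destination. The following is a minimal sketch, assuming the lines are available as plain text (for example, copied from the Hubble metrics endpoint). The parser is deliberately simplified and only handles lines of the shape shown above; the function names are made up for this example.

```python
import re
from collections import Counter

# Simplified pattern for `name{label="value",...} value` lines.
SAMPLE_RE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)\{(?P<labels>.*)\}\s+(?P<value>\S+)$'
)
LABEL_RE = re.compile(r'(\w+)="((?:[^"\\]|\\.)*)"')


def parse_sample(line):
    """Split one sample line into (metric name, labels dict, float value)."""
    match = SAMPLE_RE.match(line.strip())
    if match is None:
        raise ValueError(f"unsupported sample line: {line!r}")
    labels = dict(LABEL_RE.findall(match.group("labels")))
    return match.group("name"), labels, float(match.group("value"))


def flows_by_destination(lines):
    """Sum hubble_flows_processed_total per value of the destination label."""
    totals = Counter()
    for line in lines:
        name, labels, value = parse_sample(line)
        if name == "hubble_flows_processed_total":
            totals[labels.get("destination", "")] += value
    return dict(totals)


# The three samples shown above.
SAMPLES = [
    'hubble_flows_processed_total{destination="10.189.94.59",protocol="TCP",subtype="to-stack",type="Trace",verdict="FORWARDED"} 159',
    'hubble_flows_processed_total{destination="10.190.135.235",protocol="TCP",subtype="to-stack",type="Trace",verdict="FORWARDED"} 159',
    'hubble_flows_processed_total{destination="10.190.56.61",protocol="TCP",subtype="to-stack",type="Trace",verdict="FORWARDED"} 1',
]

print(flows_by_destination(SAMPLES))
```

For anything beyond a quick check, querying Prometheus directly is likely the better route; this sketch only shows how the destination label separates the counters.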
Service Map
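The official Hubble Grafana plugin is a paid offering, so building a Service Map requires developing a converter plugin that transforms Hubble metrics into the format expected by the Node Graph plugin.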
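The core of such a converter might look like the sketch below. It rests on several assumptions: that each flow sample carries both a source and a destination (the metrics shown earlier only have `destination`, so a source context would also need to be configured), and that the Node Graph panel accepts a nodes list with `id` and `title` plus an edges list with `id`, `source`, `target`, and `mainstat`. Those field names reflect my understanding of the panel's data format and should be checked against the Grafana documentation. The function name and the input tuples are hypothetical.

```python
def to_node_graph(flows):
    """Convert (source, destination, count) tuples into node and edge rows.

    Counts of repeated (source, destination) pairs are summed so that
    each pair becomes exactly one edge.
    """
    node_ids = {}      # dict used as an insertion-ordered set
    edge_totals = {}   # (source, destination) -> summed count
    for source, destination, count in flows:
        node_ids[source] = None
        node_ids[destination] = None
        key = (source, destination)
        edge_totals[key] = edge_totals.get(key, 0) + count

    nodes = [{"id": node, "title": node} for node in node_ids]
    edges = [
        {"id": f"{src}->{dst}", "source": src, "target": dst, "mainstat": total}
        for (src, dst), total in edge_totals.items()
    ]
    return nodes, edges


# Hypothetical flows, not taken from the Staging cluster.
flows = [
    ("frontend", "backend", 10),
    ("frontend", "backend", 5),
    ("backend", "db", 3),
]
nodes, edges = to_node_graph(flows)
print(nodes)
print(edges)
```

A real plugin would still need to fetch the samples from Prometheus and return the two lists as data frames; this only sketches the reshaping step.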