Kubernetes: A Concise Summary

KubeCon: A Concise Summary

1, Basic Kubernetes usage

• Basic units

Data:

--> metaData:

Pod (containers and volumes):
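
A minimal sketch of a Pod that groups two containers around a shared volume. The Pod name, container names, images, and paths are illustrative assumptions, not taken from the original material.

```yaml
# Hypothetical example: two containers in one Pod sharing an emptyDir volume.
apiVersion: v1
kind: Pod
metadata:
  name: web                 # assumed name
  labels:
    app: web
spec:
  volumes:
  - name: shared-data       # volume visible to both containers
    emptyDir: {}
  containers:
  - name: nginx
    image: nginx
    volumeMounts:
    - name: shared-data
      mountPath: /usr/share/nginx/html
  - name: content-writer
    image: busybox
    # Writes a file that the nginx container serves from the shared volume.
    command: ["sh", "-c", "while true; do date > /data/index.html; sleep 5; done"]
    volumeMounts:
    - name: shared-data
      mountPath: /data
```

Both containers share the Pod's network namespace and the declared volume, which is what makes the Pod the basic scheduling unit.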

Suppose there are two Pods:

Deploying kube-decon replicas (ReplicationController, RC):
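
A ReplicationController for such a deployment might look roughly like the sketch below. The replica count, label, image reference, and port are assumptions made for illustration.

```yaml
# Hypothetical RC that keeps three kube-decon Pods running.
apiVersion: v1
kind: ReplicationController
metadata:
  name: kube-decon
  namespace: kube-decon
spec:
  replicas: 3                      # assumed replica count
  selector:
    app: kube-decon                # must match the template labels below
  template:
    metadata:
      labels:
        app: kube-decon
    spec:
      containers:
      - name: kube-decon
        image: example.com/kube-decon:1.0   # placeholder image reference
        ports:
        - containerPort: 8080               # assumed application port
```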

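Upgrading by replacing the image (the Service's selector determines which Pods back the Service):

Architecture after the kube-decon service is deployed and upgraded:

Final access URL:

• Overall result

Final architecture of the K8s deployment:

Related deployment commands:

kubectl create namespace kube-decon

kubectl create -f resource.yaml:

Service configuration file (ClusterIP, NodePort, LoadBalancer, etc.):

Ingress configuration file: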

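2, Kubernetes cluster components

• kube-apiserver (Master node)

kube-apiserver stores the state of every resource in a key-value cluster such as etcd and exposes that state through a RESTful API.

• kube-scheduler (Master node)

It watches the relationship between Nodes and Pods and schedules each Pod onto a suitable Node (that is, it picks a Node according to the scheduling algorithm); the result is written back through the apiserver.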

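• kube-controller-manager (Master node)

If the API server does the "front office" work, the controller manager handles the "back office". Each resource type generally has a corresponding controller, and the controller manager is responsible for managing those controllers.

Namespace:

Objects inside the system (Pod/RC/Service) are "assigned" to different Namespaces, with "default" as the default. This groups users and enables "multi-tenant" management.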

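• kubelet (Node)

kubelet is the Master's agent on each Node and the most important module on the Node; it maintains and manages all containers on that Node.

• kube-proxy (Node)

This module implements service discovery and reverse proxying in Kubernetes. For reverse proxying, kube-proxy supports forwarding TCP and UDP connections; in its userspace mode it uses round robin by default to distribute client traffic across the backend Pods of a Service. For service discovery, kube-proxy uses the apiserver's watch mechanism (backed by etcd) to monitor changes to Service and Endpoints objects in the cluster, and maintains a mapping from each Service to its endpoints.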

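3, Kubernetes network management

• Cluster network

Basic principles within a single cluster:

1) Containers can communicate with each other without NAT;

2) Nodes and containers can communicate with each other without NAT;

3) The IP address a container sees for itself is the same address other containers use to reach it.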

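• Service

A Service is an abstraction over a group of Pods that provide the same function, and it gives them a single entry point. With Services, applications can easily implement service discovery and load balancing, and can be upgraded with zero downtime.

1) ClusterIP:

[Figure: service name]

Access from inside the cluster:

Access path (load balancing implemented purely with iptables, currently the usual default mode in Kubernetes):

ClusterIP is the default type; it automatically allocates a virtual IP that is reachable only from inside the cluster.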
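
As a minimal sketch, a ClusterIP Service could be declared as follows. The Service name, namespace, label, and port numbers are assumptions chosen to match the kube-decon example.

```yaml
# Hypothetical ClusterIP Service in front of Pods labeled app=kube-decon.
apiVersion: v1
kind: Service
metadata:
  name: kube-decon
  namespace: kube-decon
spec:
  type: ClusterIP            # the default; may be omitted
  selector:
    app: kube-decon          # Pods carrying this label become the backends
  ports:
  - protocol: TCP
    port: 80                 # port exposed on the virtual cluster IP
    targetPort: 8080         # assumed container port
```

With the default cluster domain, other Pods could then reach it by DNS name as kube-decon.kube-decon.svc.cluster.local.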

2) NodePort:

Access method:

NodePort builds on ClusterIP by binding a port for the Service on every node, so the Service can be reached at <NodeIP>:NodePort.
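
A NodePort variant might look like the sketch below. The chosen nodePort value is an assumption; it must fall inside the cluster's configured node port range, which is 30000-32767 by default.

```yaml
# Hypothetical NodePort Service; reachable at <NodeIP>:30080 on every node.
apiVersion: v1
kind: Service
metadata:
  name: kube-decon-nodeport  # assumed name
  namespace: kube-decon
spec:
  type: NodePort
  selector:
    app: kube-decon
  ports:
  - protocol: TCP
    port: 80                 # cluster-internal port
    targetPort: 8080         # assumed container port
    nodePort: 30080          # assumed value within the default range
```

If nodePort is left out, Kubernetes picks a free port from the range automatically.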

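3) LoadBalancer:

LoadBalancer builds on NodePort: with the help of a cloud provider (for example keepalived-cloud-provider or HAProxy, which expose layer-4 and layer-7 load balancing externally through container IPs), it creates an external load balancer and forwards requests to <NodeIP>:NodePort.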

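Other types: ExternalName forwards the Service to a specified domain name by means of a DNS CNAME record (set through spec.externalName); it is used to point a Service at a service outside the Kubernetes cluster.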
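
An ExternalName Service carries no selector, only the target domain. In this sketch the Service name and the external host name are placeholders.

```yaml
# Hypothetical ExternalName Service: DNS lookups for "external-db"
# inside the cluster return a CNAME to db.example.com.
apiVersion: v1
kind: Service
metadata:
  name: external-db          # assumed name
spec:
  type: ExternalName
  externalName: db.example.com   # placeholder external domain
```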

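• Cloud management (cloud)

1) kube-apiserver LB

Typically, a self-built load balancer based on HAProxy or Nginx serves as the entry point to the apiserver.

2) Ingress LB

Ingress is an HTTP-based routing mechanism made up of an Ingress Controller (running on a Node) and an HTTP proxy server. The Ingress Controller watches the Kubernetes API in real time and keeps the proxy's forwarding rules up to date, so that each request is mapped to the Pods that will actually answer it. Options for the HTTP proxy include GCE Load-Balancer, HAProxy, and Nginx.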
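
A routing rule of this kind could be sketched as the Ingress below. It assumes the current networking.k8s.io/v1 API (older clusters used extensions/v1beta1 with a different backend syntax), a placeholder host name, and a backend Service named kube-decon listening on port 80.

```yaml
# Hypothetical Ingress: HTTP requests for kube-decon.example.com
# are routed to the kube-decon Service.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: kube-decon
  namespace: kube-decon
spec:
  rules:
  - host: kube-decon.example.com   # placeholder host
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: kube-decon       # assumed backend Service
            port:
              number: 80
```

The object only takes effect when an Ingress Controller is running in the cluster to turn it into proxy configuration.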

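4, Linux foundations of Kubernetes

• namespace

• Cgroup

Linux Cgroups (Control Groups) provide the ability to limit, control, and account for the resources used by a group of processes and their future child processes; these resources include CPU, memory, storage, and network. With Cgroups you can easily limit a process's resource usage and monitor its usage statistics in real time.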
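
On Linux nodes, Kubernetes surfaces this mechanism through per-container resource requests and limits, which the container runtime translates into cgroup settings. A minimal sketch, with the Pod name, image, and values all assumed for illustration:

```yaml
# Hypothetical Pod whose container is capped via cgroups on the node.
apiVersion: v1
kind: Pod
metadata:
  name: limited              # assumed name
spec:
  containers:
  - name: app
    image: nginx
    resources:
      requests:              # used by the scheduler when placing the Pod
        cpu: 250m
        memory: 64Mi
      limits:                # enforced on the node through cgroups
        cpu: 500m
        memory: 128Mi
```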

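• Union File System

Union File System (UnionFS) is a file system service designed for Linux, FreeBSD, and NetBSD that combines other file systems under a single union mount point.

The top layer, b6a0b1349017, is only 14 B in size and was created by the following command:

Note: Ubuntu uses aufs and CentOS uses device mapper, so the actual on-disk directories differ between the two.

• Docker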

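Here, CNI is the network plugin interface for K8s. It focuses on connecting container networks and on releasing resources when containers are destroyed, and it provides a framework that supports many different network modes, including flannel and Calico.

After Kubernetes creates a Pod, CNI provides networking in three main steps:

▪ The kubelet runtime creates the network namespace

▪ The kubelet invokes the CNI plugin and specifies the network type (the network type determines which CNI plugin is used)

▪ The CNI plugin creates a veth pair, checks the IPAM type and data, invokes the IPAM plugin, obtains a free IP address, and assigns it to the container's network interface

• Logging

How the kubectl logs command is executed: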

5, Advanced Kubernetes features

• Security Context

The purpose of a Security Context is to restrict the behavior of untrusted containers and to protect the system and other containers from them.

Kubernetes provides three ways to configure a Security Context:

▪ Container-level Security Context: applies only to the specified container

▪ Pod-level Security Context: applies to all containers and Volumes in the Pod

▪ Pod Security Policies (PSP): apply to all Pods and Volumes in the cluster

For example, to restrict the host port range of containers to 8000-8080:
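
A minimal sketch of a PodSecurityPolicy expressing that restriction. The policy name and the permissive RunAsAny rules are assumptions; note also that PodSecurityPolicy was deprecated in Kubernetes 1.21 and removed in 1.25, so this applies only to older clusters.

```yaml
# Hypothetical PSP: Pods may only use host ports between 8000 and 8080.
apiVersion: policy/v1beta1
kind: PodSecurityPolicy
metadata:
  name: restrict-host-ports  # assumed name
spec:
  privileged: false
  hostPorts:
  - min: 8000
    max: 8080
  # The four rules below are required fields; RunAsAny leaves them unrestricted.
  seLinux:
    rule: RunAsAny
  runAsUser:
    rule: RunAsAny
  supplementalGroups:
    rule: RunAsAny
  fsGroup:
    rule: RunAsAny
  volumes:
  - '*'
```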

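• Network Policy

Network Policy provides policy-based network control for isolating applications and reducing the attack surface. It uses label selectors to emulate traditional segmented networks, and uses policies to control the traffic between those segments as well as traffic from outside.

Before using Network Policy, note that:

▪ On older clusters the apiserver must enable extensions/v1beta1/networkpolicies (from Kubernetes 1.7 onward, NetworkPolicy is served from networking.k8s.io/v1 and needs no extra flag)

▪ The network plugin must support Network Policy, for example Calico, Romana, Weave Net, or trireme

Policies fall into two kinds: Namespace isolation and Pod isolation.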
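
The two kinds of isolation can be sketched as a pair of policies: one denies all inbound traffic in a namespace, the other re-opens a single path between labeled Pods. The namespace, labels, and port are assumptions, and the manifests use the networking.k8s.io/v1 API.

```yaml
# Namespace isolation (hypothetical): deny all ingress to every Pod in kube-decon.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny
  namespace: kube-decon
spec:
  podSelector: {}            # empty selector = all Pods in the namespace
  policyTypes:
  - Ingress
---
# Pod isolation (hypothetical): only Pods labeled role=frontend
# may reach app=kube-decon Pods on TCP 8080.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-frontend
  namespace: kube-decon
spec:
  podSelector:
    matchLabels:
      app: kube-decon
  policyTypes:
  - Ingress
  ingress:
  - from:
    - podSelector:
        matchLabels:
          role: frontend     # assumed client label
    ports:
    - protocol: TCP
      port: 8080             # assumed application port
```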

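• Downward API

The Downward API (which lets a container obtain basic information about its Pod) offers two ways to inject Pod information into a container:

▪ Environment variables: for single values; Pod and container information can be injected directly into the container.

▪ Volume mounts: Pod information is written to files that are mounted directly into the container.

Obtaining values through environment variables:

The printed result can be seen in the Pod's logs.

kubectl create -f dapi-test-pod.yaml

kubectl logs dapi-test-pod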
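
A manifest such as dapi-test-pod.yaml might look like the sketch below. The container name, image, and environment variable names are assumptions; the fieldPath values are standard Downward API fields.

```yaml
# Hypothetical dapi-test-pod.yaml: prints its environment, which
# includes Pod metadata injected through the Downward API.
apiVersion: v1
kind: Pod
metadata:
  name: dapi-test-pod
spec:
  restartPolicy: Never
  containers:
  - name: test-container
    image: busybox
    command: ["sh", "-c", "env"]
    env:
    - name: MY_POD_NAME
      valueFrom:
        fieldRef:
          fieldPath: metadata.name
    - name: MY_POD_NAMESPACE
      valueFrom:
        fieldRef:
          fieldPath: metadata.namespace
    - name: MY_POD_IP
      valueFrom:
        fieldRef:
          fieldPath: status.podIP
```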

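• Storing configuration files with ConfigMaps/Secrets

You can use kubectl create configmap to create a ConfigMap from key-value literals, from an env file, or from a file or directory (use one of the following forms):

kubectl create configmap special-config --from-literal=special.how=very

kubectl create configmap special-config --from-env-file=config.env

kubectl create configmap special-config --from-file=config/

Mount the created ConfigMap directly at /etc/config in the Pod; each key-value pair produces one file, with the key as the file name and the value as the content: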
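
A minimal sketch of such a mount, assuming the special-config ConfigMap already exists. The Pod name, container name, and image are illustrative.

```yaml
# Hypothetical Pod that mounts the special-config ConfigMap at /etc/config.
apiVersion: v1
kind: Pod
metadata:
  name: configmap-volume-pod   # assumed name
spec:
  restartPolicy: Never
  containers:
  - name: test-container
    image: busybox
    # Lists the files generated from the ConfigMap keys.
    command: ["sh", "-c", "ls /etc/config/"]
    volumeMounts:
    - name: config-volume
      mountPath: /etc/config
  volumes:
  - name: config-volume
    configMap:
      name: special-config
```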

• Using ConfigMaps/Secrets as environment variables or command arguments

First create them: kubectl create configmap special-config --from-literal=special.how=very --from-literal=special.type=charm

kubectl create configmap env-config --from-literal=log_level=INFO

Then reference them as environment variables:
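
One way to wire those two ConfigMaps into a container is sketched below. The Pod name, image, and environment variable names are assumptions; the ConfigMap names and keys come from the commands above.

```yaml
# Hypothetical Pod reading special-config and env-config as environment
# variables and passing them on the command line.
apiVersion: v1
kind: Pod
metadata:
  name: configmap-env-pod      # assumed name
spec:
  restartPolicy: Never
  containers:
  - name: test-container
    image: busybox
    # $(VAR) references are expanded by Kubernetes from the env entries below.
    command: ["/bin/sh", "-c", "echo $(SPECIAL_LEVEL_KEY) $(SPECIAL_TYPE_KEY) $(LOG_LEVEL)"]
    env:
    - name: SPECIAL_LEVEL_KEY
      valueFrom:
        configMapKeyRef:
          name: special-config
          key: special.how
    - name: SPECIAL_TYPE_KEY
      valueFrom:
        configMapKeyRef:
          name: special-config
          key: special.type
    - name: LOG_LEVEL
      valueFrom:
        configMapKeyRef:
          name: env-config
          key: log_level
```

The same valueFrom pattern applies to Secrets, with secretKeyRef in place of configMapKeyRef.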

• Affinity/Anti-affinity

Affinity policies offer more flexible and richer scheduling than NodeSelector or Taints, for example:

▪ Rich match expressions (In, NotIn, Exists, DoesNotExist, Gt, and Lt)

▪ Soft and hard constraints (Preferred/Required)

▪ Scheduling decisions that take other Pods on the node as a reference

Affinity policies are divided into the NodeAffinityPriority policy (based on Node labels) and the InterPodAffinityPriority policy (based on Pod labels).

The available match operators are In, NotIn, Exists, DoesNotExist, Gt, and Lt.
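
The combination of hard and soft constraints with match expressions can be sketched for node affinity as follows. The Pod name, image, and the disktype label are assumptions; kubernetes.io/os is a standard node label.

```yaml
# Hypothetical Pod: must run on a Linux node (hard constraint) and
# prefers nodes labeled disktype=ssd (soft constraint).
apiVersion: v1
kind: Pod
metadata:
  name: with-node-affinity     # assumed name
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: kubernetes.io/os
            operator: In
            values:
            - linux
      preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 1
        preference:
          matchExpressions:
          - key: disktype      # assumed custom node label
            operator: In
            values:
            - ssd
  containers:
  - name: app
    image: nginx
```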

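PodAntiAffinity (Pod anti-affinity scheduling): specifies which Pods a Pod must not share a topology domain with; together with podAffinity it governs the placement relationships between Pods.

Result:

As the figure above shows, the Deployment has 3 replicas and specifies the anti-affinity rule shown. The Pods carry the label app:store, so a Pod will not be scheduled onto a node that already runs a Pod labeled app:store; as a result, the Deployment's three replicas land on nodes with different hostnames. Pod affinity, in turn, is satisfied under requiredDuringSchedulingIgnoredDuringExecution with topologyKey="kubernetes.io/hostname" when the node already runs a Pod labeled app=store.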
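
A Deployment matching that description could be sketched as follows. The Deployment name, container name, and image are assumptions; the label, replica count, and topologyKey come from the text above.

```yaml
# Hypothetical Deployment: three replicas labeled app=store, each forced
# onto a different node by a required Pod anti-affinity rule.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: redis-cache            # assumed name
spec:
  replicas: 3
  selector:
    matchLabels:
      app: store
  template:
    metadata:
      labels:
        app: store
    spec:
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
              - key: app
                operator: In
                values:
                - store
            topologyKey: "kubernetes.io/hostname"
      containers:
      - name: redis-server     # assumed container
        image: redis:3.2-alpine
```

Because the rule is a hard constraint, a cluster with fewer than three schedulable nodes would leave the extra replicas Pending.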

Recommended:

Reference books:

《Kubernetes权威指南第2版.pdf》

《自己动手写Docker-2017.7-有书签.pdf》

Online learning:

https://kubernetes.io/docs/tutorials/online-training/overview/

Resources: