pilot-agent & Envoy Startup Flow

istio-init

The istio-init init container sets up iptables rules so that inbound/outbound traffic is routed through the sidecar proxy. Init containers differ from application containers in the following ways:

  • They run before the application containers start, and run until completion.
  • If there are multiple init containers, each must complete successfully before the next one starts.

Let's look at the pod for the sleep sample:

kubectl describe pod sleep-54f94cbff5-jmwtf
Name:         sleep-54f94cbff5-jmwtf
Namespace:    default
Priority:     0
Node:         minikube/172.17.0.3
Start Time:   Wed, 27 May 2020 12:14:08 +0800
Labels:       app=sleep
              istio.io/rev=
              pod-template-hash=54f94cbff5
              security.istio.io/tlsMode=istio
Annotations:  sidecar.istio.io/interceptionMode: REDIRECT
              sidecar.istio.io/status:
                {"version":"d36ff46d2def0caba37f639f09514b17c4e80078f749a46aae84439790d2b560","initContainers":["istio-init"],"containers":["istio-proxy"]...
              traffic.sidecar.istio.io/excludeInboundPorts: 15020
              traffic.sidecar.istio.io/includeOutboundIPRanges: *
Status:       Running
IP:           172.18.0.11
IPs:
  IP:           172.18.0.11
Controlled By:  ReplicaSet/sleep-54f94cbff5
Init Containers:
  istio-init:
    Container ID:  docker://f5c88555b666c18e5aa343b3f452355f96d66dc4268fa306f93432e0f98c3950
    Image:         docker.io/istio/proxyv2:1.6.0
    Image ID:      docker-pullable://istio/proxyv2@sha256:821cc14ad9a29a2cafb9e351d42096455c868f3e628376f1d0e1763c3ce72ca6
    Port:          <none>
    Host Port:     <none>
    Args:
      istio-iptables
      -p
      15001
      -z
      15006
      -u
      1337
      -m
      REDIRECT
      -i
      *
      -x

      -b
      *
      -d
      15090,15021,15020
    State:          Terminated
      Reason:       Completed
      Exit Code:    0
      Started:      Wed, 27 May 2020 12:14:12 +0800
      Finished:     Wed, 27 May 2020 12:14:13 +0800
    Ready:          True
    Restart Count:  0
    Limits:
      cpu:     100m
      memory:  50Mi
    Requests:
      cpu:     10m
      memory:  10Mi
    Environment:
      DNS_AGENT:  
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from sleep-token-zq2wv (ro)
Containers:
  sleep:
    Container ID:  docker://a5437e12f6ea25d828531ba0dc4fab78374d5e9f746b6a199c4ed03b5d53c8f7
    Image:         governmentpaas/curl-ssl
    Image ID:      docker-pullable://governmentpaas/curl-ssl@sha256:b8d0e024380e2a02b557601e370be6ceb8b56b64e80c3ce1c2bcbd24f5469a23
    Port:          <none>
    Host Port:     <none>
    Command:
      /bin/sleep
      3650d
    State:          Running
      Started:      Wed, 27 May 2020 12:14:14 +0800
    Ready:          True
    Restart Count:  0
    Environment:    <none>
    Mounts:
      /etc/sleep/tls from secret-volume (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from sleep-token-zq2wv (ro)
  istio-proxy:
    Container ID:  docker://d03a43d3f257c057b664cf7ab03bcd301799a9e849da35fe54fdb0c9ea5516a4
    Image:         docker.io/istio/proxyv2:1.6.0
    Image ID:      docker-pullable://istio/proxyv2@sha256:821cc14ad9a29a2cafb9e351d42096455c868f3e628376f1d0e1763c3ce72ca6
    Port:          15090/TCP
    Host Port:     0/TCP
    Args:
      proxy
      sidecar
      --domain
      $(POD_NAMESPACE).svc.cluster.local
      --serviceCluster
      sleep.$(POD_NAMESPACE)
      --proxyLogLevel=warning
      --proxyComponentLogLevel=misc:error
      --trust-domain=cluster.local
      --concurrency
      2
    State:          Running
      Started:      Wed, 27 May 2020 12:14:17 +0800
    Ready:          True
    Restart Count:  0

From the output we can see that the istio-init container's State is Terminated and its Reason is Completed. Only two containers are still running: the main application container sleep (based on the curl-ssl image) and the istio-proxy container.

Formatting the Args of istio-init, we find that it ran the following command:

istio-iptables -p 15001 -z 15006 -u 1337 -m REDIRECT -i * -x  -b * -d 15090,15021,15020

So the entry point of the istio-init container is the istio-iptables command-line tool, a Go binary that calls the iptables command to create a series of iptables rules that hijack the traffic in the Pod. The tool's source entry point is tools/istio-iptables/main.go. Next, let's look at the concrete iptables rules it applies.

This article runs on minikube. Since the istio-init container exits once initialization completes, we cannot exec into it directly. However, the iptables rules it applied are visible from the other containers in the same Pod, because they share the Pod's network namespace, so we can inspect the rules from one of them. The steps are as follows:

Enter minikube and switch to the root user:

minikube ssh
sudo -i

List the containers related to the sleep application:

docker ps | grep sleep

d03a43d3f257        istio/proxyv2              "/usr/local/bin/pilo…"   2 hours ago         Up 2 hours                              k8s_istio-proxy_slee-54f94cbff5-jmwtf_default_70c72535-cbfb-4201-af07-feb0948cc0c6_0
a5437e12f6ea        8c797666f87b               "/bin/sleep 3650d"       2 hours ago         Up 2 hours                              k8s_sleep_sleep-54f94cbff5-jmwtf_default_70c72535-cbfb-4201-af07-feb0948cc0c6_0
efdbb69b77c0        k8s.gcr.io/pause:3.2       "/pause"                 2 hours ago         Up 2 hours                              k8s_POD_sleep-54f94cbff5-jmwtf_default_70c72535-cbfb-4201-af07-feb0948cc0c6_0

Pick one of the containers above and look up its process ID (found to be 8533 below). Note that simply exec-ing into the container and running iptables does not work, because the container lacks the required privileges:

iptables -t nat -L -v

iptables v1.6.1: can't initialize iptables table `nat': Permission denied (you must be root)
Perhaps iptables or your kernel needs to be upgraded.

We need nsenter to enter the namespace with root privileges and inspect the rules (see the nsenter command reference for details):

docker inspect efdbb69b77c0 --format '{{ .State.Pid }}'
8533

nsenter -t 8533 -n iptables -t nat -S

-P PREROUTING ACCEPT
-P INPUT ACCEPT
-P OUTPUT ACCEPT
-P POSTROUTING ACCEPT
-N ISTIO_INBOUND
-N ISTIO_IN_REDIRECT
-N ISTIO_OUTPUT
-N ISTIO_REDIRECT
-A PREROUTING -p tcp -j ISTIO_INBOUND
-A OUTPUT -p tcp -j ISTIO_OUTPUT
-A ISTIO_INBOUND -p tcp -m tcp --dport 22 -j RETURN
-A ISTIO_INBOUND -p tcp -m tcp --dport 15090 -j RETURN
-A ISTIO_INBOUND -p tcp -m tcp --dport 15021 -j RETURN
-A ISTIO_INBOUND -p tcp -m tcp --dport 15020 -j RETURN
-A ISTIO_INBOUND -p tcp -j ISTIO_IN_REDIRECT
-A ISTIO_IN_REDIRECT -p tcp -j REDIRECT --to-ports 15006
-A ISTIO_OUTPUT -s 127.0.0.6/32 -o lo -j RETURN
-A ISTIO_OUTPUT ! -d 127.0.0.1/32 -o lo -m owner --uid-owner 1337 -j ISTIO_IN_REDIRECT
-A ISTIO_OUTPUT -o lo -m owner ! --uid-owner 1337 -j RETURN
-A ISTIO_OUTPUT -m owner --uid-owner 1337 -j RETURN
-A ISTIO_OUTPUT ! -d 127.0.0.1/32 -o lo -m owner --gid-owner 1337 -j ISTIO_IN_REDIRECT
-A ISTIO_OUTPUT -o lo -m owner ! --gid-owner 1337 -j RETURN
-A ISTIO_OUTPUT -m owner --gid-owner 1337 -j RETURN
-A ISTIO_OUTPUT -d 127.0.0.1/32 -j RETURN
-A ISTIO_OUTPUT -j ISTIO_REDIRECT
-A ISTIO_REDIRECT -p tcp -j REDIRECT --to-ports 15001

View the detailed rule configuration in the NAT table:

nsenter -t 8533 -n iptables -t nat -L -v
Chain PREROUTING (policy ACCEPT 3435 packets, 206K bytes)
 pkts bytes target     prot opt in     out     source               destination         
 3435  206K ISTIO_INBOUND  tcp  --  any    any     anywhere             anywhere            

Chain INPUT (policy ACCEPT 3435 packets, 206K bytes)
 pkts bytes target     prot opt in     out     source               destination         

Chain OUTPUT (policy ACCEPT 599 packets, 54757 bytes)
 pkts bytes target     prot opt in     out     source               destination         
   22  1320 ISTIO_OUTPUT  tcp  --  any    any     anywhere             anywhere            

Chain POSTROUTING (policy ACCEPT 599 packets, 54757 bytes)
 pkts bytes target     prot opt in     out     source               destination         

Chain ISTIO_INBOUND (1 references)
 pkts bytes target     prot opt in     out     source               destination         
    0     0 RETURN     tcp  --  any    any     anywhere             anywhere             tcp dpt:22
    1    60 RETURN     tcp  --  any    any     anywhere             anywhere             tcp dpt:15090
 3434  206K RETURN     tcp  --  any    any     anywhere             anywhere             tcp dpt:15021
    0     0 RETURN     tcp  --  any    any     anywhere             anywhere             tcp dpt:15020
    0     0 ISTIO_IN_REDIRECT  tcp  --  any    any     anywhere             anywhere            

Chain ISTIO_IN_REDIRECT (3 references)
 pkts bytes target     prot opt in     out     source               destination         
    0     0 REDIRECT   tcp  --  any    any     anywhere             anywhere             redir ports 15006

Chain ISTIO_OUTPUT (1 references)
 pkts bytes target     prot opt in     out     source               destination         
    0     0 RETURN     all  --  any    lo      127.0.0.6            anywhere            
    0     0 ISTIO_IN_REDIRECT  all  --  any    lo      anywhere            !localhost            owner UID match 1337
    0     0 RETURN     all  --  any    lo      anywhere             anywhere             ! owner UID match 1337
   22  1320 RETURN     all  --  any    any     anywhere             anywhere             owner UID match 1337
    0     0 ISTIO_IN_REDIRECT  all  --  any    lo      anywhere            !localhost            owner GID match 1337
    0     0 RETURN     all  --  any    lo      anywhere             anywhere             ! owner GID match 1337
    0     0 RETURN     all  --  any    any     anywhere             anywhere             owner GID match 1337
    0     0 RETURN     all  --  any    any     anywhere             localhost           
    0     0 ISTIO_REDIRECT  all  --  any    any     anywhere             anywhere            

Chain ISTIO_REDIRECT (1 references)
 pkts bytes target     prot opt in     out     source               destination         
    0     0 REDIRECT   tcp  --  any    any     anywhere             anywhere             redir ports 15001

For more on these rules, refer to the iptables command documentation.
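To make the rule ordering concrete, here is a minimal Go sketch (our own illustration, not Istio code; the function and constant names are made up) that replays the ISTIO_OUTPUT chain decision for an outbound packet:

```go
package main

// Verdicts of the simulated ISTIO_OUTPUT chain.
const (
	Return      = "RETURN"         // bypass Envoy
	RedirectIn  = "REDIRECT:15006" // jump to ISTIO_IN_REDIRECT
	RedirectOut = "REDIRECT:15001" // jump to ISTIO_REDIRECT
)

// istioOutput replays the ISTIO_OUTPUT rules from `iptables -t nat -S`
// in order and returns the first matching verdict. uid/gid identify the
// socket owner, src/dst are the packet addresses, and loopback reports
// whether the packet leaves through the `lo` interface.
func istioOutput(uid, gid int, src, dst string, loopback bool) string {
	const proxyID = 1337 // the istio-proxy user set up by `-u 1337`
	switch {
	case loopback && src == "127.0.0.6": // inbound passthrough return traffic
		return Return
	case loopback && dst != "127.0.0.1" && uid == proxyID: // Envoy calling the app over lo
		return RedirectIn
	case loopback && uid != proxyID: // app-to-app traffic over lo
		return Return
	case uid == proxyID: // Envoy's own outbound traffic
		return Return
	case loopback && dst != "127.0.0.1" && gid == proxyID:
		return RedirectIn
	case loopback && gid != proxyID:
		return Return
	case gid == proxyID:
		return Return
	case dst == "127.0.0.1": // explicit localhost traffic
		return Return
	default: // everything else is redirected to Envoy's outbound port
		return RedirectOut
	}
}
```

Application traffic (any non-1337 uid) to an external address falls through every RETURN rule and ends up redirected to port 15001, while Envoy's own traffic (uid 1337) is left alone, which is exactly what prevents redirect loops.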

Now let's go back to the corresponding Go source code.

tools/istio-iptables/pkg/constants/constants.go

// Constants for iptables commands
const (
    IPTABLES         = "iptables"
    IPTABLESRESTORE  = "iptables-restore"
    IPTABLESSAVE     = "iptables-save"
    IP6TABLES        = "ip6tables"
    IP6TABLESRESTORE = "ip6tables-restore"
    IP6TABLESSAVE    = "ip6tables-save"
    IP               = "ip"
)

// iptables tables
const (
    MANGLE = "mangle"
    NAT    = "nat"
    FILTER = "filter"
)

// Built-in iptables chains
const (
    INPUT       = "INPUT"
    OUTPUT      = "OUTPUT"
    FORWARD     = "FORWARD"
    PREROUTING  = "PREROUTING"
    POSTROUTING = "POSTROUTING"
)

......

tools/istio-iptables/pkg/cmd/root.go

var rootCmd = &cobra.Command{
    Use:   "istio-iptables",
    Short: "Set up iptables rules for Istio Sidecar",
    Long:  "Script responsible for setting up port forwarding for Istio sidecar.",
    Run: func(cmd *cobra.Command, args []string) {
        cfg := constructConfig()
        var ext dep.Dependencies
        if cfg.DryRun {
            ext = &dep.StdoutStubDependencies{}
        } else {
            ext = &dep.RealDependencies{}
        }

        iptConfigurator := NewIptablesConfigurator(cfg, ext)
        if !cfg.SkipRuleApply {
            // Entry point for applying the rules
            iptConfigurator.run()
        }
    },
}

func (iptConfigurator *IptablesConfigurator) run() {

    iptConfigurator.logConfig()

    // ... omitted ...

    // Create a new chain for redirecting outbound traffic to the common Envoy port.
    // In both chains, '-j RETURN' bypasses Envoy and '-j ISTIOREDIRECT'
    // redirects to Envoy.
    iptConfigurator.iptables.AppendRuleV4(
        constants.ISTIOREDIRECT, constants.NAT, "-p", constants.TCP, "-j", constants.REDIRECT, "--to-ports", iptConfigurator.cfg.ProxyPort)
    // Use this chain also for redirecting inbound traffic to the common Envoy port
    // when not using TPROXY.

    iptConfigurator.iptables.AppendRuleV4(constants.ISTIOINREDIRECT, constants.NAT, "-p", constants.TCP, "-j", constants.REDIRECT,
        "--to-ports", iptConfigurator.cfg.InboundCapturePort)

    iptConfigurator.handleInboundPortsInclude()

    // TODO: change the default behavior to not intercept any output - user may use http_proxy or another
    // iptablesOrFail wrapper (like ufw). Current default is similar with 0.1
    // Jump to the ISTIOOUTPUT chain from OUTPUT chain for all tcp traffic.
    iptConfigurator.iptables.AppendRuleV4(constants.OUTPUT, constants.NAT, "-p", constants.TCP, "-j", constants.ISTIOOUTPUT)
    // Apply port based exclusions. Must be applied before connections back to self are redirected.
    if iptConfigurator.cfg.OutboundPortsExclude != "" {
        for _, port := range split(iptConfigurator.cfg.OutboundPortsExclude) {
            iptConfigurator.iptables.AppendRuleV4(constants.ISTIOOUTPUT, constants.NAT, "-p", constants.TCP, "--dport", port, "-j", constants.RETURN)
        }
    }

    // 127.0.0.6 is bind connect from inbound passthrough cluster
    iptConfigurator.iptables.AppendRuleV4(constants.ISTIOOUTPUT, constants.NAT, "-o", "lo", "-s", "127.0.0.6/32", "-j", constants.RETURN)

    // Skip redirection for Envoy-aware applications and
    // container-to-container traffic both of which explicitly use
    // localhost.
    iptConfigurator.iptables.AppendRuleV4(constants.ISTIOOUTPUT, constants.NAT, "-d", "127.0.0.1/32", "-j", constants.RETURN)
    // Apply outbound IPv4 exclusions. Must be applied before inclusions.
    for _, cidr := range ipv4RangesExclude.IPNets {
        iptConfigurator.iptables.AppendRuleV4(constants.ISTIOOUTPUT, constants.NAT, "-d", cidr.String(), "-j", constants.RETURN)
    }

    // ... omitted ...

    // The method that actually executes the iptables commands
    iptConfigurator.executeCommands()
}

Following the iptConfigurator.executeCommands() call eventually leads to tools/istio-iptables/pkg/dependencies/implementation.go, where we can see that it simply uses Go's exec.Command to run the OS-level commands.

func (r *RealDependencies) execute(cmd string, redirectStdout bool, args ...string) error {
    // Execute the actual iptables command
    externalCommand := exec.Command(cmd, args...)
    externalCommand.Stdout = os.Stdout
    //TODO Check naming and redirection logic
    if !redirectStdout {
        externalCommand.Stderr = os.Stderr
    }
    return externalCommand.Run()
}
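The same pattern is easy to try standalone. A minimal, self-contained sketch of running an external command with exec.Command and surfacing its exit status (runCommand is our name, not Istio's):

```go
package main

import (
	"os"
	"os/exec"
)

// runCommand mirrors the shape of RealDependencies.execute: build the
// command, wire up stdout/stderr, and run it synchronously.
// A non-zero exit code surfaces as a non-nil *exec.ExitError.
func runCommand(cmd string, args ...string) error {
	c := exec.Command(cmd, args...)
	c.Stdout = os.Stdout
	c.Stderr = os.Stderr
	return c.Run()
}
```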

Once this command has run, istio-init has fulfilled its mission.

The details of how iptables intercepts traffic deserve a separate article.

istio-proxy

As we saw at the beginning, there is also the istio-proxy container:

 Image:         docker.io/istio/proxyv2:1.6.0
    Image ID:      docker-pullable://istio/proxyv2@sha256:821cc14ad9a29a2cafb9e351d42096455c868f3e628376f1d0e1763c3ce72ca6
    Port:          15090/TCP
    Host Port:     0/TCP
    Args:
      proxy
      sidecar
      --domain
      $(POD_NAMESPACE).svc.cluster.local
      --serviceCluster
      sleep.$(POD_NAMESPACE)
      --proxyLogLevel=warning
      --proxyComponentLogLevel=misc:error
      --trust-domain=cluster.local
      --concurrency
      2
    State:          Running

We can inspect this image on Docker Hub: https://hub.docker.com/r/istio/proxyv2/tags

Let's look at the Dockerfile behind version 1.6.0 of the image. In the Istio source tree it lives at pilot/docker/Dockerfile.proxyv2:

ADD file:c3e6bb316dfa6b81dd4478aaa310df532883b1c0a14edeec3f63d641980c1789 in /

/bin/sh -c [ -z "$(apt-get indextargets)" ]
/bin/sh -c mkdir -p /run/systemd && echo 'docker' > /run/systemd/container
CMD ["/bin/bash"]
ENV DEBIAN_FRONTEND=noninteractive

// ... omitted ...
COPY envoy /usr/local/bin/envoy
COPY pilot-agent /usr/local/bin/pilot-agent

ENTRYPOINT ["/usr/local/bin/pilot-agent"]

We can see that the envoy and pilot-agent binaries are added to the proxyv2 image, and that pilot-agent is the entrypoint. Merging in the container arguments yields the following command:

pilot-agent proxy sidecar --domain default.svc.cluster.local --serviceCluster sleep.default --proxyLogLevel=warning --proxyComponentLogLevel=misc:error --trust-domain=cluster.local --concurrency 2

So what happens after this command runs? Repeat the steps from above:

minikube ssh
sudo -i
docker ps |grep sleep

d03a43d3f257        istio/proxyv2              "/usr/local/bin/pilo…"   3 hours ago         Up 3 hours                              k8s_istio-proxy_slee-54f94cbff5-jmwtf_default_70c72535-cbfb-4201-af07-feb0948cc0c6_0
a5437e12f6ea        8c797666f87b               "/bin/sleep 3650d"       3 hours ago         Up 3 hours                              k8s_sleep_sleep-54f94cbff5-jmwtf_default_70c72535-cbfb-4201-af07-feb0948cc0c6_0
efdbb69b77c0        k8s.gcr.io/pause:3.2       "/pause"                 3 hours ago         Up 3 hours                              k8s_POD_sleep-54f94cbff5-jmwtf_default_70c72535-cbfb-4201-af07-feb0948cc0c6_0

This time we need to exec into the proxyv2 container d03a43d3f257 and look at the processes running inside it:

docker exec -it d03a43d3f257 /bin/bash
ps -ef | grep sleep

UID        PID  PPID  C STIME TTY          TIME CMD
istio-p+     1     0  0 04:14 ?        00:00:06 /usr/local/bin/pilot-agent proxy sidecar --domain default.svc.cluster.local --serviceCluster sleep.default --proxyLogLevel=warning --proxyComponentLogLevel=misc:error --trust-domain=cluster.local --concurrency 2

istio-p+    17     1  0 04:14 ?        00:00:26 /usr/local/bin/envoy -c etc/istio/proxy/envoy-rev0.json --restart-epoch 0 --drain-time-s 45 --parent-shutdown-time-s 60 --service-cluster sleep.default --service-node sidecar~172.18.0.11~sleep-54f94cbff5-jmwtf.default~default.svc.cluster.local --max-obj-name-len 189 --local-address-ip-version v4 --log-format %Y-%m-%dT%T.%fZ.%l.envoy %n.%v -l warning --component-log-level misc:error --concurrency 2

Looking at the PID and PPID columns, we can see that pilot-agent started the envoy process after it came up.

The pilot-agent command's source entry point is pilot/cmd/pilot-agent/main.go; see the pilot-agent command reference for its usage.

proxyCmd = &cobra.Command{
        Use:   "proxy",
        Short: "Envoy proxy agent",
        RunE: func(c *cobra.Command, args []string) error {
            // ... omitted ...

            proxyConfig, err := constructProxyConfig()
            if out, err := gogoprotomarshal.ToYAML(&proxyConfig); err != nil {
                log.Infof("Failed to serialize to YAML: %v", err)

            // ... omitted ...

            envoyProxy := envoy.NewProxy(envoy.ProxyConfig{
                Config:              proxyConfig,
                Node:                role.ServiceNode(),
                LogLevel:            proxyLogLevel,
                ComponentLogLevel:   proxyComponentLogLevel,
                PilotSubjectAltName: pilotSAN,
                MixerSubjectAltName: mixerSAN,
                NodeIPs:             role.IPAddresses,
                PodName:             podName,
                PodNamespace:        podNamespace,
                PodIP:               podIP,
                STSPort:             stsPort,
                ControlPlaneAuth:    proxyConfig.ControlPlaneAuthPolicy == meshconfig.AuthenticationPolicy_MUTUAL_TLS,
                DisableReportCalls:  disableInternalTelemetry,
                OutlierLogPath:      outlierLogPath,
                PilotCertProvider:   pilotCertProvider,
                ProvCert:            citadel.ProvCert,
            })

            agent := envoy.NewAgent(envoyProxy, features.TerminationDrainDuration())

            // Watch envoy until it starts successfully; the startup logic lives in agent.Restart
            watcher := envoy.NewWatcher(tlsCerts, agent.Restart)
            go watcher.Run(ctx)

            return agent.Run(ctx)
        },
    }
)

The agent.Restart method:

func (a *agent) Restart(config interface{}) {
    // Only one envoy agent restart may run at a time
    a.restartMutex.Lock()
    defer a.restartMutex.Unlock()

    a.mutex.Lock()
    if reflect.DeepEqual(a.currentConfig, config) {
        // If the config has not changed there is nothing to do; return immediately
        a.mutex.Unlock()
        return
    }

    // The config changed: bump the epoch and create a new envoy instance
    epoch := a.currentEpoch + 1
    log.Infof("Received new config, creating new Envoy epoch %d", epoch)

    // Start envoy in a new goroutine
    go a.runWait(config, epoch, abortCh)
}
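The heart of Restart is the reflect.DeepEqual guard plus the epoch bump. A toy reproduction of just that logic (toyAgent is ours, not the Istio type):

```go
package main

import (
	"reflect"
	"sync"
)

// toyAgent keeps just enough state to show the Restart logic: a mutex
// so only one restart runs at a time, the last applied config, and a
// monotonically increasing epoch.
type toyAgent struct {
	mu            sync.Mutex
	currentConfig interface{}
	currentEpoch  int
	started       int // how many envoy instances we (pretend to) launch
}

func (a *toyAgent) Restart(config interface{}) {
	a.mu.Lock()
	defer a.mu.Unlock()

	// Unchanged config: nothing to do.
	if reflect.DeepEqual(a.currentConfig, config) {
		return
	}

	// Changed config: new epoch, (pretend to) launch a new envoy.
	a.currentEpoch++
	a.currentConfig = config
	a.started++
}
```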

The a.runWait(config, epoch, abortCh) method:

func (a *agent) runWait(config interface{}, epoch int, abortCh <-chan error) {
    // Run the proxy instance directly and wait for it to exit
    err := a.proxy.Run(config, epoch, abortCh)
    a.proxy.Cleanup(epoch)
    a.statusCh <- exitStatus{epoch: epoch, err: err}
}

The proxy.Run method:

func (e *envoy) Run(config interface{}, epoch int, abort <-chan error) error {
    var fname string
    // Use a custom config file if one was given on the command line; otherwise use the default config
    if len(e.Config.CustomConfigFile) > 0 {
        fname = e.Config.CustomConfigFile
    } else {
        // Create the /etc/istio/proxy/envoy-rev0.json config file that envoy needs at startup.
        // The 0 suffix increments with each restart, but only the file name changes; the content stays the same.
        out, err := bootstrap.New(bootstrap.Config{
            Node:                e.Node,
            Proxy:               &e.Config,
            PilotSubjectAltName: e.PilotSubjectAltName,
            MixerSubjectAltName: e.MixerSubjectAltName,
            LocalEnv:            os.Environ(),
            NodeIPs:             e.NodeIPs,
            PodName:             e.PodName,
            PodNamespace:        e.PodNamespace,
            PodIP:               e.PodIP,
            STSPort:             e.STSPort,
            ControlPlaneAuth:    e.ControlPlaneAuth,
            DisableReportCalls:  e.DisableReportCalls,
            OutlierLogPath:      e.OutlierLogPath,
            PilotCertProvider:   e.PilotCertProvider,
            ProvCert:            e.ProvCert,
        }).CreateFileForEpoch(epoch)
        fname = out
    }

    // ... omitted ...

    // The arguments envoy is started with,
    // i.e. the --restart-epoch 0 --drain-time-s 45 --parent-shutdown-time-s 60 ... part
    args := e.args(fname, epoch, istioBootstrapOverrideVar.Get())

    // A familiar pattern: start envoy via a system command.
    // e.Config.BinaryPath is /usr/local/bin/envoy;
    // the related default constants live in pkg/config/constants/constants.go
    cmd := exec.Command(e.Config.BinaryPath, args...)

    // ... omitted ...
}

The whole startup process is actually quite complex; here we only analyzed the most basic flow of bringing up envoy. Looking closer, it also includes:

  1. Starting SDS
  2. Starting the pilot metrics service
  3. Hot-restarting envoy after a config update is detected
  4. Gracefully shutting down envoy on receiving a kill signal

As shown above, the istio-proxy container runs two processes, pilot-agent and envoy. envoy is the process that actually implements the sidecar mechanism: traffic-management policies, routing and forwarding, and so on. pilot-agent is mainly responsible for starting envoy, but it also generates envoy's bootstrap config file, health-checks envoy, watches certificates for changes and notifies envoy to hot-restart so that certificates are reloaded live, and acts as a supervisor that restarts envoy when it exits abnormally. Next, let's look at the istio-proxy startup process.

First, let's look at the Agent configuration:

type Agent struct {
    // Envoy configuration: the envoy binary, startup parameters, etc. The mesh-wide
    // defaults can be inspected with `kubectl get cm istio -n istio-system -oyaml`.
    proxyConfig *mesh.ProxyConfig
    // Runtime options envoy needs
    envoyOpts envoy.ProxyConfig
    // The envoy agent instance
    envoyAgent *envoy.Agent
    // Channel used for envoy error handling
    envoyWaitCh chan error
    // SDS server that handles workload certificate requests: envoy asks pilot-agent
    // for its certificate and private key; pilot-agent generates the key and a CSR
    // and sends a signing request to istiod; istiod signs a certificate based on the
    // service identity in the request and returns it to pilot-agent, which hands the
    // certificate and key back to envoy for later envoy-to-envoy mutual authentication.
    sdsServer *sds.Server
    // Used for SDS certificate signing; signing from files is possible,
    // but by default istiod signs workload certificates.
    secretCache *cache.SecretManagerClient
    // xdsProxy is the channel between istiod and envoy: istiod generates config and
    // streams it over the connection to xdsProxy, which inspects the data and
    // forwards it to envoy, which then applies the configuration.
    xdsProxy *XdsProxy
    // Certificate file watcher: on a certificate update event it triggers the update
    // policy, mainly re-signing and pushing regenerated envoy configuration.
    caFileWatcher filewatcher.FileWatcher
}

pilot-agent is then started via wait, err := agent.Run(ctx):

func (a *Agent) Run(ctx context.Context) (func(), error) {
  if socketExists {
    log.Info("SDS socket found. Istio SDS Server won't be started")
  } else {
    log.Info("SDS socket not found. Starting Istio SDS Server")
    // Create the SDS server that handles envoy certificate requests
    err = a.initSdsServer()
    if err != nil {
      return nil, fmt.Errorf("failed to start SDS server: %v", err)
    }
  }
  // Assign some proxy parameters, including the istiod IP, the pod IP, etc.
  // Core component: handles envoy service discovery and communication with istiod
  a.xdsProxy, err = initXdsProxy(a)
  // Fetch the CA root certificate
  rootCAForXDS, err := a.FindRootCAForXDS()
  if err != nil {
    return nil, fmt.Errorf("failed to find root XDS CA: %v", err)
  }
  // Watch the CA certificate so it can be updated dynamically
  go a.caFileWatcherHandler(ctx, rootCAForXDS)
  if !a.EnvoyDisabled() {
    // Initialize envoy configuration (config path, ports, concurrency), mostly mirroring proxyConfig
    err = a.initializeEnvoyAgent(ctx)
    go func() {
      defer a.wg.Done()
      if a.cfg.EnableDynamicBootstrap {
        start := time.Now()
        var err error
        select {
        case err = <-a.envoyWaitCh:
        case <-ctx.Done():
          // Early cancellation before envoy started.
          return
        }
        if err != nil {
          log.Errorf("failed to write updated envoy bootstrap: %v", err)
          return
        }
        log.Infof("received server-side bootstrap in %v", time.Since(start))
      }
      // Start envoy
      a.envoyAgent.Run(ctx)
    }()
  } else if a.WaitForSigterm() {
    // wait for SIGTERM and perform graceful shutdown
    a.wg.Add(1)
    go func() {
      defer a.wg.Done()
      <-ctx.Done()
    }()
  }
  return a.wg.Wait, nil
}

Here is an overview of the security mechanism in istiod:



0. When istiod initializes, discovery's maybeCreateCA method creates istiod's CA root certificate; this CA server signs certificates for every service in the mesh.
1. envoy sends pilot-agent an SDS request, asking for its own certificate and private key.
2. pilot-agent generates a private key and a CSR, and sends a certificate signing request to istiod.
3. istiod authenticates the requesting workload by its service account; if authentication succeeds, it signs a certificate and returns it to pilot-agent.
4. pilot-agent hands the certificate and private key back to envoy over the SDS interface.
5. istiod distributes its CA root certificate to every pod via the apiserver as a mounted configmap.
6. When two envoys communicate, they can mutually authenticate using the private key held by envoy and istiod's CA root certificate mounted via pilot-agent.

Next, let's look at how configuration requests and responses flow between envoy and istiod:

func _AggregatedDiscoveryService_StreamAggregatedResources_Handler(srv interface{}, stream grpc.ServerStream) error {
  return srv.(AggregatedDiscoveryServiceServer).StreamAggregatedResources(&aggregatedDiscoveryServiceStreamAggregatedResourcesServer{stream})
}
// Every time envoy makes a fresh connection to the agent, we reestablish a new connection to the upstream xds
// This ensures that a new connection between istiod and agent doesn't end up consuming pending messages from envoy
// as the new connection may not go to the same istiod. Vice versa case also applies.
func (p *XdsProxy) StreamAggregatedResources(downstream discovery.AggregatedDiscoveryService_StreamAggregatedResourcesServer) error {
  proxyLog.Debugf("accepted XDS connection from Envoy, forwarding to upstream XDS server")
  return p.handleStream(downstream)
}
// Handle requests coming from envoy
func (p *XdsProxy) handleStream(downstream adsStream) error {
  con := &ProxyConnection{
    conID:           connectionNumber.Inc(),
    upstreamError:   make(chan error, 2), // can be produced by recv and send
    downstreamError: make(chan error, 2), // can be produced by recv and send
    requestsChan:    make(chan *discovery.DiscoveryRequest, 10),
    responsesChan:   make(chan *discovery.DiscoveryResponse, 10),
    stopChan:        make(chan struct{}),
    downstream:      downstream,
  }
  // Register the connection with the xds proxy
  p.RegisterStream(con)
  defer p.UnregisterStream(con)
  ctx, cancel := context.WithTimeout(context.Background(), time.Second*5)
  defer cancel()
  // Create the connection to istiod
  upstreamConn, err := p.buildUpstreamConn(ctx)
  // Create the xds client that talks to istiod
  xds := discovery.NewAggregatedDiscoveryServiceClient(upstreamConn)
  ctx = metadata.AppendToOutgoingContext(context.Background(), "ClusterID", p.clusterID)
  for k, v := range p.xdsHeaders {
    ctx = metadata.AppendToOutgoingContext(ctx, k, v)
  }
  // We must propagate upstream termination to Envoy. This ensures that we resume the full XDS sequence on new connection
  return p.HandleUpstream(ctx, con, xds)
}
// Handles communication between envoy and istiod
func (p *XdsProxy) HandleUpstream(ctx context.Context, con *ProxyConnection, xds discovery.AggregatedDiscoveryServiceClient) error {
  upstream, err := xds.StreamAggregatedResources(ctx,
    grpc.MaxCallRecvMsgSize(defaultClientMaxReceiveMessageSize))

  // Handle request messages from envoy to istiod
  go p.handleUpstreamRequest(con)
  // Handle responses from istiod back to envoy
  go p.handleUpstreamResponse(con)
}
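Stripped of gRPC details, the proxy is essentially two pumps wired through channels: one goroutine moves requests from envoy upstream, another moves responses back down, and a stop channel tears both down. A toy version of that wiring (all names are ours):

```go
package main

// pump forwards messages from in to out until in is closed or stop is
// signalled, mirroring the request/response goroutines of the xds proxy.
func pump(in <-chan string, out chan<- string, stop <-chan struct{}) {
	for {
		select {
		case msg, ok := <-in:
			if !ok {
				return
			}
			select {
			case out <- msg:
			case <-stop:
				return
			}
		case <-stop:
			return
		}
	}
}
```

Buffered channels (the real proxy uses capacity 10 for requests and responses) let one side keep producing briefly while the other side is slow, and the stop channel guarantees neither pump blocks forever during teardown.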

In the end, the xds proxy starts two goroutines to handle the requests and responses between envoy and istiod:

func (p *XdsProxy) handleUpstreamRequest(con *ProxyConnection) {
  initialRequestsSent := atomic.NewBool(false)
  go func() {
    for {
      // Receive data from envoy
      req, err := con.downstream.Recv()
      if err != nil {
        select {
        case con.downstreamError <- err:
        case <-con.stopChan:
        }
        return
      }
      // Forward it to istiod
      con.sendRequest(req)
    }
  }()
}
func (p *XdsProxy) handleUpstreamResponse(con *ProxyConnection) {
  for {
    select {
    // Receive data sent by istiod
    case resp := <-con.responsesChan:
      // TODO: separate upstream response handling from requests sending, which are both time costly
      proxyLog.Debugf("response for type url %s", resp.TypeUrl)
      metrics.XdsProxyResponses.Increment()
      // Dispatch and forward based on the response's type URL
      if h, f := p.handlers[resp.TypeUrl]; f {
        if len(resp.Resources) == 0 {
          // Empty response, nothing to do
          // This assumes internal types are always singleton
          break
        }
        err := h(resp.Resources[0])
        var errorResp *google_rpc.Status
        if err != nil {
          errorResp = &google_rpc.Status{
            Code:    int32(codes.Internal),
            Message: err.Error(),
          }
        }
        // Send ACK/NACK
        con.sendRequest(&discovery.DiscoveryRequest{
          VersionInfo:   resp.VersionInfo,
          TypeUrl:       resp.TypeUrl,
          ResponseNonce: resp.Nonce,
          ErrorDetail:   errorResp,
        })
        continue
      }
      switch resp.TypeUrl {
      case v3.ExtensionConfigurationType:
        if features.WasmRemoteLoadConversion {
          // If Wasm remote load conversion feature is enabled, rewrite and send.
          go p.rewriteAndForward(con, resp)
        } else {
          // Forward the data to envoy
          forwardToEnvoy(con, resp)
        }
      }
    case <-con.stopChan:
      return
    }
  }
}
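The ACK/NACK at the end follows the standard xDS contract: echo the response's version and nonce back, attaching an error detail only when applying the config failed. A dependency-free sketch of that rule, with toy structs standing in for the real proto types:

```go
package main

// Toy stand-ins for the xDS DiscoveryResponse/DiscoveryRequest protos.
type response struct {
	VersionInfo, TypeUrl, Nonce string
}

type request struct {
	VersionInfo, TypeUrl, ResponseNonce, ErrorDetail string
}

// ack builds the request that acknowledges resp. If applyErr is
// non-empty the request carries it, turning the ACK into a NACK.
func ack(resp response, applyErr string) request {
	return request{
		VersionInfo:   resp.VersionInfo,
		TypeUrl:       resp.TypeUrl,
		ResponseNonce: resp.Nonce,
		ErrorDetail:   applyErr,
	}
}
```

Echoing the nonce is what lets the server pair each ACK/NACK with the exact response it sent, even when multiple pushes are in flight.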
