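Overall workflow
- Fetch the code to the chosen location and switch to the target branch
- Run make build LOCAL_BUILD=true to compile the binary
- Run make image LOCAL_BUILD=true to build the container image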
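The three steps above can be sketched as one shell function. This is only a sketch: the function name build_calico_node, the run helper, and the DRY_RUN switch are illustrative and not part of the Calico repository; the repository URL and the make targets are the ones used in these notes. It assumes a git version that supports -C (1.8.5 or later).

```shell
#!/bin/sh
# Sketch of the workflow. Set DRY_RUN=1 to print the commands instead of running them.

# run CMD...: execute the command, or only print it when DRY_RUN=1 (illustrative helper).
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# build_calico_node SRC_DIR REF (hypothetical helper name)
#   SRC_DIR: where the node repository is, or will be, checked out
#   REF:     branch, tag, or commit to build
build_calico_node() {
    src_dir=$1
    ref=$2

    # Step 1: fetch the code (only if it is not there yet) and switch to the wanted ref.
    if [ ! -d "$src_dir/.git" ]; then
        run git clone https://github.com/projectcalico/node.git "$src_dir"
    fi
    run git -C "$src_dir" checkout "$ref"

    # Step 2: compile the binary. Step 3: build the container image.
    run make -C "$src_dir" build LOCAL_BUILD=true
    run make -C "$src_dir" image LOCAL_BUILD=true
}

# Example (prints the commands only):
#   DRY_RUN=1 build_calico_node /root/go/src/github.com/projectcalico/node v3.20.1
```

Printing the commands first is a cheap way to confirm the paths and the ref before starting a long build.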
Dependency analysis
Images required to build the calico-node image
- calico/go-build:v0.54
- registry.access.redhat.com/ubi8/ubi-minimal:8.4
- calico/bird:v0.3.3-182-g4b493986-amd64
- calico/bpftool:v5.3-amd64
- centos:8
Software dependencies
- go
- git
- gcc
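A quick way to see whether these tools are present on the host before starting is sketched below. The helper name check_tools is illustrative; it only checks that each tool is on PATH and does not check minimum versions, which these notes do not specify.

```shell
#!/bin/sh
# check_tools TOOL...: report which of the given tools are on PATH.
# Returns 0 when all are found, 1 when at least one is missing.
check_tools() {
    missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found:   $tool ($(command -v "$tool"))"
        else
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}

# Example:
#   check_tools go git gcc || echo "install or upgrade the missing tools first"
#   go version; git --version; gcc --version | head -n 1
```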
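Debugging the felix process inside the container from the host
1. Modify the Makefile in the node repository
# Disable compiler optimizations and inlining
GCFLAGS=-gcflags "all=-N -l"
# Pass $(GCFLAGS) to the go build command
$(DOCKER_GO_BUILD_CGO) sh -c '$(GIT_CONFIG_SSH) go build -v -o $@ $(BUILD_FLAGS) $(GCFLAGS) $(LDFLAGS) ./cmd/calico-node/main.go'
# Hard-code CGO_ENABLED=0. CGO_ENABLED=1 may cause problems while debugging; this has not been confirmed yet.
CGO_ENABLED=0
2. Edit the calico-node DaemonSet and disable the readiness and liveness probes.
3. Download and install dlv on the host: https://github.com/derekparker/delve
4. Run make build LOCAL_BUILD=true && make image LOCAL_BUILD=true
5. Replace the calico-node image with the newly built image.
6. On the host, run dlv attach `pgrep -f "calico-node -felix"`
7. felix is managed by runsv. If felix stays stopped at a breakpoint for a long time, runsv restarts it. For a standalone debugging session, take felix out of runsv's management.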
[root@master-9 node]# dlv attach `pgrep -f "calico-node -felix"`
Type 'help' for list of commands.
(dlv) b neigh_linux.go:119
Breakpoint 1 set at 0x2297573 for github.com/vishvananda/netlink.(*Handle).neighAdd() /go/pkg/mod/github.com/vishvananda/netlink@v1.1.1-0.20210703095558-21f2c55a7727/neigh_linux.go:119
(dlv) c
> github.com/vishvananda/netlink.(*Handle).neighAdd() /go/pkg/mod/github.com/vishvananda/netlink@v1.1.1-0.20210703095558-21f2c55a7727/neigh_linux.go:119 (hits goroutine(788):1 total:1) (PC: 0x2297573)
114: return pkgHandle.neighAdd(neigh, mode)
115: }
116:
117: // NeighAppend will append an entry to FDB
118: // Equivalent to: `bridge fdb append...`
=> 119: func (h *Handle) neighAdd(neigh *Neigh, mode int) error {
120: req := h.newNetlinkRequest(unix.RTM_NEWNEIGH, mode|unix.NLM_F_ACK)
121: return neighHandle(neigh, req)
122: }
123:
124: // NeighDel will delete an IP address from a link device.
(dlv) c
> github.com/vishvananda/netlink.(*Handle).neighAdd() /go/pkg/mod/github.com/vishvananda/netlink@v1.1.1-0.20210703095558-21f2c55a7727/neigh_linux.go:119 (hits goroutine(788):2 total:2) (PC: 0x2297573)
114: return pkgHandle.neighAdd(neigh, mode)
115: }
116:
117: // NeighAppend will append an entry to FDB
118: // Equivalent to: `bridge fdb append...`
=> 119: func (h *Handle) neighAdd(neigh *Neigh, mode int) error {
120: req := h.newNetlinkRequest(unix.RTM_NEWNEIGH, mode|unix.NLM_F_ACK)
121: return neighHandle(neigh, req)
122: }
123:
124: // NeighDel will delete an IP address from a link device.
(dlv) c
> github.com/vishvananda/netlink.(*Handle).neighAdd() /go/pkg/mod/github.com/vishvananda/netlink@v1.1.1-0.20210703095558-21f2c55a7727/neigh_linux.go:119 (hits goroutine(788):3 total:3) (PC: 0x2297573)
114: return pkgHandle.neighAdd(neigh, mode)
115: }
116:
117: // NeighAppend will append an entry to FDB
118: // Equivalent to: `bridge fdb append...`
=> 119: func (h *Handle) neighAdd(neigh *Neigh, mode int) error {
120: req := h.newNetlinkRequest(unix.RTM_NEWNEIGH, mode|unix.NLM_F_ACK)
121: return neighHandle(neigh, req)
122: }
123:
124: // NeighDel will delete an IP address from a link device.
(dlv) c
> github.com/vishvananda/netlink.(*Handle).neighAdd() /go/pkg/mod/github.com/vishvananda/netlink@v1.1.1-0.20210703095558-21f2c55a7727/neigh_linux.go:119 (hits goroutine(788):4 total:4) (PC: 0x2297573)
114: return pkgHandle.neighAdd(neigh, mode)
115: }
116:
117: // NeighAppend will append an entry to FDB
118: // Equivalent to: `bridge fdb append...`
=> 119: func (h *Handle) neighAdd(neigh *Neigh, mode int) error {
120: req := h.newNetlinkRequest(unix.RTM_NEWNEIGH, mode|unix.NLM_F_ACK)
121: return neighHandle(neigh, req)
122: }
123:
124: // NeighDel will delete an IP address from a link device.
(dlv) c
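Steps 2 and 5 above can be done with kubectl. The sketch below rests on assumptions that are not stated in these notes: the DaemonSet lives in the kube-system namespace, both the DaemonSet and its container are named calico-node, and calico-node is the first container in the pod template. The helper names and the DRY_RUN switch are illustrative.

```shell
#!/bin/sh
# run CMD...: execute the command, or only print it when DRY_RUN=1 (illustrative helper).
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Assumed defaults; override them to match the cluster.
NS=${NS:-kube-system}
DS=${DS:-calico-node}

# Step 2: remove the liveness and readiness probes from the first container.
# A JSON patch "remove" fails if the path does not exist, so check the manifest first.
disable_probes() {
    run kubectl -n "$NS" patch daemonset "$DS" --type=json -p '[
  {"op": "remove", "path": "/spec/template/spec/containers/0/livenessProbe"},
  {"op": "remove", "path": "/spec/template/spec/containers/0/readinessProbe"}
]'
}

# Step 5: point the calico-node container at the freshly built image.
use_debug_image() {
    image=$1
    run kubectl -n "$NS" set image "daemonset/$DS" "calico-node=$image"
}

# Example (prints the commands only; the image tag is a placeholder):
#   DRY_RUN=1 disable_probes
#   DRY_RUN=1 use_debug_image calico/node:debug
```

With the probes removed, the kubelet no longer restarts the container while felix is stopped at a breakpoint; the restart by runsv described in step 7 is a separate mechanism and is not covered by this sketch.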
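Some details
- Building calico-node with go build directly on a CentOS host requires newer versions of go, git, gcc (compiling the BPF programs needs a recent gcc), and so on. These all have to be built and installed from source, because CentOS does not provide usable RPM packages for them.
  https://www.cnblogs.com/hi-blog/p/how-to-update-git-on-centos7.html
- The build downloads a lot of material from the internet, so the host needs a working proxy configuration.
mkdir /root/go/src/github.com/projectcalico/ -p
cd /root/go/src/github.com/projectcalico/
git clone https://github.com/projectcalico/node.git
cd node
# List the remote tags
git ls-remote --tags
git checkout -b v3.20.1 1cf251788746268ebd0ebb6bc5b55f1d7c682f81
- The build also downloads many files from inside containers, so docker run needs the proxy as well. Edit the Makefile and add the proxy to the docker run invocation: -e http_proxy=http://9.9.9.134:8118 -e https_proxy=http://9.9.9.134:8118
- Comment out the step that fetches Makefile.common; it does not need to be downloaded again.
- Build with make build LOCAL_BUILD=true. Changing calico-node code may require changing felix, libcalico, and other components at the same time, so LOCAL_BUILD is needed to replace those dependencies in go.mod.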
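The proxy flags for docker run mentioned above can be generated from a single proxy URL, which keeps the two variables in sync. This is a sketch; the helper name proxy_env_args is illustrative, and the busybox image in the example is only a placeholder for checking that the variables reach the container.

```shell
#!/bin/sh
# proxy_env_args PROXY_URL: print the docker run flags that pass the proxy
# into a container as http_proxy and https_proxy.
proxy_env_args() {
    proxy=$1
    printf '%s' "-e http_proxy=$proxy -e https_proxy=$proxy"
}

# Example: confirm the variables are visible inside a throwaway container.
# The command substitution is left unquoted on purpose so it splits into separate flags.
#   docker run --rm $(proxy_env_args http://9.9.9.134:8118) busybox env | grep -i _proxy
```

In the Makefile itself the same two -e flags are added to the docker run command line, as described in the bullet above.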
# Assign values to Go variables at build time through the linker; a handy trick
LDFLAGS=-ldflags "\
-X $(PACKAGE_NAME)/pkg/lifecycle/startup.VERSION=$(GIT_VERSION) \
-X $(PACKAGE_NAME)/buildinfo.GitVersion=$(GIT_DESCRIPTION) \
-X $(PACKAGE_NAME)/buildinfo.BuildDate=$(DATE) \
-X $(PACKAGE_NAME)/buildinfo.GitRevision=$(GIT_COMMIT)"
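To see the -X linker flag at work outside the Calico tree, the sketch below builds a throwaway Go program and overrides two string variables at link time. Everything here is illustrative: the helper names x_flags and ldflags_demo, the module name example.com/ldflagsdemo, and the variable names Version and GitCommit are not taken from the Calico sources. It needs a Go toolchain on PATH.

```shell
#!/bin/sh
# x_flags PKG VERSION COMMIT: compose the -X assignments passed to the Go linker.
x_flags() {
    printf '%s' "-X $1.Version=$2 -X $1.GitCommit=$3"
}

# ldflags_demo VERSION COMMIT: build and run a tiny program whose package-level
# string variables are overwritten at link time.
ldflags_demo() {
    tmp=$(mktemp -d) || return 1
    cat > "$tmp/main.go" <<'EOF'
package main

import "fmt"

// Both variables keep these defaults unless -ldflags "-X main.<Name>=..." is given.
var (
	Version   = "unset"
	GitCommit = "unset"
)

func main() { fmt.Println("version:", Version, "commit:", GitCommit) }
EOF
    (
        cd "$tmp" &&
        go mod init example.com/ldflagsdemo >/dev/null 2>&1 &&
        go build -ldflags "$(x_flags main "$1" "$2")" -o demo . &&
        ./demo
    )
    status=$?
    rm -rf "$tmp"
    return "$status"
}

# Example:
#   ldflags_demo v3.20.1 1cf2517
```

The -X flag only works on package-level string variables, which is why version information is usually kept in plain strings.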
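Changes needed for building on an internal network
Compiling calico-node and building its image is a fairly involved process. Doing it on an internal network without internet access requires quite a few adaptations.
- Download the images used by the build in advance.
- Download the Go dependencies in advance with go mod.
- Prepare in advance the files that the Makefile would otherwise download.
- Dockerfile.amd64 has two parts. Write a separate Dockerfile that turns the first part into a base image, for example calico/node:base, so that it is not rebuilt on every build.
- For the second part of Dockerfile.amd64, write another Dockerfile that only copies the calico-node binary into the image.
- Adjust the Makefile and the Dockerfiles to build this way.
- Change the GOMODCACHE path and mount it into the build container.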
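The first two preparation steps can be sketched as follows: pull and bundle the images on a machine with internet access, load them on the internal host, and pre-download the Go modules into a cache directory that can be copied across. The helper names, the DRY_RUN switch, and the paths in the example are illustrative; the image list is the one from the dependency analysis in these notes. GOMODCACHE is assumed to be supported by the Go version in use (Go 1.15 or later).

```shell
#!/bin/sh
# run CMD...: execute the command, or only print it when DRY_RUN=1 (illustrative helper).
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Images needed by the build, one per line.
IMAGES="calico/go-build:v0.54
registry.access.redhat.com/ubi8/ubi-minimal:8.4
calico/bird:v0.3.3-182-g4b493986-amd64
calico/bpftool:v5.3-amd64
centos:8"

# export_images TARBALL: on the connected machine, pull every image and save them into one tar file.
export_images() {
    tarball=$1
    for img in $IMAGES; do
        run docker pull "$img"
    done
    # $IMAGES is left unquoted on purpose so each image becomes its own argument.
    run docker save -o "$tarball" $IMAGES
}

# import_images TARBALL: on the internal build host, load the bundled images.
import_images() {
    run docker load -i "$1"
}

# prefetch_modules SRC_DIR CACHE_DIR: download the modules listed in go.mod into CACHE_DIR.
prefetch_modules() {
    run sh -c 'cd "$1" && GOMODCACHE="$2" go mod download' sh "$1" "$2"
}

# Example (prints the commands only):
#   DRY_RUN=1 export_images /tmp/calico-build-images.tar
#   DRY_RUN=1 prefetch_modules /root/go/src/github.com/projectcalico/node /data/gomodcache
```

The cache directory filled by prefetch_modules is the one that would then be mounted into the build container, as the last bullet above describes.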
Dockerfile for the second part
# Copy everything into a fresh scratch image so that naive CVE scanners don't pick up binaries and libraries
# that have been removed in our later layers.
FROM calico/node:base as calico_base
FROM scratch
# Declare the build argument so that $GIT_VERSION expands in the LABEL below
ARG GIT_VERSION
COPY --from=calico_base / /
# Add in top-level license file
COPY LICENSE /licenses
# Copy in the calico-node binary
COPY dist/bin/calico-node-amd64 /bin/calico-node
CMD ["start_runit"]
# Required labels for certification
LABEL name="Calico node" \
vendor="Project Calico" \
version=$GIT_VERSION \
release="1" \
summary="Calico node handles networking and policy for Calico" \
description="Calico node handles networking and policy for Calico" \
maintainer="laurence@tigera.io"
# Tell sv where to find the services.
ENV SVDIR=/etc/service/enabled
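Building the two images could then look like the sketch below. The file names Dockerfile.base and Dockerfile.final, the helper names, and the final image tag are hypothetical; calico/node:base is the base image name suggested in these notes. The --build-arg value only reaches the version label if the Dockerfile declares ARG GIT_VERSION in the final stage.

```shell
#!/bin/sh
# run CMD...: execute the command, or only print it when DRY_RUN=1 (illustrative helper).
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# build_base DOCKERFILE: build the first part once and tag it as the reusable base image.
build_base() {
    run docker build -f "$1" -t calico/node:base .
}

# build_final DOCKERFILE TAG VERSION: build the second part, which only adds the
# freshly compiled calico-node binary on top of the base image.
build_final() {
    dockerfile=$1
    tag=$2
    version=$3
    run docker build -f "$dockerfile" --build-arg "GIT_VERSION=$version" -t "$tag" .
}

# Example (prints the commands only; run from the node repository root):
#   DRY_RUN=1 build_base Dockerfile.base
#   DRY_RUN=1 build_final Dockerfile.final calico/node:v3.20.1-debug v3.20.1
```

Only build_final has to be repeated after each code change, which is the point of splitting Dockerfile.amd64 in two.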