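Before the change, the timeout parameter passed in was not handled at all. ExecSync called GetExecConfig once every 100 ms, sleeping with time.Sleep(100 * time.Millisecond) in between, and used the Running field to decide whether the exec had finished.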
// ExecSync executes a command in the container, and returns the stdout output.
// If command exits with a non-zero exit code, an error is returned.
func (c *CriManager) ExecSync(ctx context.Context, r *runtime.ExecSyncRequest) (*runtime.ExecSyncResponse, error) {
	// TODO: handle timeout.
	id := r.GetContainerId()

	createConfig := &apitypes.ExecCreateConfig{
		Cmd: r.GetCmd(),
	}
	execid, err := c.ContainerMgr.CreateExec(ctx, id, createConfig)
	if err != nil {
		return nil, fmt.Errorf("failed to create exec for container %q: %v", id, err)
	}

	var output bytes.Buffer
	startConfig := &apitypes.ExecStartConfig{}
	attachConfig := &AttachConfig{
		Stdout:    true,
		Stderr:    true,
		MemBuffer: &output,
	}
	err = c.ContainerMgr.StartExec(ctx, execid, startConfig, attachConfig)
	if err != nil {
		return nil, fmt.Errorf("failed to start exec for container %q: %v", id, err)
	}

	var execConfig *ContainerExecConfig
	for {
		execConfig, err = c.ContainerMgr.GetExecConfig(ctx, execid)
		if err != nil {
			return nil, fmt.Errorf("failed to inspect exec for container %q: %v", id, err)
		}
		// Loop until exec finished.
		if !execConfig.Running {
			break
		}
		time.Sleep(100 * time.Millisecond)
	}

	var stderr []byte
	if execConfig.Error != nil {
		stderr = []byte(execConfig.Error.Error())
	}

	return &runtime.ExecSyncResponse{
		Stdout:   output.Bytes(),
		Stderr:   stderr,
		ExitCode: int32(execConfig.ExitCode),
	}, nil
}
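After the change, the MemBuffer field in the AttachConfig struct was replaced with a pipe, which is of type io.PipeWriter.
io.Copy is then used to determine whether StartExec has completed. In effect, one end of the pipe is handed to containerio, while the read side stays in ExecSync, which reads the data out into a buffer. io.Copy returns in two cases: it returns immediately when it hits an error, and it also returns when it reads EOF, in which case the returned error is nil.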
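This neatly removed the frequent polling of pouchd. However, after this PR was merged, the CRI tests started failing intermittently, even on documentation-only PRs that had nothing to do with the code.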
• Failure [0.653 seconds]
[k8s.io] Security Context
/home/travis/gopath/src/github.com/kubernetes-incubator/cri-tools/pkg/framework/framework.go:72
SeccompProfilePath
/home/travis/gopath/src/github.com/kubernetes-incubator/cri-tools/pkg/validate/security_context.go:411
runtime should support an seccomp profile that blocks setting hostname with SYS_ADMIN [It]
/home/travis/gopath/src/github.com/kubernetes-incubator/cri-tools/pkg/validate/security_context.go:517
cmd [hostname ANewHostName], stdout "hostname: sethostname: Operation not permitted\n", stderr ""
Expected an error to have occurred. Got:
: nil
/home/travis/gopath/src/github.com/kubernetes-incubator/cri-tools/pkg/validate/security_context.go:1046
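The root cause is not in the CRI layer but in pouchd. ExecSync decides that the exec has finished by checking whether its IO has completed: once the IO is closed, io.Copy returns and the exit code is taken from execConfig. If pouchd closes the IO before it updates execConfig, ExecSync can see a stale exit code, which matches the failure above, where an error was expected but nil was returned.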
The fix is therefore simple: in pouchd's execExitedAndRelease method, move the code that closes the IO so that it runs after execConfig has been updated.