本文基于golang 1.13版本分析。

一、前言

1.1 slice 结构

slice实际就是一个struct，在runtime/slice.go中的定义如下：

type slice struct {
    array unsafe.Pointer
    len   int 
    cap   int 
}

// An notInHeapSlice is a slice backed by go:notinheap memory.
type notInHeapSlice struct {
    array *notInHeap
    len   int 
    cap   int 
}

由定义可以看出slice底层是基于数组，本质是对数组的封装。由三部分组成：

指针，指向第一个slice元素对应的底层数组元素地址。
长度，表示切片可用元素的个数，也就是说使用下标对 slice 的元素进行访问时，下标不能超过 slice 的长度；
容量，底层数组的元素个数，容量 >= 长度。在底层数组不进行扩容的情况下，容量也是 slice 可以扩张的最大限度。

slice

内置函数len和cap，分别返回slice的长度和容量。slice使用下标不能超过len，向后扩展不能超过cap。多个不同slice之间可以共享底层的数据，起始地址、长度都可以不同，所以slice第一个元素未必是数组的第一个元素。

slice

1.2 使用中的坑

介绍完slice的数据结构，作者想把可能遇到的坑写在最前，避免后来者持续踩坑。

1.2.1 slice作为函数入参

函数内append slice，修改的仅仅是函数中slice的len，外面的slice len不会发生变化。

对切片进行append的时候，如果底层空间足够就使用原来的空间，如果底层空间不够，那么就会申请新的空间。函数传递切片时，是值传递，不是引用传递，传递的是slice结构体那三个字段的值，所以不会复制slice的实际内容，在函数内append时，修改的仅仅是函数中slice的len／cap，外面的slice len／cap不会发生变化。

1.2.2 slice初始化make与append使用

使用make([]int,len(m))初始化时，已经赋值零值，万万不能配合append使用。

func main()  {
    m := map[int]int{1:10,2:20}
    //=======错误的做法=============
    a := make([]int,len(m))
    for _,v := range m{
        a = append(a,v)
    }
    // [0 0 10 20]
    fmt.Println(a)

    //=======正确的做法=============
    b := make([]int,0,len(m))
    for _,v := range m{
        b = append(b,v)
    }
    //[10 20]
    fmt.Println(b)

    c := make([]int,len(m))
    i := 0
    for _,v := range m{
        c[i] = v
        i++
    }
    //[10 20]
    fmt.Println(c)

    var d []int
    for _,v := range m{
        d = append(d,v)
    }
    //[10 20]
    fmt.Println(d)
}

1.2.3 slice nil值与空值

slice nil值与空值不相等

var s []int          //nil值   
var t = []int{}     //空值
var u = make([]int, 3)[3:]   //空值

fmt.Printf("value of s: %#v\n", s) // value of s: []int(nil)
fmt.Printf("value of t: %#v\n", t)   // value of t: []int{}
fmt.Printf("value of u: %#v\n", u) //value of u: []int{}
fmt.Printf("s is nil? %v\n", s == nil)   //true
fmt.Printf("t is nil? %v\n", t == nil)     //false
fmt.Printf("u is nil? %v\n", u == nil)     //false

区别是，nil slice的底层数组指针是nil，empty slice底层数组指针指向一个长度为0的数组。
所以测试一个slice是否有数据，使用len(s) == 0来判断，而不应用s == nil来判断。
一般的用法是nil slice表示数组不存在，empty slice表示集合为空。序列化json的时候，nil slice会变成null，empty是[]

slice nil & 空值

二、源码解析

2.1 slice 创建

创建 slice 的方式有以下几种：

slice 创建

2.1.1 make

func makeslice(et *_type, len, cap int) unsafe.Pointer {
    mem, overflow := math.MulUintptr(et.size, uintptr(cap))
    if overflow || mem > maxAlloc || len < 0 || len > cap {
        // NOTE: Produce a 'len out of range' error instead of a
        // 'cap out of range' error when someone does make([]T, bignumber).
        // 'cap out of range' is true too, but since the cap is only being
        // supplied implicitly, saying len is clearer.
        // See golang.org/issue/4085.
        mem, overflow := math.MulUintptr(et.size, uintptr(len))
        if overflow || mem > maxAlloc || len < 0 {
            panicmakeslicelen()
        }
        panicmakeslicecap()
    }

    return mallocgc(mem, et, true)
}

先来一小段玩具代码，使用 make 关键字创建 slice：

package main

import "fmt"

func main() {
    slice := make([]int, 5, 10) // 长度为5，容量为10
    slice[2] = 2 // 索引为2的元素赋值为2
    fmt.Println(slice)
}

执行如下命令，得到 Go 汇编代码：

go tool compile -S main.go

我们只关注main函数：

0x0000 00000 (main.go:5)TEXT    "".main(SB), $96-0
0x0000 00000 (main.go:5)MOVQ    (TLS), CX
0x0009 00009 (main.go:5)CMPQ    SP, 16(CX)
0x000d 00013 (main.go:5)JLS     228
0x0013 00019 (main.go:5)SUBQ    $96, SP
0x0017 00023 (main.go:5)MOVQ    BP, 88(SP)
0x001c 00028 (main.go:5)LEAQ    88(SP), BP
0x0021 00033 (main.go:5)FUNCDATA    $0, gclocals·69c1753bd5f81501d95132d08af04464(SB)
0x0021 00033 (main.go:5)FUNCDATA    $1, gclocals·57cc5e9a024203768cbab1c731570886(SB)
0x0021 00033 (main.go:5)LEAQ    type.int(SB), AX
0x0028 00040 (main.go:6)MOVQ    AX, (SP)
0x002c 00044 (main.go:6)MOVQ    $5, 8(SP)
0x0035 00053 (main.go:6)MOVQ    $10, 16(SP)
0x003e 00062 (main.go:6)PCDATA  $0, $0
0x003e 00062 (main.go:6)CALL    runtime.makeslice(SB)
0x0043 00067 (main.go:6)MOVQ    24(SP), AX
0x0048 00072 (main.go:6)MOVQ    32(SP), CX
0x004d 00077 (main.go:6)MOVQ    40(SP), DX
0x0052 00082 (main.go:7)CMPQ    CX, $2
0x0056 00086 (main.go:7)JLS     221
0x005c 00092 (main.go:7)MOVQ    $2, 16(AX)
0x0064 00100 (main.go:8)MOVQ    AX, ""..autotmp_2+64(SP)
0x0069 00105 (main.go:8)MOVQ    CX, ""..autotmp_2+72(SP)
0x006e 00110 (main.go:8)MOVQ    DX, ""..autotmp_2+80(SP)
0x0073 00115 (main.go:8)MOVQ    $0, ""..autotmp_1+48(SP)
0x007c 00124 (main.go:8)MOVQ    $0, ""..autotmp_1+56(SP)
0x0085 00133 (main.go:8)LEAQ    type.[]int(SB), AX
0x008c 00140 (main.go:8)MOVQ    AX, (SP)
0x0090 00144 (main.go:8)LEAQ    ""..autotmp_2+64(SP), AX
0x0095 00149 (main.go:8)MOVQ    AX, 8(SP)
0x009a 00154 (main.go:8)PCDATA  $0, $1
0x009a 00154 (main.go:8)CALL    runtime.convT2Eslice(SB)
0x009f 00159 (main.go:8)MOVQ    16(SP), AX
0x00a4 00164 (main.go:8)MOVQ    24(SP), CX
0x00a9 00169 (main.go:8)MOVQ    AX, ""..autotmp_1+48(SP)
0x00ae 00174 (main.go:8)MOVQ    CX, ""..autotmp_1+56(SP)
0x00b3 00179 (main.go:8)LEAQ    ""..autotmp_1+48(SP), AX
0x00b8 00184 (main.go:8)MOVQ    AX, (SP)
0x00bc 00188 (main.go:8)MOVQ    $1, 8(SP)
0x00c5 00197 (main.go:8)MOVQ    $1, 16(SP)
0x00ce 00206 (main.go:8)PCDATA  $0, $1
0x00ce 00206 (main.go:8)CALL    fmt.Println(SB)
0x00d3 00211 (main.go:9)MOVQ    88(SP), BP
0x00d8 00216 (main.go:9)ADDQ    $96, SP
0x00dc 00220 (main.go:9)RET
0x00dd 00221 (main.go:7)PCDATA  $0, $0
0x00dd 00221 (main.go:7)CALL    runtime.panicindex(SB)
0x00e2 00226 (main.go:7)UNDEF
0x00e4 00228 (main.go:7)NOP
0x00e4 00228 (main.go:5)PCDATA  $0, $-1
0x00e4 00228 (main.go:5)CALL    runtime.morestack_noctxt(SB)
0x00e9 00233 (main.go:5)JMP     0

我们先从上到下扫一眼，看到几个关键函数：

CALL    runtime.makeslice(SB)
CALL    runtime.convT2Eslice(SB)
CALL    fmt.Println(SB)
CALL    runtime.morestack_noctxt(SB)

make逻辑

1是创建 slice 相关的；
2是类型转换；调用 fmt.Println需要将 slice 作一个转换；
3是打印语句；
4是栈空间扩容函数，在函数开始处，会检查当前栈空间是否足够，不够的话需要调用它来进行扩容。
接下来，我们详细分析这整个过程。

左边是栈上的数据，右边是堆上的数据。array 指向 slice 的底层数据，被分配到堆上了。注意，栈上的地址是从高向低增长；堆则从低向高增长。栈左边的数字表示对应的汇编代码的行数，栈右边箭头则表示栈地址。（48）SP、（56）SP 表示的内容接着往下看。

注意，在图中，栈地址是从下往上增长，所以 SP 表示的是图中 *_type 所在的位置，其它的依此类推

convT2Eslice 的函数声明如下：

func convT2Eslice(t *_type, elem unsafe.Pointer) (e eface)

第一个参数是指针 *_type，_type是一个表示类型的结构体，这里传入的就是 slice的类型 []int；第二个参数则是元素的指针，这里传入的就是 slice 底层数组的首地址。

返回值 eface 的结构体定义如下：

type eface struct {
    _type *_type
    data  unsafe.Pointer
}

由于我们会调用 fmt.Println(slice)，看下函数原型：

func Println(a ...interface{}) (n int, err error)

Println 接收 interface 类型，因此我们需要将 slice 转换成 interface 类型。由于 slice 没有方法，是个“空 interface”。因此会调用 convT2Eslice 完成这一转换过程。

convT2Eslice 函数返回的是类型指针和数据地址。源码就不贴了，大体流程是：调用 mallocgc 分配一块内存，把数据 copy 进到新的内存，然后返回这块内存的地址，*_type 则直接返回传入的参数。

32(SP) 和 40(SP) 其实是 makeslice 函数的返回值，这里可以忽略。

还剩 fmt.Println(slice) 最后一个函数调用了，我们继续。

所以调用 fmt.Println(slice) 时，实际是传入了一个 slice类型的eface地址。这样，Println就可以访问类型中的数据，最终给“打印”出来。

最后，我们看下 main 函数栈帧的开始和收尾部分。

0x0013 00019 (main.go:5)SUBQ    $96, SP
0x0017 00023 (main.go:5)MOVQ    BP, 88(SP)
0x001c 00028 (main.go:5)LEAQ    88(SP), BP
…………………………
0x00d3 00211 (main.go:9)MOVQ    88(SP), BP
0x00d8 00216 (main.go:9)ADDQ    $96, SP
RET

BP可以理解为保存了当前函数栈帧栈底的地址，SP则保存栈顶的地址。

初始，BP 和 SP 分别有一个初始状态。

main 函数执行的时候，先根据 main 函数栈帧大小确定 SP 的新指向，使得 main 函数栈帧大小达到 96B。之后把老的 BP 保存到 main 函数栈帧的底部，并使 BP 寄存器重新指向新的栈底，也就是 main 函数栈帧的栈底。

最后，当 main 函数执行完毕，把它栈底的 BP 给回弹回到 BP 寄存器，恢复调用前的初始状态。一切都像是没有发生一样，完美的现场。

2.2 slice 扩容

// growslice handles slice growth during append.
// It is passed the slice element type, the old slice, and the desired new minimum capacity,
// and it returns a new slice with at least that capacity, with the old data
// copied into it.
// The new slice's length is set to the old slice's length,
// NOT to the new requested capacity.
// This is for codegen convenience. The old slice's length is used immediately
// to calculate where to write new values during an append.
// TODO: When the old backend is gone, reconsider this decision.
// The SSA backend might prefer the new length or to return only ptr/cap and save stack space.
func growslice(et *_type, old slice, cap int) slice {
    if raceenabled {
        callerpc := getcallerpc()
        racereadrangepc(old.array, uintptr(old.len*int(et.size)), callerpc, funcPC(growslice))
    }
    if msanenabled {
        msanread(old.array, uintptr(old.len*int(et.size)))
    }

    if cap < old.cap {
        panic(errorString("growslice: cap out of range"))
    }

    if et.size == 0 {
        // append should not create a slice with nil pointer but non-zero len.
        // We assume that append doesn't need to preserve old.array in this case.
        return slice{unsafe.Pointer(&zerobase), old.len, cap}
    }

    newcap := old.cap
    doublecap := newcap + newcap
    if cap > doublecap {
        newcap = cap
    } else {
        if old.len < 1024 {
            newcap = doublecap
        } else {
            // Check 0 < newcap to detect overflow
            // and prevent an infinite loop.
            for 0 < newcap && newcap < cap {
                newcap += newcap / 4
            }
            // Set newcap to the requested cap when
            // the newcap calculation overflowed.
            if newcap <= 0 {
                newcap = cap
            }
        }
    }

    var overflow bool
    var lenmem, newlenmem, capmem uintptr
    // Specialize for common values of et.size.
    // For 1 we don't need any division/multiplication.
    // For sys.PtrSize, compiler will optimize division/multiplication into a shift by a constant.
    // For powers of 2, use a variable shift.
    switch {
    case et.size == 1:
        lenmem = uintptr(old.len)
        newlenmem = uintptr(cap)
        capmem = roundupsize(uintptr(newcap))
        overflow = uintptr(newcap) > maxAlloc
        newcap = int(capmem)
    case et.size == sys.PtrSize:
        lenmem = uintptr(old.len) * sys.PtrSize
        newlenmem = uintptr(cap) * sys.PtrSize
        capmem = roundupsize(uintptr(newcap) * sys.PtrSize)
        overflow = uintptr(newcap) > maxAlloc/sys.PtrSize
        newcap = int(capmem / sys.PtrSize)
    case isPowerOfTwo(et.size):
        var shift uintptr
        if sys.PtrSize == 8 {
            // Mask shift for better code generation.
            shift = uintptr(sys.Ctz64(uint64(et.size))) & 63
        } else {
            shift = uintptr(sys.Ctz32(uint32(et.size))) & 31
        }
        lenmem = uintptr(old.len) << shift
        newlenmem = uintptr(cap) << shift
        capmem = roundupsize(uintptr(newcap) << shift)
        overflow = uintptr(newcap) > (maxAlloc >> shift)
        newcap = int(capmem >> shift)
    default:
        lenmem = uintptr(old.len) * et.size
        newlenmem = uintptr(cap) * et.size
        capmem, overflow = math.MulUintptr(et.size, uintptr(newcap))
        capmem = roundupsize(capmem)
        newcap = int(capmem / et.size)
    }

    // The check of overflow in addition to capmem > maxAlloc is needed
    // to prevent an overflow which can be used to trigger a segfault
    // on 32bit architectures with this example program:
    //
    // type T [1<<27 + 1]int64
    //
    // var d T
    // var s []T
    //
    // func main() {
    //   s = append(s, d, d, d, d)
    //   print(len(s), "\n")
    // }
    if overflow || capmem > maxAlloc {
        panic(errorString("growslice: cap out of range"))
    }

    var p unsafe.Pointer
    if et.ptrdata == 0 {
        p = mallocgc(capmem, nil, false)
        // The append() that calls growslice is going to overwrite from old.len to cap (which will be the new length).
        // Only clear the part that will not be overwritten.
        memclrNoHeapPointers(add(p, newlenmem), capmem-newlenmem)
    } else {
        // Note: can't use rawmem (which avoids zeroing of memory), because then GC can scan uninitialized memory.
        p = mallocgc(capmem, et, true)
        if lenmem > 0 && writeBarrier.enabled {
            // Only shade the pointers in old.array since we know the destination slice p
            // only contains nil pointers because it has been cleared during alloc.
            bulkBarrierPreWriteSrcOnly(uintptr(p), uintptr(old.array), lenmem)
        }
    }
    memmove(p, old.array, lenmem)

    return slice{p, old.len, newcap}
}

网上大多数的文章都是这样描述的：

当原 slice 容量小于 1024 的时候，新 slice 容量变成原来的 2 倍；原 slice 容量超过 1024，新 slice 容量变成原来的1.25倍。

我在这里先说结论：以上描述是错误的。

比如下面的代码，容量各自是多少呢？

package main

import "fmt"

func main() {
    a := []byte{1, 0}
    a = append(a, 1, 1, 1)
    fmt.Println("cap of a is ",cap(a))
    
    b := []int{23, 51}
    b = append(b, 4, 5, 6)
    fmt.Println("cap of b is ",cap(b))
    
    c := []int32{1, 23}
    c = append(c, 2, 5, 6)
    fmt.Println("cap of c is ",cap(c))

    type D struct{
        age byte
        name string

    }
    d := []D{
        {1,"123"},
        {2,"234"},
    }

    d = append(d,D{4,"456"},D{5,"567"},D{6,"678"})
    fmt.Println("cap of d is ",cap(d))
}

应该是4个8？基于翻倍的思路，cap从2->4->8。
或者4个5？给4个5的猜测基于以下推测：如果在append多个元素的时候，一次扩容不足以满足元素的放置，如果我是设计者，我会先预估好需要多少容量才可以放置元素，然后再进行一次扩容，好处就是，不需要频繁申请新的底层数组，以及不需要频繁的数据copy。
但是结果有点出人意料。

cap of a is  8
cap of b is  6
cap of c is  8
cap of d is  5

在发生append那一行代码打上断点，然后开始运行程序，为了比较好的说明情况，断点打到扩容后容量为6的[]int型切片b的append上。

a. 传进来的cap是5，也就是上文提及到的思路目前来看是正确的，当append多个元素的时候，先预估好容量再进行扩容。
b. slice是一个struct，而struct是值类型。
c. 用capmem进行内存分配，然后将newcap作为新的slice的cap，我们来分析这一步capmem = roundupsize(uintptr(newcap) * sys.PtrSize)。
round-up，向上取整，roundupsize，向上取一个size。(uintptr(newcap) * sys.PtrSize)的乘积应该为5*8=40，经过向上取整之后得到了新的所需内存capmem=48，接着所需内存/类型大小int(capmem / sys.PtrSize)，得到了新的容量，也就是6.

2.2.1 slice 内存对齐

要明白roundupsize为什么会将40变为48，这里需要简单的引进go的内存管理。
对象大小表，大体如下：

// class  bytes/obj  bytes/span  objects  tail waste  max waste
//     1          8        8192     1024           0     87.50%
//     2         16        8192      512           0     43.75%
//     3         32        8192      256           0     46.88%
//     4         48        8192      170          32     31.52%
//     5         64        8192      128           0     23.44%
//     6         80        8192      102          32     19.07%
//     7         96        8192       85          32     15.95%
//     8        112        8192       73          16     13.56%
//     9        128        8192       64           0     11.72%
//    10        144        8192       56         128     11.82%

//    ...
//    65      28672       57344        2           0      4.91%
//    66      32768       32768        1           0     12.50%

最小是8b，最大是32K，还有一类就是超出32K的，共67类（超出32K没列在这个文件的，66+1=67）。可以看到，并没有size为40的类型，于是40向上取整，取到了48，这就是发生在roundupsize的真相。这里有一个比较专业的名词，内存对齐。

三、并发

由于 slice/map 是引用类型，golang函数是传值调用，所用参数副本依然是原来的 slice，并发访问同一个资源会导致竟态条件。

3.1 解决方案

3.1.1 方案 1: 加锁

func main() {
    slc := make([]int, 0, 1000)
    var wg sync.WaitGroup
    var lock sync.Mutex

    for i := 0; i < 1000; i++ {
        wg.Add(1)
        go func(a int) {
            defer wg.Done()
      // 加锁
            lock.Lock()
            defer lock.Unlock()
            slc = append(slc, a)
        }(i)
    
    }
   wg.Wait()
    fmt.Println(len(slc))
}

优点是比较简单，适合对性能要求不高的场景。

3.1.2 方案 2: 使用 channel 串行化操作

type ServiceData struct {
    ch   chan int // 用来 同步的channel
    data []int    // 存储数据的slice
}

func (s *ServiceData) Schedule() {
    // 从 channel 接收数据
    for i := range s.ch {
        s.data = append(s.data, i)
    }
}

func (s *ServiceData) Close() {
    // 最后关闭 channel
    close(s.ch)
}

func (s *ServiceData) AddData(v int) {
    s.ch <- v // 发送数据到 channel
}

func NewScheduleJob(size int, done func()) *ServiceData {
    s := &ServiceData{
        ch:   make(chan int, size),
        data: make([]int, 0),
    }

    go func() {
        // 并发地 append 数据到 slice
        s.Schedule()
        done()
    }()

    return s
}

func main() {
    var (
        wg sync.WaitGroup
        n  = 1000
    )
    c := make(chan struct{})

    // new 了这个 job 后，该 job 就开始准备从 channel 接收数据了
    s := NewScheduleJob(n, func() { c <- struct{}{} })

    wg.Add(n)
    for i := 0; i < n; i++ {
        go func(v int) {
            defer wg.Done()
            s.AddData(v)

        }(i)
    }

    wg.Wait()
    s.Close()
    <-c

    fmt.Println(len(s.data))
}

实现相对复杂，优点是性能很好，利用了channel的优势。

go slice深度解析