Go 中一个非典型不加锁读写变量案例分析

前段时间在 v2 看到一个关于并发读写变量的问题:go 一个线程写, 另外一个线程读, 为什么不能保证最终一致性。帖子中给出的例子非常简单(稍作修改)main.go

package main

import (
    "fmt"
    "runtime"
    "time"
)

var i = 0

func main() {
    runtime.GOMAXPROCS(2)
    go func() {
        for {
            fmt.Println("i am here", i)
            time.Sleep(time.Second)
        }
    }()
    for {
        i += 1
    }
}

既然是问题贴,直接运行的结果应该是出乎大多数人预料的:

╰─➤  go run main.go                                                                                                                                     1 ↵
i am here 0
i am here 0
i am here 0
i am here 0
i am here 0
i am here 0
...

帖子的回复比较多,涉及的信息量相对杂乱,爬完楼反而感觉没有看懂。这里就不卖关子,直接给出脱水后的结论:出现上面结果的原因是 go 的编译器把代码 i 自加 1 的 for 循环优化掉了。要验证这一点也很简单,我们使用 go tool objdump -s 'main\.main' main 查看编译出的二进制可执行文件的汇编代码:

╰─➤  go tool objdump -s 'main\.main' main
TEXT main.main(SB) /Users/liudanking/code/golang/gopath/src/test/main.go
  main.go:11        0x108de60       65488b0c25a0080000  MOVQ GS:0x8a0, CX
  main.go:11        0x108de69       483b6110        CMPQ 0x10(CX), SP
  main.go:11        0x108de6d       7635            JBE 0x108dea4
  main.go:11        0x108de6f       4883ec18        SUBQ $0x18, SP
  main.go:11        0x108de73       48896c2410      MOVQ BP, 0x10(SP)
  main.go:11        0x108de78       488d6c2410      LEAQ 0x10(SP), BP
  main.go:12        0x108de7d       48c7042402000000    MOVQ $0x2, 0(SP)
  main.go:12        0x108de85       e8366bf7ff      CALL runtime.GOMAXPROCS(SB)
  main.go:13        0x108de8a       c7042400000000      MOVL $0x0, 0(SP)
  main.go:13        0x108de91       488d05187f0300      LEAQ go.func.*+115(SB), AX
  main.go:13        0x108de98       4889442408      MOVQ AX, 0x8(SP)
  main.go:13        0x108de9d       e8fe13faff      CALL runtime.newproc(SB)
  main.go:20        0x108dea2       ebfe            JMP 0x108dea2
  main.go:11        0x108dea4       e8c7dffbff      CALL runtime.morestack_noctxt(SB)
  main.go:11        0x108dea9       ebb5            JMP main.main(SB)
  :-1           0x108deab       cc          INT $0x3
  :-1           0x108deac       cc          INT $0x3
  :-1           0x108dead       cc          INT $0x3
  :-1           0x108deae       cc          INT $0x3
  :-1           0x108deaf       cc          INT $0x3

TEXT main.main.func1(SB) /Users/liudanking/code/golang/gopath/src/test/main.go
  main.go:13        0x108deb0       65488b0c25a0080000  MOVQ GS:0x8a0, CX
  main.go:13        0x108deb9       483b6110        CMPQ 0x10(CX), SP
  main.go:13        0x108debd       0f8695000000        JBE 0x108df58
  main.go:13        0x108dec3       4883ec58        SUBQ $0x58, SP
  main.go:13        0x108dec7       48896c2450      MOVQ BP, 0x50(SP)
  main.go:13        0x108decc       488d6c2450      LEAQ 0x50(SP), BP
  main.go:15        0x108ded1       0f57c0          XORPS X0, X0
  main.go:15        0x108ded4       0f11442430      MOVUPS X0, 0x30(SP)
  main.go:15        0x108ded9       0f11442440      MOVUPS X0, 0x40(SP)
  main.go:15        0x108dede       488d059b020100      LEAQ runtime.types+65664(SB), AX
  main.go:15        0x108dee5       4889442430      MOVQ AX, 0x30(SP)
  main.go:15        0x108deea       488d0d0f2d0400      LEAQ main.statictmp_0(SB), CX
  main.go:15        0x108def1       48894c2438      MOVQ CX, 0x38(SP)
  main.go:15        0x108def6       488d1583fb0000      LEAQ runtime.types+63872(SB), DX
  main.go:15        0x108defd       48891424        MOVQ DX, 0(SP)
  main.go:15        0x108df01       488d1d107c0c00      LEAQ main.i(SB), BX
  main.go:15        0x108df08       48895c2408      MOVQ BX, 0x8(SP)
  main.go:15        0x108df0d       e84eddf7ff      CALL runtime.convT2E64(SB)
  main.go:15        0x108df12       488b442410      MOVQ 0x10(SP), AX
  main.go:15        0x108df17       488b4c2418      MOVQ 0x18(SP), CX
  main.go:15        0x108df1c       4889442440      MOVQ AX, 0x40(SP)
  main.go:15        0x108df21       48894c2448      MOVQ CX, 0x48(SP)
  main.go:15        0x108df26       488d442430      LEAQ 0x30(SP), AX
  main.go:15        0x108df2b       48890424        MOVQ AX, 0(SP)
  main.go:15        0x108df2f       48c744240802000000  MOVQ $0x2, 0x8(SP)
  main.go:15        0x108df38       48c744241002000000  MOVQ $0x2, 0x10(SP)
  main.go:15        0x108df41       e85a9dffff      CALL fmt.Println(SB)
  main.go:16        0x108df46       48c7042400ca9a3b    MOVQ $0x3b9aca00, 0(SP)
  main.go:16        0x108df4e       e87d27fbff      CALL time.Sleep(SB)
  main.go:15        0x108df53       e979ffffff      JMP 0x108ded1
  main.go:13        0x108df58       e813dffbff      CALL runtime.morestack_noctxt(SB)
  main.go:13        0x108df5d       e94effffff      JMP main.main.func1(SB)
  :-1           0x108df62       cc          INT $0x3
  :-1           0x108df63       cc          INT $0x3
  :-1           0x108df64       cc          INT $0x3
  :-1           0x108df65       cc          INT $0x3
  :-1           0x108df66       cc          INT $0x3
  :-1           0x108df67       cc          INT $0x3
  :-1           0x108df68       cc          INT $0x3
  :-1           0x108df69       cc          INT $0x3
  :-1           0x108df6a       cc          INT $0x3
  :-1           0x108df6b       cc          INT $0x3
  :-1           0x108df6c       cc          INT $0x3
  :-1           0x108df6d       cc          INT $0x3
  :-1           0x108df6e       cc          INT $0x3
  :-1           0x108df6f       cc          INT $0x3

显然,

    for {
        i += 1
    }

直接被优化没了。我们可以在语句 i += 1 添加一个其他语句来避免被优化掉:

    for {
        i += 1
        time.Sleep(time.Nanosecond)
    }

重新运行程序,运行结果“看似正确”了:

╰─➤  go run main.go                                                                                                                                     1 ↵
i am here 30
i am here 1806937
i am here 3853635
i am here 5485251
...

显然,如此修改之后,这段代码并非真正正确。因为变量 i 存在并发读写,即 data race 的问题。而 data race 场景下,go 的行为是未知的。程序员最讨厌的几件事中,不确定性必居其一。因此,一步小心写出 data race 的bug,调试起来是不太开心的。这里的例子因为只有几行代码,我们可以目测定位问题。如果代码规模比较大,我们可以借助 golang 工具链中的 -race 参数来排查该类问题:

╰─➤  go run -race main.go                                                                                                                               2 ↵
==================
WARNING: DATA RACE
Read at 0x0000011d4318 by goroutine 6:
  runtime.convT2E64()
      /usr/local/go/src/runtime/iface.go:335 +0x0
  main.main.func1()
      /Users/liudanking/code/golang/gopath/src/test/main.go:15 +0x7d

Previous write at 0x0000011d4318 by main goroutine:
  main.main()
      /Users/liudanking/code/golang/gopath/src/test/main.go:20 +0x7f

Goroutine 6 (running) created at:
  main.main()
      /Users/liudanking/code/golang/gopath/src/test/main.go:13 +0x53
==================
i am here 1
i am here 558324
i am here 1075838

除了在 go run 上可以使用 -trace, 其他几个常用的golang工具链指令也支持这个参数:

$ go test -race mypkg    // to test the package
$ go run -race mysrc.go  // to run the source file
$ go build -race mycmd   // to build the command
$ go install -race mypkg // to install the package

需要说明的是, -trace 并不保证能够检查出程序中所有的 data race, 而检查出 data race 则必然存在。说起来比较绕,大家记住它跟布隆过滤器 (Bloom Filter) 的真值表是一样的就对了。

而要把最开始提到的代码改对,方法有很多,我们可以使用 The Go Memory Model 推荐的 sync 包中的读写锁即可:

package main

import (
    "fmt"
    "runtime"
    "sync"
    "time"
)

var i = 0

func main() {
    runtime.GOMAXPROCS(2)
    mtx := sync.RWMutex{}
    go func() {
        for {
            mtx.RLock()
            fmt.Println("i am here", i)
            mtx.RUnlock()
            time.Sleep(time.Second)
        }
    }()
    for {
        mtx.Lock()
        i += 1
        mtx.Unlock()
        time.Sleep(time.Nanosecond)
    }

扩展阅读