Design and Implementation of a Multi-Level Memory Pool in Golang

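Last month Intel (nicknamed the "toothpaste factory") had to ship CPU firmware and operating system patches for the Meltdown and Spectre bugs. Our production environment runs on Alibaba Cloud, and after the patches were applied, several IO-intensive machines slowed down noticeably. Judging from traffic and CPU load, the performance hit was around 50%. Wasn't the promised worst case a 30% drop?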

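The service in question is written in Go. While profiling it with go pprof, we happened to notice that GC and malloc overhead was on the high side, and that ioutil.ReadAll accounted for a considerable share of CPU time.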

Why is ioutil.ReadAll slow?

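The function's signature is func ReadAll(r io.Reader) ([]byte, error). People on our team are very fond of it, partly because it reads everything from r in one call and returns it, so the caller never has to think about how memory is allocated or how the buffer grows when it turns out to be too small. For a util function, that design is perfectly fine. In an IO-intensive workload, however, its cost becomes something you have to care about. Internally it calls readAll to read the data: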

// readAll reads from r until an error or EOF and returns the data it read
// from the internal buffer allocated with a specified capacity.
func readAll(r io.Reader, capacity int64) (b []byte, err error) {
    buf := bytes.NewBuffer(make([]byte, 0, capacity))
    // If the buffer overflows, we will get bytes.ErrTooLarge.
    // Return that as an error. Any other panic remains.
    defer func() {
        e := recover()
        if e == nil {
            return
        }
        if panicErr, ok := e.(error); ok && panicErr == bytes.ErrTooLarge {
            err = panicErr
        } else {
            panic(e)
        }
    }()
    _, err = buf.ReadFrom(r)
    return buf.Bytes(), err
}

Here capacity is the constant 512 (bytes.MinRead). readAll in turn calls buf.ReadFrom to read the data:

// ReadFrom reads data from r until EOF and appends it to the buffer, growing
// the buffer as needed. The return value n is the number of bytes read. Any
// error except io.EOF encountered during the read is also returned. If the
// buffer becomes too large, ReadFrom will panic with ErrTooLarge.
func (b *Buffer) ReadFrom(r io.Reader) (n int64, err error) {
    b.lastRead = opInvalid
    // If buffer is empty, reset to recover space.
    if b.off >= len(b.buf) {
        b.Reset()
    }
    for {
        if free := cap(b.buf) - len(b.buf); free < MinRead {
            // not enough space at end
            newBuf := b.buf
            if b.off+free < MinRead {
                // not enough space using beginning of buffer;
                // double buffer capacity
                newBuf = makeSlice(2*cap(b.buf) + MinRead) // 1: grow the buffer
            }
            copy(newBuf, b.buf[b.off:]) // 2: copy the contents
            b.buf = newBuf[:len(b.buf)-b.off]
            b.off = 0
        }
        m, e := r.Read(b.buf[len(b.buf):cap(b.buf)])
        b.buf = b.buf[0 : len(b.buf)+m]
        n += int64(m)
        if e == io.EOF {
            break
        }
        if e != nil {
            return n, e
        }
    }
    return n, nil // err is EOF, so return nil explicitly
}

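At this point the cause is clear: if the data to read is larger than the initial buffer (512 bytes by default), a new buffer is allocated and the existing contents are copied into it. If the data is very large, this happens over and over. The optimization problem therefore becomes how to reduce memory reallocation and copying.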
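To put rough numbers on that cost, here is a small simulation of the growth rule in the ReadFrom listing above (new capacity = 2*cap + MinRead). It is only a sketch: simulateGrowth and minRead are illustrative names, and it assumes a reader that always fills the space it is offered and reports io.EOF on a separate final Read call. Real readers often return less per call, so actual counts can differ.

```go
package main

import "fmt"

const minRead = 512 // mirrors bytes.MinRead from the listing above

// simulateGrowth models the buffer growth in the ReadFrom listing for a
// payload of size bytes. Assumption: every Read fills all free space and
// io.EOF arrives on its own final call. It returns the number of
// reallocations and the total number of bytes copied between buffers.
func simulateGrowth(size int) (reallocs, copied int) {
	capacity, length := minRead, 0
	for {
		// Same check as ReadFrom: grow when less than minRead is free.
		if capacity-length < minRead {
			copied += length // existing contents move to the new buffer
			capacity = 2*capacity + minRead
			reallocs++
		}
		n := capacity - length // bytes this simulated Read would deliver
		if remaining := size - length; n > remaining {
			n = remaining
		}
		if n == 0 { // nothing left: the reader reports io.EOF
			return reallocs, copied
		}
		length += n
	}
}

func main() {
	fmt.Println(simulateGrowth(100))     // 1 100
	fmt.Println(simulateGrowth(1 << 20)) // 11 2090496
}
```

Under these assumptions, reading 1 MiB triggers 11 reallocations and copies about 2 MiB of data, and even a 100-byte payload pays for one reallocation because the buffer grows before the final Read that returns io.EOF.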

Design and implementation of the multi-level memory pool

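The memory pool is divided into multiple levels by size. As shown in the figure above, requests in (0, 1024] are served from level 0 and requests in (1024, 2048] from level 1. Splitting the pool into levels has two benefits:
1. The total size and item count of each level can be planned independently, so the pool can be tuned to different workloads.
2. At the implementation level, one big pool-wide lock can be split into several smaller per-level locks, which reduces lock contention.
When a level runs out of allocated items and has to grow, it requests one large block of memory in a single allocation, which makes growth more efficient (see level 0 in the figure).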
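The design described above can be sketched roughly as follows. This is not gmmpool's actual code or API: Pool, level, NewPool, Get, Put, step and batch are all hypothetical names, and the sketch assumes that levels continue in 1024-byte steps beyond the two shown (level i serves sizes up to (i+1)*1024).

```go
package main

import (
	"fmt"
	"sync"
)

const step = 1024 // assumed size step between levels

// level is one size class with its own lock and free list.
type level struct {
	mu       sync.Mutex
	itemSize int
	batch    int      // items carved out of each newly allocated chunk
	free     [][]byte // idle buffers, each with cap == itemSize
}

// get pops a free buffer, growing the level by one large chunk when empty.
func (l *level) get() []byte {
	l.mu.Lock()
	defer l.mu.Unlock()
	if len(l.free) == 0 {
		chunk := make([]byte, l.itemSize*l.batch) // one allocation, many items
		for i := 0; i < l.batch; i++ {
			lo, hi := i*l.itemSize, (i+1)*l.itemSize
			l.free = append(l.free, chunk[lo:hi:hi]) // cap capped: items never overlap
		}
	}
	b := l.free[len(l.free)-1]
	l.free = l.free[:len(l.free)-1]
	return b[:0]
}

// put pushes a buffer back onto this level's free list.
func (l *level) put(b []byte) {
	l.mu.Lock()
	l.free = append(l.free, b[:0])
	l.mu.Unlock()
}

// Pool is a hypothetical multi-level pool: one lock per level, not one global lock.
type Pool struct{ levels []*level }

// NewPool builds numLevels levels; each grows by batch items at a time.
func NewPool(numLevels, batch int) *Pool {
	p := &Pool{}
	for i := 0; i < numLevels; i++ {
		p.levels = append(p.levels, &level{itemSize: (i + 1) * step, batch: batch})
	}
	return p
}

// Get returns a zero-length buffer with capacity >= size. Requests larger
// than the biggest level fall back to a plain heap allocation.
func (p *Pool) Get(size int) []byte {
	if size <= 0 {
		size = 1
	}
	idx := (size - 1) / step
	if idx >= len(p.levels) {
		return make([]byte, 0, size)
	}
	return p.levels[idx].get()
}

// Put returns b to the level matching its capacity. Buffers whose capacity
// does not match any level are dropped and left to the GC.
func (p *Pool) Put(b []byte) {
	c := cap(b)
	if c == 0 || c%step != 0 || c/step > len(p.levels) {
		return
	}
	p.levels[c/step-1].put(b)
}

func main() {
	p := NewPool(4, 8)
	b := p.Get(1500)    // falls in (1024, 2048], so level 1
	fmt.Println(cap(b)) // 2048
	p.Put(b)
}
```

The caller owns a buffer between Get and Put, so forgetting Put, or using a buffer after Put, is the caller's bug. The sketch also never shrinks; a production pool would likely cap each level's total size.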
The implementation is gmmpool. Benchmark results show a speedup of roughly 19x:

BenchmarkStdReadAll-4             200000          5969 ns/op
BenchmarkMultiLevelPool-4        5000000           311 ns/op

Summary

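For workloads that allocate and free memory frequently, a memory pool can significantly reduce Go runtime overhead. Keep in mind, though, that the pool's memory is now managed by you rather than by the runtime, so you need to check carefully for memory leaks. If your performance requirements are not this strict and you only want to reuse some small objects, we recommend the standard library's sync.Pool.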

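As for the Alibaba Cloud performance problem mentioned at the beginning: even with the memory pool optimization, the result was still dismal. In the end Alibaba Cloud solved it by moving us to machines that had not received the Intel patches. Quite the surprise, isn't it?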

–EOF–

Copyright notice
Please credit the source when reposting (original link to this post).
