Common mistakes with for loops in Go


Let's start with a piece of code.

package main

import "fmt"

func main() {
    done := make(chan bool)

    values := []string{"a", "b", "c"}
    for _, v := range values {
        // Each iteration starts a goroutine that runs an anonymous function.
        // The function prints the current value of v, then sends true on
        // the done channel.
        go func() {
            fmt.Println(v)
            done <- true
        }()
    }

    // wait for all goroutines to complete before exiting
    for range values {
        <-done
    }
}

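Intuitively this should print a, b and c. But a goroutine may not actually run until the loop has already moved on to a later iteration, by which time the value of v has changed. As a result, all the goroutines may print the same value, c, instead of each printing its own value.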

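The underlying reason is that, in Go 1.21 and earlier, the loop variable v is a single variable shared by all iterations, so its memory address never changes. Let's verify that.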

package main

import "fmt"

func main() {
    done := make(chan bool)

    values := []string{"a", "b", "c"}
    for _, v := range values {
        go func() {
            fmt.Printf("value=%s, addr=%p\n", v, &v)
            done <- true
        }()
    }

    // wait for all goroutines to complete before exiting
    for range values {
        <-done
    }
}

/*
value=c, addr=0xc000014070
value=c, addr=0xc000014070
value=c, addr=0xc000014070
*/
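
To make the sharing explicit, here is a minimal sketch that declares v outside the loop and assigns to it with `=`. The helper name captureShared is made up for this illustration. Written this way, every closure captures the same variable in any Go version, which is effectively what the older loop semantics did behind the scenes.

```go
package main

import "fmt"

// captureShared (hypothetical helper) returns one closure per element.
// All closures capture the single variable v declared outside the loop,
// so they all observe its final value once the loop has finished.
func captureShared(values []string) []func() string {
	var funcs []func() string
	var v string // one variable, reused by every iteration
	for _, v = range values {
		funcs = append(funcs, func() string { return v })
	}
	return funcs
}

func main() {
	for _, f := range captureShared([]string{"a", "b", "c"}) {
		fmt.Println(f()) // prints "c" each time
	}
}
```

The closures are called only after the loop ends, so there is no timing involved here: the result comes purely from all three closures pointing at one variable.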

There are a few ways to fix this. The first: copy the value into a new variable at the start of each iteration.

package main

import "fmt"

func main() {
    done := make(chan bool)

    values := []string{"a", "b", "c"}
    for _, v := range values {
        v1 := v
        go func() {
            fmt.Println(v1)
            done <- true
        }()
    }

    // wait for all goroutines to complete before exiting
    for range values {
        <-done
    }
}

The second: pass the value into the closure as an argument, so the closure works on its own copy of the parameter.

package main

import "fmt"

func main() {
    done := make(chan bool)

    values := []string{"a", "b", "c"}
    for _, v := range values {
        go func(v string) {
            fmt.Println(v)
            done <- true
        }(v)
    }

    // wait for all goroutines to complete before exiting
    for range values {
        <-done
    }
}
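
Printed output from goroutines arrives in no particular order, which makes the fix awkward to check by eye. A small sketch, assuming a made-up helper seenByGoroutines, sends each value back over a channel and sorts the result, so the outcome is the same on every run:

```go
package main

import (
	"fmt"
	"sort"
)

// seenByGoroutines (hypothetical helper) starts one goroutine per value,
// passing v as an argument so each goroutine works on its own copy, and
// returns the values the goroutines actually saw.
func seenByGoroutines(values []string) []string {
	results := make(chan string)
	for _, v := range values {
		go func(v string) {
			results <- v
		}(v)
	}

	seen := make([]string, 0, len(values))
	for range values {
		seen = append(seen, <-results)
	}
	sort.Strings(seen) // goroutine scheduling order is not guaranteed
	return seen
}

func main() {
	fmt.Println(seenByGoroutines([]string{"a", "b", "c"})) // [a b c]
}
```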

This one comes up a lot in interviews XD. But Go actually ships a tool for static code analysis: go vet.

go tool vet offers checks in many areas. Among them, loopclosure (check references to loop variables from within nested functions) is the one that catches the problem above.

To list the available checks, run "go tool vet help":
* asmdecl      report mismatches between assembly files and Go declarations
* assign       check for useless assignments
* atomic       check for common mistakes using the sync/atomic package
* bools        check for common mistakes involving boolean operators
* buildtag     check that +build tags are well-formed and correctly located
* cgocall      detect some violations of the cgo pointer passing rules
* composites   check for unkeyed composite literals
* copylocks    check for locks erroneously passed by value
* httpresponse check for mistakes using HTTP responses
* loopclosure  check references to loop variables from within nested functions
* lostcancel   check cancel func returned by context.WithCancel is called
* nilfunc      check for useless comparisons between functions and nil
* printf       check consistency of Printf format strings and arguments
* shift        check for shifts that equal or exceed the width of the integer
* slog         check for incorrect arguments to log/slog functions
* stdmethods   check signature of methods of well-known interfaces
* structtag    check that struct field tags conform to reflect.StructTag.Get
* tests        check for common mistaken usages of tests and examples
* unmarshal    report passing non-pointer or non-interface values to unmarshal
* unreachable  check for unreachable code
* unsafeptr    check for invalid conversions of uintptr to unsafe.Pointer
* unusedresult check for unused results of calls to some functions

You can run go tool vet help to see how to use it. Let's scan the code with go vet.

go vet main.go
or
go vet -loopclosure main.go

/*
# command-line-arguments
./main.go:11:38: loop variable v captured by func literal
./main.go:11:42: loop variable v captured by func literal
*/

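The message "loop variable v captured by func literal" tells you that on line 11 the shared loop variable v is being used inside a closure. Once we switch to either of the fixes above and run go vet again, the warning is gone.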

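OK, the pattern above is fairly easy to spot. Next comes one that is much harder to notice and steps on the same landmine. Here is the reference example: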

package main

import "fmt"

func main() {
    var out []*int
    for i := 0; i < 3; i++ {
        out = append(out, &i)
    }
    fmt.Println("Values:", *out[0], *out[1], *out[2])
    fmt.Println("Addresses:", out[0], out[1], out[2])
}

/*
Values: 3 3 3
Addresses: 0xc0000120e8 0xc0000120e8 0xc0000120e
*/

Huh, there is no closure this time, and it still happens? Let's try go vet!

go vet main.go

Nothing is reported. Huh? The result is clearly not what we expected, yet go vet thinks everything is fine! (Garbage go vet, pfft?)

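In fact, the i in every iteration of this for loop also refers to the same shared variable, and then we shoot ourselves in the foot by taking its address with &i and appending it to the pointer slice out. When the loop exits, i holds 3, and since every element of out is the same address, the output is 3 for all of them.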
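
The first fix from earlier applies here as well. A minimal sketch, with collectPointers as a made-up helper name: copy i into a fresh variable inside the loop body and take the address of the copy. This gives distinct addresses regardless of the Go version.

```go
package main

import "fmt"

// collectPointers (hypothetical helper) returns pointers to the values
// 0..n-1. Each iteration copies i into a fresh variable before taking
// its address, so no two elements share storage.
func collectPointers(n int) []*int {
	var out []*int
	for i := 0; i < n; i++ {
		i := i // new variable per iteration; shadows the loop variable
		out = append(out, &i)
	}
	return out
}

func main() {
	out := collectPointers(3)
	fmt.Println("Values:", *out[0], *out[1], *out[2]) // Values: 0 1 2
}
```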

Now let's look at a piece of code discussed in a publicly documented issue at Let's Encrypt.

// authz2ModelMapToPB converts a mapping of domain name to authz2Models into a
// protobuf authorizations map
func authz2ModelMapToPB(m map[string]authz2Model) (*sapb.Authorizations, error) {
    resp := &sapb.Authorizations{}
    for k, v := range m {
        // Make a copy of k because it will be reassigned with each loop.
        kCopy := k
        authzPB, err := modelToAuthzPB(&v)
        if err != nil {
            return nil, err
        }
        resp.Authz = append(resp.Authz, &sapb.Authorizations_MapElement{
            Domain: &kCopy,
            Authz: authzPB,
        })
    }
    return resp, nil
}

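Line 7 makes a copy of k, because without that copy k is also a shared variable here, and using &k on line 13 would produce the same unexpected result as above. Note that v is shared in exactly the same way and &v is passed to modelToAuthzPB on line 8. According to Fixing For Loops in Go 1.22, that function kept pointers to fields of v, so v needed a copy too.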
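
A minimal sketch of the safe version of this pattern follows. The types record and entry and the function toEntries are hypothetical stand-ins, not the real Boulder types or code. Both k and v are copied before their addresses are stored, since both are loop variables.

```go
package main

import (
	"fmt"
	"sort"
)

// record and entry are hypothetical stand-ins for authz2Model and
// sapb.Authorizations_MapElement; they are not the real types.
type record struct{ ID int }

type entry struct {
	Domain *string
	Rec    *record
}

// toEntries copies both k and v before taking their addresses, so each
// entry points at its own storage.
func toEntries(m map[string]record) []entry {
	var out []entry
	for k, v := range m {
		kCopy, vCopy := k, v
		out = append(out, entry{Domain: &kCopy, Rec: &vCopy})
	}
	// Map iteration order is random, so sort for stable output.
	sort.Slice(out, func(i, j int) bool { return *out[i].Domain < *out[j].Domain })
	return out
}

func main() {
	m := map[string]record{"a.example": {ID: 1}, "b.example": {ID: 2}}
	for _, e := range toEntries(m) {
		fmt.Println(*e.Domain, e.Rec.ID)
	}
}
```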

Go 1.22 finally tackles this common pitfall: Fixing For Loops in Go 1.22

But 1.22 has not been officially released yet, has it? No worries: the article says Go 1.21 includes a preview of the new behavior. Just set GOEXPERIMENT=loopvar.

GOEXPERIMENT=loopvar go run main.go

/*
Values: 0 1 2
Addresses: 0xc0000120e8 0xc000012110 0xc000012118
*/

Perfect!

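To sum up: in Go 1.21 and earlier, whenever a for loop takes the address of its iteration variable, or combines for with a closure, remind each other about it in code review. Also run go vet in CI, or through a git hook tool such as husky, to catch the closure case early (as shown above, it does not flag the &i case). Still, I think Go 1.22 is well worth upgrading to, because this landmine gets stepped on far too often.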

References:

Fixing For Loops in Go 1.22

What happens with closures running as goroutines?

Let's Encrypt: CAA Rechecking bug

CommonMistakes