Golang request header concurrent read and write


Question

Go env:

> GOARCH="amd64"
> GOBIN="/root/"
> GOEXE=""
> GOHOSTARCH="amd64"
> GOHOSTOS="linux"
> GOOS="linux"
> GOPATH="/data/workspace/kubernetes"
> GORACE=""
> GOROOT="/usr/local/go"
> GOTOOLDIR="/usr/local/go/pkg/tool/linux_amd64"
> GO15VENDOREXPERIMENT="1"
> CC="gcc"
> GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0"
> CXX="g++"
> CGO_ENABLED="1"

Go version:
>go version go1.6.3 linux/amd64

This problem occurred on a high-load "performance test environment" kube-apiserver. The kube-apiserver crashed and exited:

fatal error: concurrent map read and map write

goroutine 77930636 [running]:
runtime.throw(0x2f4c4c0, 0x21)
    /root/.gvm/gos/go1.6.3/src/runtime/panic.go:547 +0x90 fp=0xca67b477f0 sp=0xca67b477d8
runtime.mapaccess1_faststr(0x2a8e520, 0xc9e29000f0, 0x2c11220, 0xa, 0x433e360)
    /root/.gvm/gos/go1.6.3/src/runtime/hashmap_fast.go:202 +0x5b fp=0xca67b47850 sp=0xca67b477f0
k8s.io/kubernetes/pkg/httplog.(*respLogger).Log(0xcbddf2ae70)
    /data/gerrit/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/pkg/httplog/log.go:180 +0x43d fp=0xca67b47af8 sp=0xca67b47850
k8s.io/kubernetes/pkg/apiserver.RecoverPanics.func1(0x7f099f157090, 0xcbddf2ae70, 0xcd7569e380)
    /data/gerrit/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/pkg/apiserver/handlers.go:174 +0x15d fp=0xca67b47b50 sp=0xca67b47af8
net/http.HandlerFunc.ServeHTTP(0xc821a4eac0, 0x7f099f157058, 0xca0f4eb450, 0xcd7569e380)
    /root/.gvm/gos/go1.6.3/src/net/http/server.go:1618 +0x3a fp=0xca67b47b70 sp=0xca67b47b50
net/http.serverHandler.ServeHTTP(0xc8215a7b80, 0x7f099f157058, 0xca0f4eb450, 0xcd7569e380)
    /root/.gvm/gos/go1.6.3/src/net/http/server.go:2081 +0x19e fp=0xca67b47bd0 sp=0xca67b47b70
net/http.(*conn).serve(0xc8b5d6b980)
    /root/.gvm/gos/go1.6.3/src/net/http/server.go:1472 +0xf2e fp=0xca67b47f98 sp=0xca67b47bd0
runtime.goexit()
    /root/.gvm/gos/go1.6.3/src/runtime/asm_amd64.s:1998 +0x1 fp=0xca67b47fa0 sp=0xca67b47f98
created by net/http.(*Server).Serve
    /root/.gvm/gos/go1.6.3/src/net/http/server.go:2137 +0x44e

The corresponding source code:

pkg/apiserver/handlers.go

145 func RecoverPanics(handler http.Handler) http.Handler {
146     return http.HandlerFunc(func(w http.ResponseWriter, req *http.Request) {
147         defer func() {
148             if x := recover(); x != nil {
149                 http.Error(w, "apis panic. Look in log for details.", http.StatusInternalServerError)
150                 glog.Errorf("APIServer panic'd on %v %v: %v\n%s\n", req.Method, req.RequestURI, x, debug.Stack())
151             }
152         }()
153         defer httplog.NewLogged(req, &w).StacktraceWhen(
                httplog.StatusIsNot(
                    http.StatusOK,
                    http.StatusCreated,
                    http.StatusAccepted,
                    http.StatusBadRequest,
                    http.StatusMovedPermanently,
                    http.StatusTemporaryRedirect,
                    http.StatusConflict,
                    http.StatusNotFound,
                    http.StatusUnauthorized,
                    http.StatusForbidden,
                    errors.StatusUnprocessableEntity,
                    http.StatusSwitchingProtocols,
                    http.StatusRequestTimeout,
                    errors.StatusTooManyRequests,
                ),
170         ).Log()

            // Dispatch to the internal handler
            handler.ServeHTTP(w, req)
174     })
    }
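For reference, `recover` only intercepts ordinary `panic` calls raised on the goroutine that installed the deferred function. A runtime fatal error such as "concurrent map read and map write" is raised through `runtime.throw` (visible at the top of the stack trace above), which terminates the process without running deferred functions at all. A minimal sketch of what `recover` does handle (the `safeCall` helper is illustrative, not kube-apiserver code):

```go
package main

import "fmt"

// safeCall converts an ordinary panic on this goroutine into an error,
// the same mechanism RecoverPanics relies on. It cannot intercept
// runtime fatal errors such as concurrent map access: those bypass
// deferred functions entirely and kill the process.
func safeCall(f func()) (err error) {
	defer func() {
		if x := recover(); x != nil {
			err = fmt.Errorf("recovered: %v", x)
		}
	}()
	f()
	return nil
}

func main() {
	fmt.Println(safeCall(func() { panic("apis panic") })) // recovered: apis panic
	fmt.Println(safeCall(func() {}))                      // <nil>
}
```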

pkg/httplog/log.go:

159 func (rl *respLogger) Log() {
160     latency := time.Since(rl.startTime)
161     if glog.V(2) {
162         extraInfo := ""
163         if latency >= time.Millisecond*200 && latency < time.Second {
                extraInfo = fmt.Sprintf("%d00.Millisecond", latency/(time.Millisecond*100))
            } else if latency >= time.Second && latency < time.Minute { // Warning
                extraInfo = fmt.Sprintf("%d.Second", latency/(time.Second))
            } else if latency >= time.Minute { // nce will timeout
                extraInfo = fmt.Sprintf("%d.Minutes", latency/(time.Minute))
            }
            method := rl.req.Method
            if len(rl.req.Header["Detailed-Method"]) > 0 {
                method = rl.req.Header["Detailed-Method"][0]
            }

            remoteIP := rl.getXForwardIPAdress(rl.req)

            if !rl.hijacked {
                //glog.InfoDepth(1, fmt.Sprintf("%s %s: (%v) %v%v%v [%s %s]", rl.req.Method, rl.req.RequestURI, latency, rl.status, rl.statusStack, rl.addedInfo, rl.req.Header["User-Agent"], rl.req.RemoteAddr))
180             glog.InfoDepth(1, fmt.Sprintf("%v %s %s: (%sms) %v%v [%s %s]-%s %v", rl.req.Header["X-Requestid"], method, rl.req.RequestURI, GetMilliLatency(latency), rl.status, rl.addedInfo, rl.req.Header["User-Agent"], remoteIP, extraInfo, rl.statusStack))
            } else {
                //glog.InfoDepth(1, fmt.Sprintf("%s %s: (%v) hijacked [%s %s]", rl.req.Method, rl.req.RequestURI, latency, rl.req.Header["User-Agent"], rl.req.RemoteAddr))
                glog.InfoDepth(1, fmt.Sprintf("%v %s %s: (%sms) hijacked [%s %s]-%s", rl.req.Header["X-Requestid"], method, rl.req.RequestURI, GetMilliLatency(latency), rl.req.Header["User-Agent"], remoteIP, extraInfo))
            }
        }
    }

In handler.ServeHTTP I modify request.Header, but I cannot see how that causes this error:

(1) ServeHTTP and the deferred Log() run serially on the same goroutine, so there should be no "concurrent read and write".

(2) Even if a "concurrent read and write" did occur, the deferred recover function is there to handle panics, so kube-apiserver should not exit.

This problem has cost me a lot of time. Can anyone help? Thanks.


Answer 1

Score: 1

My guess is that you've stepped on the fact that the global timeout handler can't actually stop the goroutines working on the request. So when the connection times out (1m unless you've changed it), it returns an error. If the goroutine actually completes at about the same time, it could also attempt to write to the connection. I thought we had added a lock to prevent this from causing a problem, but perhaps the headers are not always protected by the lock. If you can reproduce this in a clean source tree, then please file an issue at the Kubernetes GitHub repo.


huangapple
  • Posted on 2017-02-08 11:49:14
  • Please keep this link when reposting: https://go.coder-hub.com/42104348.html