How can I solve a race condition in a Golang WAF service?

Question

First, apologies for my English.

I am trying to develop a WAF (Web Application Firewall) service in Golang. Everything is kept in memory in a map[string]*Struct{}. When a request comes in, the handler function looks up the host from the request header in that map.

host, err := GetHost(r.Host)

func GetHost(host string) (*Host, error) {
    // Strip the port if one is present.
    split, _, err := net.SplitHostPort(host)
    if err == nil {
        host = split
    }
    if data, ok := hosts[host]; ok {
        return data, nil
    }
    return nil, errors.New("host not found!")
}
// hosts is a map of all hosts: the key is the host name and the value is the host struct.

The problem is that the lookups go wrong when the service receives a lot of requests. For example, the host is example.com, but hosts["example.com"] returns another, unrelated value.

type Server struct {
    mu   sync.RWMutex
    Host *models.Host
}

func (c *Server) handler(handler http.Handler) http.Handler {
    return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
        mutex.Lock()
        defer mutex.Unlock()
        var err error
        c.Host, err = GetHost(r.Host)
        if err != nil {
            w.WriteHeader(http.StatusBadGateway)
            w.Write([]byte(r.Host + " not found!!"))
            return
        }
        // it's going on.. (edited part)

So I tried using a mutex and wg to solve this problem, but it didn't work. I'm open to any suggestions.

Answer 1

Score: 1

You're right in thinking that you need to serialize access to shared state from an HTTP handler (because the server handles each HTTP request in a dedicated goroutine). Otherwise, your program would indeed suffer from a synchronization bug that would likely manifest itself as a data race during execution, as you seem to have experienced; the race detector provided with the Go toolchain (enabled with the -race flag) would likely have picked up on it.

Arguably, the simplest way to serialize access to that shared state is to use a mutex. However, you need to be careful. Your deferred call to mutex.Unlock is problematic, for at least one, possibly two reasons:

  1. In general, you should endeavour to keep the critical section (the part of your code that is surrounded by calls to Lock and Unlock) as "small" and "cheap" as possible. In short, the critical section should only do in-memory work, not I/O. Here, the lock is held during the entire handling of each request, which is likely to cause a great deal of contention in your server.
  2. Although you omitted the end of the code in your handler, I'm guessing (?) that you also acquire the lock later in order to update the map (if the current request's host hasn't been encountered before). But, because none of the mutex types exported by package sync are re-entrant, you're likely to get a deadlock: because the call to Unlock is deferred, the mutex only gets released when your handler returns.
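To illustrate the second point, here is a minimal sketch showing that sync.Mutex is not re-entrant. It uses TryLock (available since Go 1.18) for the second acquisition so that the program reports the failure instead of blocking forever, which is what a second plain Lock call would do. The function name reentrant is made up for this illustration.

```go
package main

import (
	"fmt"
	"sync"
)

// reentrant reports whether a sync.Mutex that is already held can be
// acquired a second time. TryLock is used instead of Lock so that the
// demonstration returns rather than deadlocking.
func reentrant() bool {
	var mu sync.Mutex

	mu.Lock()
	defer mu.Unlock() // released only when this function returns

	if mu.TryLock() {
		// Not reached for sync.Mutex: the lock is already held.
		mu.Unlock()
		return true
	}
	return false
}

func main() {
	fmt.Println("second acquisition succeeded:", reentrant())
	// prints "second acquisition succeeded: false"
}
```

With a plain Lock in place of TryLock, the second call would wait for an Unlock that is deferred until the function returns, so it would never proceed.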

One solution consists in eschewing defer and restricting the critical section to the call to your GetHost function.
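As a rough sketch of what a narrow critical section could look like, the lookup below holds a read lock only around the map access itself, and the write lock only around the map update. The registry type, its lookup and set methods, and the Name field are assumptions made for this example; they are not taken from your code.

```go
package main

import (
	"errors"
	"fmt"
	"net"
	"sync"
)

// Host stands in for the real host struct (hypothetical field).
type Host struct {
	Name string
}

// registry guards the hosts map with a read/write mutex.
type registry struct {
	mu    sync.RWMutex
	hosts map[string]*Host
}

var errHostNotFound = errors.New("host not found")

func newRegistry() *registry {
	return &registry{hosts: make(map[string]*Host)}
}

// lookup holds the read lock only for the duration of the map access.
func (r *registry) lookup(hostport string) (*Host, error) {
	host := hostport
	// Strip the port if one is present; otherwise keep the input as is.
	if h, _, err := net.SplitHostPort(hostport); err == nil {
		host = h
	}

	r.mu.RLock()
	h, ok := r.hosts[host]
	r.mu.RUnlock()

	if !ok {
		return nil, errHostNotFound
	}
	return h, nil
}

// set takes the write lock only while the map is modified.
func (r *registry) set(name string, h *Host) {
	r.mu.Lock()
	r.hosts[name] = h
	r.mu.Unlock()
}

func main() {
	r := newRegistry()
	r.set("example.com", &Host{Name: "example.com"})

	h, err := r.lookup("example.com:8080")
	fmt.Println(h.Name, err)

	_, err = r.lookup("unknown.test")
	fmt.Println(err)
}
```

Any I/O (writing the response, proxying the request) then happens after the lock has been released, so slow clients do not block other requests.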

Another improvement would be to eliminate global state, for better testability, etc. You could make your hosts map non-global by simply storing it in a field of your Server struct and declaring GetHost as a method on *Server.

type Server struct {
    mu    sync.RWMutex
    Host  *models.Host
    hosts map[string]*models.Host
}

// GetHost looks up a host by name; the caller must hold srv.mu.
func (srv *Server) GetHost(host string) (*models.Host, error) {
    // Strip the port if one is present.
    split, _, err := net.SplitHostPort(host)
    if err == nil {
        host = split
    }
    if data, exists := srv.hosts[host]; exists {
        return data, nil
    }
    return nil, errors.New("host not found!")
}

func (srv *Server) handler(handler http.Handler) http.Handler {
    return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
        // srv.Host is a struct field, so it cannot appear on the left of :=.
        var err error
        srv.mu.Lock()
        srv.Host, err = srv.GetHost(r.Host)
        srv.mu.Unlock()
        if err != nil {
            w.WriteHeader(http.StatusBadGateway)
            w.Write([]byte(r.Host + " not found!!"))
            return
        }
        // possibly acquire and release the lock again
        // for further treatment of the hosts map
    })
}

I may be missing something, but must admit I don't see the point in updating the Host field of your Server struct for each request...
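To make that last remark concrete: a field on a Server shared by all requests can be overwritten by one request while another is still using it, whereas a local variable belongs to a single request's goroutine. The sketch below keeps the looked-up host in a local variable and passes it down through the request context. It is only one possible arrangement; the names ctxKey, getHost, middleware, serve and newDemoHandler, as well as the Name field, are assumptions made for this example.

```go
package main

import (
	"context"
	"fmt"
	"net"
	"net/http"
	"net/http/httptest"
	"sync"
)

// Host stands in for the real host struct (hypothetical field).
type Host struct{ Name string }

// ctxKey is the context key under which the per-request host is stored.
type ctxKey struct{}

type Server struct {
	mu    sync.RWMutex
	hosts map[string]*Host
}

// getHost reads the map under the read lock and returns the result.
func (srv *Server) getHost(hostport string) (*Host, bool) {
	host := hostport
	if h, _, err := net.SplitHostPort(hostport); err == nil {
		host = h
	}
	srv.mu.RLock()
	defer srv.mu.RUnlock()
	h, ok := srv.hosts[host]
	return h, ok
}

// middleware keeps the host in a local variable, so concurrent
// requests cannot overwrite each other's value.
func (srv *Server) middleware(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		host, ok := srv.getHost(r.Host)
		if !ok {
			http.Error(w, r.Host+" not found", http.StatusBadGateway)
			return
		}
		ctx := context.WithValue(r.Context(), ctxKey{}, host)
		next.ServeHTTP(w, r.WithContext(ctx))
	})
}

// newDemoHandler wires the middleware to a handler that echoes the host name.
func newDemoHandler() http.Handler {
	srv := &Server{hosts: map[string]*Host{
		"a.example": {Name: "a.example"},
		"b.example": {Name: "b.example"},
	}}
	echo := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		h := r.Context().Value(ctxKey{}).(*Host)
		fmt.Fprint(w, h.Name)
	})
	return srv.middleware(echo)
}

// serve runs one in-memory request with the given Host header.
func serve(h http.Handler, host string) (int, string) {
	req := httptest.NewRequest(http.MethodGet, "/", nil)
	req.Host = host
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, req)
	return rec.Code, rec.Body.String()
}

func main() {
	h := newDemoHandler()
	var wg sync.WaitGroup
	for _, name := range []string{"a.example", "b.example", "a.example:8080"} {
		wg.Add(1)
		go func(name string) {
			defer wg.Done()
			code, body := serve(h, name)
			fmt.Println(name, code, body)
		}(name)
	}
	wg.Wait()
}
```

Each goroutine sees the host that matches its own request, regardless of how the requests interleave, because nothing per-request is stored on the shared Server.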

huangapple
  • Published on June 16, 2021 at 19:19:01
  • When reposting, please keep the link to this article: https://go.coder-hub.com/68001657.html