Why is httputil.NewSingleHostReverseProxy causing an error on some www sites?

Question

In the example below:

package main

import (
	"fmt"
	"log"
	"net/http"
	"net/http/httputil"
	"net/url"
)

func main() {
	p := new(Proxy)
	//host := "www.google.com" // WORKS AS EXPECTED
	host := "www.apple.com" // GIVES AN ERROR
	u, err := url.Parse(fmt.Sprintf("http://%v/", host))
	if err != nil {
		log.Printf("Error parsing URL")
	}
	p.proxy = httputil.NewSingleHostReverseProxy(u)
	http.Handle("/", p)
	log.Fatal(http.ListenAndServe("localhost:8000", nil))
}

type Proxy struct {
	proxy *httputil.ReverseProxy
}

func (p *Proxy) ServeHTTP(w http.ResponseWriter, r *http.Request) {
	p.proxy.ServeHTTP(w, r)
}

Swapping 'www.google.com' for 'www.apple.com' results in this error when pointing Chrome to 'localhost:8000':

> Invalid URL
>
> The requested URL "/", is invalid.
> Reference #9.a61a32b8.1438231668.41733295

Doing a little bit more digging, for www.apple.com, I am getting:

➜  ~  curl --ipv4 -v localhost:8000

< HTTP/1.1 400 Bad Request
< Content-Length: 194
< Content-Type: text/html
< Date: Thu, 30 Jul 2015 05:20:38 GMT
< Expires: Thu, 30 Jul 2015 05:20:38 GMT
< Mime-Version: 1.0
* Server AkamaiGHost is not blacklisted
< Server: AkamaiGHost
<
<HTML><HEAD>
<TITLE>Invalid URL</TITLE>
</HEAD><BODY>
<H1>Invalid URL</H1>
The requested URL "&#47;", is invalid.<p>
Reference&#32;&#35;9&#46;65b454b8&#46;1438233638&#46;1f1b8a40
</BODY></HTML>
* Connection #0 to host localhost left intact

and for www.google.com:

➜  ~  curl --ipv4 -v localhost:8000

< HTTP/1.1 302 Found
< Alternate-Protocol: 80:quic,p=0
< Cache-Control: private
< Content-Length: 219
< Content-Type: text/html; charset=UTF-8
< Date: Thu, 30 Jul 2015 05:03:16 GMT
< Location: http://www.google.com/
* Server sffe is not blacklisted
< Server: sffe
< X-Content-Type-Options: nosniff
< X-Xss-Protection: 1; mode=block
<
<HTML><HEAD><meta http-equiv="content-type" content="text/html;charset=utf-8">
<TITLE>302 Moved</TITLE></HEAD><BODY>
<H1>302 Moved</H1>
The document has moved
<A HREF="http://www.google.com/">here</A>.
</BODY></HTML>
* Connection #0 to host localhost left intact

Now when I use 'apple.com' instead of 'www.apple.com', things work fine:

➜  ~  curl --ipv4 -v localhost:8000

< HTTP/1.1 301 Moved Permanently
< Content-Type: text/html
< Date: 
< Location: http://www.apple.com/
< Referer: 
* Server  is not blacklisted
< Server: 
< Content-Length: 0
< 
* Connection #0 to host localhost left intact

What's going on?

Answer 1

Score: 7

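The problem here is name-based virtual hosting: some of the sites you are connecting to serve many domains from the same servers and cannot tell which one you are requesting, because the proxied request still carries the client's Host header (localhost:8000) rather than, for example, www.apple.com. NewSingleHostReverseProxy rewrites the request URL but leaves the Host header untouched. To fix this, the reverse proxy must rewrite the Host header.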
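One way to see this for yourself is to put a throwaway local backend behind the default proxy and record the Host value that reaches it. The following is only a sketch: the helper name hostSeenByBackend and the httptest backend are illustrative stand-ins, not part of the original code, and the backend simply plays the role of the real site.

```go
package main

import (
	"fmt"
	"log"
	"net/http"
	"net/http/httptest"
	"net/http/httputil"
	"net/url"
)

// hostSeenByBackend (hypothetical helper) proxies one request whose Host
// header is clientHost through an unmodified NewSingleHostReverseProxy and
// returns the Host value that arrives at a local test backend.
func hostSeenByBackend(clientHost string) (string, error) {
	seen := make(chan string, 1)
	backend := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		seen <- r.Host // record what the upstream server was told
	}))
	defer backend.Close()

	target, err := url.Parse(backend.URL)
	if err != nil {
		return "", err
	}
	proxy := httputil.NewSingleHostReverseProxy(target)

	// httptest.NewRequest copies the host of an absolute URL into req.Host,
	// which mimics a browser pointed at the proxy's own address.
	req := httptest.NewRequest(http.MethodGet, "http://"+clientHost+"/", nil)
	rec := httptest.NewRecorder()
	proxy.ServeHTTP(rec, req)

	select {
	case h := <-seen:
		return h, nil
	default:
		return "", fmt.Errorf("backend was not reached (proxy status %d)", rec.Code)
	}
}

func main() {
	host, err := hostSeenByBackend("localhost:8000")
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println("backend saw Host:", host)
}
```

The backend should report the client's Host (localhost:8000 here) rather than its own address, which is the value a virtual-hosting front end such as Akamai would use to pick a site.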

Unfortunately, httputil.NewSingleHostReverseProxy doesn't provide an easy way to do this rewriting, so the code below builds the ReverseProxy by hand. Most of it is copied from the net/http/httputil source code; the addition is the req.Host assignment:

package main

import (
	"fmt"
	"log"
	"net/http"
	"net/http/httputil"
	"net/url"
	"strings"
)

func main() {
	host := "www.apple.com"
	u, err := url.Parse(fmt.Sprintf("http://%v/", host))
	if err != nil {
		// Stop here: u is unusable if parsing failed.
		log.Fatalf("Error parsing URL: %v", err)
	}

	targetQuery := u.RawQuery
	p := &httputil.ReverseProxy{
		Director: func(req *http.Request) {
			// Send the target's name in the Host header instead of the
			// client's original value (localhost:8000).
			req.Host = host
			req.URL.Scheme = u.Scheme
			req.URL.Host = u.Host
			req.URL.Path = singleJoiningSlash(u.Path, req.URL.Path)
			if targetQuery == "" || req.URL.RawQuery == "" {
				req.URL.RawQuery = targetQuery + req.URL.RawQuery
			} else {
				req.URL.RawQuery = targetQuery + "&" + req.URL.RawQuery
			}
		},
	}

	http.Handle("/", p)
	log.Fatal(http.ListenAndServe("localhost:8000", nil))
}

// singleJoiningSlash joins two path segments with exactly one slash.
func singleJoiningSlash(a, b string) string {
	aslash := strings.HasSuffix(a, "/")
	bslash := strings.HasPrefix(b, "/")
	switch {
	case aslash && bslash:
		return a + b[1:]
	case !aslash && !bslash:
		return a + "/" + b
	}
	return a + b
}

huangapple
  • Posted on July 30, 2015, 13:07:14
  • When reposting, please keep the link to this article: https://go.coder-hub.com/31715545.html