Why is httputil.NewSingleHostReverseProxy causing an error on some www sites?
Question
In the example below:
    package main

    import (
        "fmt"
        "log"
        "net/http"
        "net/http/httputil"
        "net/url"
    )

    func main() {
        p := new(Proxy)
        //host := "www.google.com" // WORKS AS EXPECTED
        host := "www.apple.com" // GIVES AN ERROR
        u, err := url.Parse(fmt.Sprintf("http://%v/", host))
        if err != nil {
            log.Printf("Error parsing URL")
        }
        p.proxy = httputil.NewSingleHostReverseProxy(u)
        http.Handle("/", p)
        log.Fatal(http.ListenAndServe("localhost:8000", nil))
    }

    type Proxy struct {
        proxy *httputil.ReverseProxy
    }

    func (p *Proxy) ServeHTTP(w http.ResponseWriter, r *http.Request) {
        p.proxy.ServeHTTP(w, r)
    }
Swapping 'www.google.com' for 'www.apple.com' results in this error when pointing Chrome to 'localhost:8000':
> Invalid URL
>
> The requested URL "/", is invalid.
> Reference #9.a61a32b8.1438231668.41733295
Doing a little bit more digging, for www.apple.com, I am getting:
    ➜ ~ curl --ipv4 -v localhost:8000
    < HTTP/1.1 400 Bad Request
    < Content-Length: 194
    < Content-Type: text/html
    < Date: Thu, 30 Jul 2015 05:20:38 GMT
    < Expires: Thu, 30 Jul 2015 05:20:38 GMT
    < Mime-Version: 1.0
    * Server AkamaiGHost is not blacklisted
    < Server: AkamaiGHost
    <
    <HTML><HEAD>
    <TITLE>Invalid URL</TITLE>
    </HEAD><BODY>
    <H1>Invalid URL</H1>
    The requested URL "&#47;", is invalid.<p>
    Reference&#32;&#35;9&#46;65b454b8&#46;1438233638&#46;1f1b8a40
    </BODY></HTML>
    * Connection #0 to host localhost left intact
and for www.google.com:
    ➜ ~ curl --ipv4 -v localhost:8000
    < HTTP/1.1 302 Found
    < Alternate-Protocol: 80:quic,p=0
    < Cache-Control: private
    < Content-Length: 219
    < Content-Type: text/html; charset=UTF-8
    < Date: Thu, 30 Jul 2015 05:03:16 GMT
    < Location: http://www.google.com/
    * Server sffe is not blacklisted
    < Server: sffe
    < X-Content-Type-Options: nosniff
    < X-Xss-Protection: 1; mode=block
    <
    <HTML><HEAD><meta http-equiv="content-type" content="text/html;charset=utf-8">
    <TITLE>302 Moved</TITLE></HEAD><BODY>
    <H1>302 Moved</H1>
    The document has moved
    <A HREF="http://www.google.com/">here</A>.
    </BODY></HTML>
    * Connection #0 to host localhost left intact
Now when I use 'apple.com' instead of 'www.apple.com', things work fine:
    ➜ ~ curl --ipv4 -v localhost:8000
    < HTTP/1.1 301 Moved Permanently
    < Content-Type: text/html
    < Date:
    < Location: http://www.apple.com/
    < Referer:
    * Server is not blacklisted
    < Server:
    < Content-Length: 0
    <
    * Connection #0 to host localhost left intact
What's going on?
Answer 1
Score: 7
The problem here is virtual servers (name-based virtual hosting): some of the web sites you are connecting to cannot tell which domain you are requesting, because the proxy forwards the client's Host HTTP header field unchanged. It is still set to localhost:8000 rather than, for example, www.apple.com. To fix this, the reverse proxy must rewrite the Host header.

Unfortunately, httputil.NewSingleHostReverseProxy doesn't provide an easy way to do the rewriting, so most of what I have added below has been copied from the net/http/httputil source code:
    package main

    import (
        "fmt"
        "log"
        "net/http"
        "net/http/httputil"
        "net/url"
        "strings"
    )

    func main() {
        host := "www.apple.com"
        u, err := url.Parse(fmt.Sprintf("http://%v/", host))
        if err != nil {
            // Stop here: continuing with a nil URL would panic below.
            log.Fatalf("Error parsing URL: %v", err)
        }

        targetQuery := u.RawQuery
        p := &httputil.ReverseProxy{
            Director: func(req *http.Request) {
                // Rewrite the Host header so the upstream virtual host
                // sees its own name instead of localhost:8000.
                req.Host = host
                // The rest mirrors the default single-host Director.
                req.URL.Scheme = u.Scheme
                req.URL.Host = u.Host
                req.URL.Path = singleJoiningSlash(u.Path, req.URL.Path)
                if targetQuery == "" || req.URL.RawQuery == "" {
                    req.URL.RawQuery = targetQuery + req.URL.RawQuery
                } else {
                    req.URL.RawQuery = targetQuery + "&" + req.URL.RawQuery
                }
            },
        }

        http.Handle("/", p)
        log.Fatal(http.ListenAndServe("localhost:8000", nil))
    }

    // singleJoiningSlash joins a and b with exactly one slash between them.
    func singleJoiningSlash(a, b string) string {
        aslash := strings.HasSuffix(a, "/")
        bslash := strings.HasPrefix(b, "/")
        switch {
        case aslash && bslash:
            return a + b[1:]
        case !aslash && !bslash:
            return a + "/" + b
        }
        return a + b
    }
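
As an alternative to copying the Director logic, the default Director that NewSingleHostReverseProxy installs can be wrapped so that only the Host rewrite is added. The sketch below is a minimal illustration under that assumption, not the answer's original code: newProxy and hostSeenByBackend are hypothetical helper names, and two local httptest servers stand in for the real site and for the proxy listener. The backend simply echoes the Host it receives, which makes the effect of the rewrite visible.

```go
package main

import (
	"fmt"
	"io"
	"log"
	"net/http"
	"net/http/httptest"
	"net/http/httputil"
	"net/url"
)

// newProxy builds a reverse proxy for target. When rewriteHost is true it
// wraps the default Director so the outgoing Host header names the target.
// (newProxy is a hypothetical helper, not part of net/http/httputil.)
func newProxy(target *url.URL, rewriteHost bool) *httputil.ReverseProxy {
	proxy := httputil.NewSingleHostReverseProxy(target)
	if rewriteHost {
		defaultDirector := proxy.Director
		proxy.Director = func(req *http.Request) {
			defaultDirector(req) // keep the stock scheme/host/path/query rewriting
			req.Host = target.Host
		}
	}
	return proxy
}

// hostSeenByBackend sends one request through the proxy and returns the Host
// value the backend received, together with the backend's own host:port.
func hostSeenByBackend(rewriteHost bool) (seen, backendHost string, err error) {
	// Stand-in for the upstream site: it replies with the Host it was sent.
	backend := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, r.Host)
	}))
	defer backend.Close()

	target, err := url.Parse(backend.URL)
	if err != nil {
		return "", "", err
	}

	// Stand-in for the proxy listening on localhost:8000.
	frontend := httptest.NewServer(newProxy(target, rewriteHost))
	defer frontend.Close()

	resp, err := http.Get(frontend.URL)
	if err != nil {
		return "", "", err
	}
	defer resp.Body.Close()

	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return "", "", err
	}
	return string(body), target.Host, nil
}

func main() {
	for _, rewrite := range []bool{false, true} {
		seen, backendHost, err := hostSeenByBackend(rewrite)
		if err != nil {
			log.Fatal(err)
		}
		fmt.Printf("rewriteHost=%v: backend saw Host %q (its own address is %q)\n",
			rewrite, seen, backendHost)
	}
}
```

With rewriteHost set to false the backend should report the proxy's address as Host, which is the situation that makes www.apple.com answer 400. With it set to true the backend should see its own address. Against a real site, the same wrapper would set req.Host to the upstream host name, such as www.apple.com.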