Too many redirects, but through what route?

Question

I've got a simple web scraper/spider based on goquery, which in turn uses net/http. It works great, until I hit a website with too many redirects.

>Get http://www.example.com/some/path.html: stopped after 10 redirects

But why? Did it redirect to itself? Did it throw me into some spider jail? I want to know which URLs I was redirected to, and in what order.

The function giving the error seems to know this, since it's checking the length of a slice of requests, but I don't really want to edit the net/http package myself.

Here's that function from http://golang.org/src/pkg/net/http/client.go:

func defaultCheckRedirect(req *Request, via []*Request) error {
	if len(via) >= 10 {
		return errors.New("stopped after 10 redirects")
	}
	return nil
}

Answer 1

Score: 2

You can pass your own function to http.Client, for example:

// Requires the errors, log, and net/http packages.
client := &http.Client{
	CheckRedirect: func(req *http.Request, via []*http.Request) error {
		// Log the URL the client is about to be redirected to.
		log.Println("redirect", req.URL)
		if len(via) >= 10 {
			return errors.New("stopped after 10 redirects")
		}
		return nil
	},
}

Setting the CheckRedirect field of http.Client to your own function lets you customize how redirects are handled. In this example the function logs the target URL of every redirect, and returns an error once ten requests have already been made.
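
To see the whole route rather than a line in the log, the same hook can collect the URLs into a slice. The following is only a minimal sketch under a few assumptions: traceRedirects, firstRepeat, and newLoopServer are hypothetical names made up for this illustration, and the local test server that bounces between /a and /b stands in for the real site. It relies on CheckRedirect being called with the upcoming request in req and the requests already made, oldest first, in via.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// traceRedirects (hypothetical helper) fetches startURL and returns the
// start URL followed by every redirect target, in order. When the limit
// is reached, the last entry is the target the client refused to follow.
func traceRedirects(startURL string, maxRedirects int) ([]string, error) {
	chain := []string{startURL}
	client := &http.Client{
		CheckRedirect: func(req *http.Request, via []*http.Request) error {
			// req is the upcoming request; via holds the ones already made.
			chain = append(chain, req.URL.String())
			if len(via) >= maxRedirects {
				return fmt.Errorf("stopped after %d redirects", maxRedirects)
			}
			return nil
		},
	}
	resp, err := client.Get(startURL)
	if err != nil {
		// On a CheckRedirect error the previous response body is already closed.
		return chain, err
	}
	resp.Body.Close()
	return chain, nil
}

// firstRepeat (hypothetical helper) reports the first URL that occurs
// twice in the chain, which indicates a redirect loop.
func firstRepeat(chain []string) (string, bool) {
	seen := make(map[string]bool)
	for _, u := range chain {
		if seen[u] {
			return u, true
		}
		seen[u] = true
	}
	return "", false
}

// newLoopServer starts a local test server, assumed here purely for
// demonstration, whose /a and /b pages redirect to each other forever.
func newLoopServer() *httptest.Server {
	mux := http.NewServeMux()
	mux.HandleFunc("/a", func(w http.ResponseWriter, r *http.Request) {
		http.Redirect(w, r, "/b", http.StatusFound)
	})
	mux.HandleFunc("/b", func(w http.ResponseWriter, r *http.Request) {
		http.Redirect(w, r, "/a", http.StatusFound)
	})
	return httptest.NewServer(mux)
}

func main() {
	srv := newLoopServer()
	defer srv.Close()

	chain, err := traceRedirects(srv.URL+"/a", 10)
	for i, u := range chain {
		fmt.Println(i, u)
	}
	if err != nil {
		fmt.Println("error:", err)
	}
	if u, ok := firstRepeat(chain); ok {
		fmt.Println("loop detected at", u)
	}
}
```

In a real scraper you would pass the problem URL to traceRedirects instead of the test server address. A repeated entry in the chain suggests the site is redirecting in a loop, while a long chain of distinct URLs points to something else, such as a deliberate crawler trap.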

huangapple
  • Posted on August 14, 2014 at 23:52:48
  • When reposting, please keep the link to this article: https://go.coder-hub.com/25312385.html