Golang: capture redirect URLs and status codes with timeouts

Question

I am trying to make a request to a given URL and capture the redirect URLs that were followed, along with their status codes.

I've looked for an answer to my specific question; this came close.

However, I also need to add a proxy, a user agent and a timeout on the entire connection, i.e. no matter how many redirects there are or how much latency the proxy adds, the total time should not exceed X seconds.

I've handled the user agent by setting a request header, and the proxy by adding it to the Transport struct.
I tried exploring CheckRedirect for redirects, but that gives me only the URL. I needed the status code as well, so I had to implement the RoundTrip function.

Everything works well as of now, except for the timeout.
Here's what I have so far: playground link
I've pasted the relevant code here as well. The playground has a full version with a mock redirect server in place. Unfortunately it panics with "connection refused", possibly because of playground restrictions, but it works completely when run locally.

type Redirect struct {
    StatusCode int
    URL string
}

type TransportWrapper struct {
    Transport http.RoundTripper
    Url string
    Proxy string
    UserAgent string
    TimeoutInSeconds int
    FinalUrl string
    RedirectUrls []Redirect
}
// Implementing Round Tripper to capture intermediate urls
func (t *TransportWrapper) RoundTrip(req *http.Request) (*http.Response, error) {
    transport := t.Transport
    if transport == nil {
        transport = http.DefaultTransport
    }

    resp, err := transport.RoundTrip(req)
    if err != nil {
        return resp, err
    }

    // Remember redirects
    if resp.StatusCode >= 300 && resp.StatusCode <= 399 {
        t.RedirectUrls = append(
            t.RedirectUrls, Redirect{resp.StatusCode, req.URL.String()},
        )
    }
    return resp, err
}

func (t *TransportWrapper) Do() (*http.Response, error) {
    t.Transport = &http.Transport{}
    if t.Proxy != "" {
        proxyUrl, err := url.Parse(t.Proxy)
        if err != nil {
            return nil, err
        }

        t.Transport = &http.Transport{Proxy: http.ProxyURL(proxyUrl)}
        // HELP
        // Why does this fail
        // t.Transport.Proxy = http.ProxyUrl(proxyUrl)
    }

    client := &http.Client{
        Transport: t, // Since I've implemented RoundTrip I can pass this
        // Timeout: t.TimeoutInSeconds * time.Second, // This Fails
    }

    req, err := http.NewRequest("GET", t.Url, nil)
    if err != nil {
        return nil, err
    }

    if t.UserAgent != "" {
        req.Header.Set("User-Agent", t.UserAgent)
    }

    resp, err := client.Do(req)
    if err != nil {
        return nil, err
    }

    t.FinalUrl = resp.Request.URL.String()
    return resp, nil
}

func startClient() {
    t := &TransportWrapper{
        Url: "http://127.0.0.1:8080/temporary/redirect?num=5",
        // Proxy
        // UserAgent
        // Timeout
    }

    _, err := t.Do()
    if err != nil {
        panic(err)
    }

    fmt.Printf("Intermediate Urls: \n")
    for i, v := range t.RedirectUrls {
        fmt.Printf("[%d] %d %s\n", i, v.StatusCode, v.URL)
    }

}

Question 1: How do I add the timeout?

Attempt #1:

client := &http.Client{ Transport: t, Timeout: myTimeout }

But Go complains: "*main.TransportWrapper doesn't support CancelRequest; Timeout not supported"

Attempt #2:

// Adding a CancelRequest
func (t *TransportWrapper) CancelRequest(req *http.Request) {
    dt := http.DefaultTransport
    dt.CancelRequest(req)
}

But Go complains: "dt.CancelRequest undefined (type http.RoundTripper has no field or method CancelRequest)"

How do I implement CancelRequest without doing too much work, and just let the default CancelRequest take over?

Question 2: Have I gone down a bad path, and is there an alternative way to solve the problem?

Given a URL, proxy, user agent and timeout, return the response along with the redirect URLs that were followed to get there and their status codes.

I hope I've worded this appropriately.

Thanks

Answer 1

Score: 4

There is already a hook for checking redirects, Client.CheckRedirect.

You can supply a callback to do what you want.
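As a rough sketch of that callback approach: on Go versions where http.Request has the Response field (my understanding is Go 1.7 and later), the request passed to CheckRedirect carries the 3xx response that caused it, so the status code is available without a custom transport. The helper names fetchWithRedirects and newRedirectServer, and the /a, /b, /c paths, are made up for this illustration and are not from the question.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// Redirect mirrors the struct from the question.
type Redirect struct {
	StatusCode int
	URL        string
}

// fetchWithRedirects (hypothetical helper) follows redirects from startURL
// and records every hop from inside CheckRedirect.
func fetchWithRedirects(startURL string) ([]Redirect, string, error) {
	var hops []Redirect
	client := &http.Client{
		CheckRedirect: func(req *http.Request, via []*http.Request) error {
			// req.Response is the 3xx response that triggered this hop;
			// via holds the requests made so far, oldest first.
			if req.Response != nil {
				hops = append(hops, Redirect{
					StatusCode: req.Response.StatusCode,
					URL:        via[len(via)-1].URL.String(),
				})
			}
			// Setting CheckRedirect replaces the default limit, so keep one.
			if len(via) >= 10 {
				return fmt.Errorf("stopped after %d redirects", len(via))
			}
			return nil
		},
	}
	resp, err := client.Get(startURL)
	if err != nil {
		return hops, "", err
	}
	defer resp.Body.Close()
	return hops, resp.Request.URL.String(), nil
}

// newRedirectServer is a stand-in for the question's mock server:
// /a redirects to /b, /b redirects to /c, and /c answers 200.
func newRedirectServer() *httptest.Server {
	mux := http.NewServeMux()
	mux.Handle("/a", http.RedirectHandler("/b", http.StatusFound))
	mux.Handle("/b", http.RedirectHandler("/c", http.StatusMovedPermanently))
	mux.HandleFunc("/c", func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprintln(w, "done")
	})
	return httptest.NewServer(mux)
}

func main() {
	srv := newRedirectServer()
	defer srv.Close()

	hops, final, err := fetchWithRedirects(srv.URL + "/a")
	if err != nil {
		panic(err)
	}
	for i, h := range hops {
		fmt.Printf("[%d] %d %s\n", i, h.StatusCode, h.URL)
	}
	fmt.Println("final:", final)
}
```

Because the slice lives in the closure, each call gets its own record and nothing is shared between concurrent fetches.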

If you really want to create your own transport to extend other functionality, you need to supply a CancelRequest method, as the error says, so that Client.Timeout can be handled.

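// Forward to the wrapped transport; the assertion is needed because the field's type is http.RoundTripper.
func (t *TransportWrapper) CancelRequest(req *http.Request) {
    if c, ok := t.Transport.(interface{ CancelRequest(*http.Request) }); ok {
        c.CancelRequest(req)
    }
}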
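The error in the question comes from the Go releases of that time (2015). In more recent releases, as far as I'm aware, Transport.CancelRequest is deprecated in favour of request contexts, and Client.Timeout is delivered through the request's context. A wrapper that hands the request on unchanged should therefore not need its own CancelRequest. The sketch below illustrates this without any network access. slowTransport, loggingTransport and timedGet are invented names, and the stub simply pretends the server takes five seconds to answer.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
	"time"
)

// slowTransport is a hypothetical stand-in for a slow server or proxy:
// it answers only when the request's context ends or five seconds pass.
type slowTransport struct{}

func (slowTransport) RoundTrip(req *http.Request) (*http.Response, error) {
	select {
	case <-req.Context().Done():
		return nil, req.Context().Err()
	case <-time.After(5 * time.Second):
		return nil, errors.New("stub server gave up")
	}
}

// loggingTransport wraps another RoundTripper and deliberately has no
// CancelRequest method.
type loggingTransport struct {
	next http.RoundTripper
}

func (t *loggingTransport) RoundTrip(req *http.Request) (*http.Response, error) {
	// Pass the request through untouched so its context, and with it the
	// client's deadline, reaches the inner transport.
	return t.next.RoundTrip(req)
}

// timedGet reports how long a GET took and the error it ended with.
func timedGet(timeout time.Duration) (time.Duration, error) {
	client := &http.Client{
		Transport: &loggingTransport{next: slowTransport{}},
		Timeout:   timeout, // covers the whole exchange, redirects included
	}
	start := time.Now()
	resp, err := client.Get("http://example.invalid/")
	if err == nil {
		resp.Body.Close()
	}
	return time.Since(start), err
}

func main() {
	elapsed, err := timedGet(200 * time.Millisecond)
	fmt.Printf("gave up after about %v: %v\n", elapsed.Round(100*time.Millisecond), err)
}
```

If you need a deadline per request rather than per client, building the request with http.NewRequestWithContext and a context.WithTimeout context is the usual alternative.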

More commonly, you would embed the Transport so that all its methods and fields are promoted automatically. You should avoid writable fields in the transport, however, since a transport is expected to be safe for concurrent use. Otherwise, protect all access with a mutex, or make sure the transport is only ever used from one goroutine.
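To make the mutex option concrete, here is a minimal sketch of an embedding wrapper whose recorded redirects are guarded by a sync.Mutex. RecordingTransport, Redirects and countRedirects are illustrative names, and the /old to /new redirect is served by a throwaway httptest server rather than the question's mock server.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"sync"
)

type Redirect struct {
	StatusCode int
	URL        string
}

// RecordingTransport embeds *http.Transport, so its methods and fields are
// promoted, and guards the recorded hops with a mutex.
type RecordingTransport struct {
	*http.Transport

	mu        sync.Mutex
	redirects []Redirect
}

func (t *RecordingTransport) RoundTrip(req *http.Request) (*http.Response, error) {
	resp, err := t.Transport.RoundTrip(req)
	if err != nil {
		return nil, err
	}
	if resp.StatusCode >= 300 && resp.StatusCode <= 399 {
		t.mu.Lock()
		t.redirects = append(t.redirects, Redirect{resp.StatusCode, req.URL.String()})
		t.mu.Unlock()
	}
	return resp, nil
}

// Redirects returns a copy so callers never share the guarded slice.
func (t *RecordingTransport) Redirects() []Redirect {
	t.mu.Lock()
	defer t.mu.Unlock()
	return append([]Redirect(nil), t.redirects...)
}

// countRedirects issues n concurrent GETs through one shared transport and
// returns how many redirects were recorded.
func countRedirects(n int) int {
	mux := http.NewServeMux()
	mux.Handle("/old", http.RedirectHandler("/new", http.StatusFound))
	mux.HandleFunc("/new", func(w http.ResponseWriter, r *http.Request) {})
	srv := httptest.NewServer(mux)
	defer srv.Close()

	rt := &RecordingTransport{Transport: &http.Transport{}}
	client := &http.Client{Transport: rt}

	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			resp, err := client.Get(srv.URL + "/old")
			if err == nil {
				resp.Body.Close()
			}
		}()
	}
	wg.Wait()
	rt.CloseIdleConnections() // promoted from the embedded *http.Transport
	return len(rt.Redirects())
}

func main() {
	fmt.Println("redirects recorded:", countRedirects(20))
}
```

Note that a shared transport mixes the hops of all requests into one list. If you need the chain for a single fetch, a fresh wrapper per call avoids both the mixing and the locking.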

A minimal example would look like:

type TransportWrapper struct {
	*http.Transport
	RedirectUrls []Redirect
}

func (t *TransportWrapper) RoundTrip(req *http.Request) (*http.Response, error) {
	transport := t.Transport
	if transport == nil {
		transport = http.DefaultTransport.(*http.Transport)
	}

	resp, err := transport.RoundTrip(req)
	if err != nil {
		return resp, err
	}

	// Remember redirects
	if resp.StatusCode >= 300 && resp.StatusCode <= 399 {
		fmt.Println("redirected")
		t.RedirectUrls = append(
			t.RedirectUrls, Redirect{resp.StatusCode, req.URL.String()},
		)
	}
	return resp, err
}

And you can then use the timeout in the client:

client := &http.Client{
	Transport: &TransportWrapper{
		Transport: http.DefaultTransport.(*http.Transport),
	},
	Timeout: 5 * time.Second,
}
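Putting the pieces together for the question's full requirement (URL, proxy, user agent and an overall timeout), one possible shape is sketched below. Fetch, recorder and newServer are hypothetical names. The wrapper is created per call, so its slice is only touched by that call's sequential round trips. Keeping the *http.Transport in a variable of its concrete type is also what makes the `base.Proxy = ...` assignment compile, which the question's `t.Transport.Proxy` could not do because that field is declared as the http.RoundTripper interface.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"net/url"
	"time"
)

type Redirect struct {
	StatusCode int
	URL        string
}

// recorder wraps an inner transport and notes every 3xx response it sees.
type recorder struct {
	next http.RoundTripper
	hops []Redirect
}

func (r *recorder) RoundTrip(req *http.Request) (*http.Response, error) {
	resp, err := r.next.RoundTrip(req)
	if err != nil {
		return nil, err
	}
	if resp.StatusCode >= 300 && resp.StatusCode <= 399 {
		r.hops = append(r.hops, Redirect{resp.StatusCode, req.URL.String()})
	}
	return resp, nil
}

// Fetch (hypothetical helper) returns the redirects followed and the final
// URL. proxy and userAgent may be empty.
func Fetch(target, proxy, userAgent string, timeout time.Duration) ([]Redirect, string, error) {
	base := &http.Transport{}
	if proxy != "" {
		proxyURL, err := url.Parse(proxy)
		if err != nil {
			return nil, "", err
		}
		base.Proxy = http.ProxyURL(proxyURL)
	}
	defer base.CloseIdleConnections()

	rec := &recorder{next: base}
	client := &http.Client{Transport: rec, Timeout: timeout}

	req, err := http.NewRequest(http.MethodGet, target, nil)
	if err != nil {
		return nil, "", err
	}
	if userAgent != "" {
		req.Header.Set("User-Agent", userAgent)
	}

	resp, err := client.Do(req)
	if err != nil {
		return rec.hops, "", err
	}
	defer resp.Body.Close()
	return rec.hops, resp.Request.URL.String(), nil
}

// newServer is a stand-in for the question's mock redirect server.
func newServer() *httptest.Server {
	mux := http.NewServeMux()
	mux.Handle("/a", http.RedirectHandler("/b", http.StatusTemporaryRedirect))
	mux.Handle("/b", http.RedirectHandler("/c", http.StatusTemporaryRedirect))
	mux.HandleFunc("/c", func(w http.ResponseWriter, r *http.Request) {})
	return httptest.NewServer(mux)
}

func main() {
	srv := newServer()
	defer srv.Close()

	hops, final, err := Fetch(srv.URL+"/a", "", "my-agent/1.0", 5*time.Second)
	if err != nil {
		panic(err)
	}
	for i, h := range hops {
		fmt.Printf("[%d] %d %s\n", i, h.StatusCode, h.URL)
	}
	fmt.Println("final:", final)
}
```

This version closes the response body before returning, so a real helper would also need to read or hand back the body if the caller wants it.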

huangapple
  • Posted on August 22, 2015 at 17:51:57
  • When reposting, please keep this link: https://go.coder-hub.com/32154712.html