Urls without scheme throw error

Question

I am writing a proxy, and the problem is that some links on websites
have no scheme. For example, on Google:

<a class="ab_dropdownlnk" href="//www.google.com/support/websearch/?source=g&hl=en">

I fetch the URL via Client.Do().
How can I resolve such URLs in Go?

Answer 1

Score: 3

If the URL has no scheme, use a sensible default. For example:

package main

import (
	"fmt"
	"io"
	"log"
	"net/http"
	"net/url"
)

func main() {
	href := "//www.google.com/support/websearch/?source=g&hl=en"
	// Named u rather than url so the net/url package is not shadowed.
	u, err := url.Parse(href)
	if err != nil {
		log.Fatal(err)
	}
	// A scheme-relative link parses with an empty Scheme; fall back to a default.
	if u.Scheme == "" {
		u.Scheme = "http"
	}
	req, err := http.NewRequest("GET", u.String(), nil)
	if err != nil {
		log.Fatal(err)
	}
	client := http.Client{}
	res, err := client.Do(req)
	if err != nil {
		log.Fatal(err)
	}
	// Read the whole body, then close it before checking the read error.
	websearch, err := io.ReadAll(res.Body)
	res.Body.Close()
	if err != nil {
		log.Fatal(err)
	}
	fmt.Printf("%s\n", websearch)
}
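
For use inside a proxy, the default-scheme step can be separated from the network request. The following is only a sketch under stated assumptions: `withDefaultScheme` is a hypothetical helper name, not part of the standard library, and the second href is an invented example added to show that links which already carry a scheme are left alone.

```go
package main

import (
	"fmt"
	"log"
	"net/url"
)

// withDefaultScheme is a hypothetical helper (not a standard library
// function). It parses href and, if the parsed URL has no scheme,
// fills in defaultScheme before returning the URL as a string.
func withDefaultScheme(href, defaultScheme string) (string, error) {
	u, err := url.Parse(href)
	if err != nil {
		return "", err
	}
	if u.Scheme == "" {
		u.Scheme = defaultScheme
	}
	return u.String(), nil
}

func main() {
	hrefs := []string{
		"//www.google.com/support/websearch/?source=g&hl=en", // from the question
		"https://www.google.com/",                            // assumed example with a scheme
	}
	for _, href := range hrefs {
		abs, err := withDefaultScheme(href, "http")
		if err != nil {
			log.Fatal(err)
		}
		fmt.Println(abs)
	}
}
```

This only covers hrefs that already contain a host. A path-relative link such as /about has no host either, so it still needs the URL of the page it came from.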

Answer 2

Score: 2

A missing scheme lets the browser choose the protocol, which is handy for sites that offer both
http and https. The browser reuses the protocol with which it reached the current page.
You can use https or http as a default, or act like a browser and reuse the protocol you used to
fetch the page.

For example, something like this:

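// parsedLinks is assumed to be a []*url.URL; requestUrl is the URL of the fetched page.
for _, parsedLink := range parsedLinks {
    if parsedLink.Scheme == "" {
        parsedLink.Scheme = requestUrl.Scheme
    }
}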
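
The browser-like behaviour described in this answer can also be obtained from the `ResolveReference` method of `net/url`, which resolves a reference against a base URL and so inherits the base's scheme for scheme-relative links. The sketch below is one possible way to do it, not necessarily what the answer's author had in mind: `resolveHref` is a hypothetical helper name, and the page URL and the last two hrefs are assumed examples.

```go
package main

import (
	"fmt"
	"log"
	"net/url"
)

// resolveHref is a hypothetical helper. It resolves href against the URL
// of the page the link was found on, so a scheme-relative href inherits
// the page's scheme and a path-relative href inherits scheme and host.
func resolveHref(pageURL, href string) (string, error) {
	base, err := url.Parse(pageURL)
	if err != nil {
		return "", err
	}
	ref, err := url.Parse(href)
	if err != nil {
		return "", err
	}
	return base.ResolveReference(ref).String(), nil
}

func main() {
	// Assumed example: the page the proxy fetched over https.
	page := "https://www.google.com/search?q=go"
	hrefs := []string{
		"//www.google.com/support/websearch/?source=g&hl=en", // scheme-relative, from the question
		"/preferences",        // path-relative (assumed example)
		"http://example.com/", // already absolute (assumed example)
	}
	for _, href := range hrefs {
		abs, err := resolveHref(page, href)
		if err != nil {
			log.Fatal(err)
		}
		fmt.Println(abs)
	}
}
```

Compared with copying the scheme by hand, this also handles path-relative links and leaves absolute links unchanged.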

huangapple
  • Published on April 22, 2013, 01:58:07
  • When reposting, please keep the link to this article: https://go.coder-hub.com/16134392.html