Extra escape character in go URL
Question
I have the following snippet of code:
    u := *baseURL // work on a copy of the base URL
    u.User = nil
    // Split path into the path proper and the query string, if any.
    if q := strings.Index(path, "?"); q > 0 {
        u.Path = path[:q]
        u.RawQuery = path[q+1:]
    } else {
        u.Path = path
    }
    log.Printf(" url %v", u.String())
When the baseurl is set to something like http://localhost:9000/buckets/test%?bucket_uuid=7864b0dcdf0a578bd0012c70aef58aca, the url package seems to add an extra escape sequence after the % sign. For example, the print statement above produces the following output:
2015/03/25 12:02:49 url http://localhost:9000/pools/default/buckets/test%2525?bucket_uuid=7864b0dcdf0a578bd0012c70aef58aca
This only seems to happen when the RawQuery field of the URL is set. Any idea why this is happening? I'm using Go version 1.3.3.
Cheers,
Manik
Answer 1
Score: 5
URLs may only contain characters from the ASCII character set, but they often need to carry characters outside that set. In such cases the URL has to be converted into a valid ASCII form.
If the raw URL contains characters outside the allowed set, they are escaped: each one is replaced with a '%' followed by two hexadecimal digits. This makes '%' itself a special character that also has to be escaped. Its hexadecimal code is 25, so its escaped form also starts with '%'.
Since your raw URL contains the character '%', it will be replaced by "%25".
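As a small illustration of that rule, the sketch below asks Go's net/url package how it would print a decoded path. The helper name escapedPath is made up for this example, and EscapedPath assumes Go 1.5 or later (newer than the 1.3.3 mentioned in the question).

```go
package main

import (
	"fmt"
	"net/url"
)

// escapedPath is a hypothetical helper for this example: it treats its
// argument as a decoded path and returns the form net/url would print.
func escapedPath(decoded string) string {
	u := url.URL{Path: decoded}
	return u.EscapedPath()
}

func main() {
	// A literal '%' is not allowed as-is, so it is escaped to "%25".
	fmt.Println(escapedPath("/buckets/test%")) // /buckets/test%25
	// A space is escaped the same way, using its hex code 20.
	fmt.Println(escapedPath("/buckets/a b")) // /buckets/a%20b
}
```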
Back to your example: in the printed form you see "%2525". You could ask why not just "%25"?
This is because the path you assign already contains the '%' in its escaped form, that is, the escape sequence "%25". In Go, the Path field of url.URL holds the unescaped path, and String() escapes it on output. So if the already-escaped text is treated as raw input, the '%' is replaced by "%25", which is then followed by the "25" already present in the input, resulting in "%2525".
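The double escaping can be reproduced with a short sketch. The function names joinRaw, joinDecoded and exampleBase are hypothetical, and the input path "/buckets/test%25" is an assumption about what the original path looked like. One possible way to avoid the problem, assuming the incoming path is already percent-encoded, is to decode it before assigning it to Path; url.PathUnescape needs Go 1.8 or later.

```go
package main

import (
	"fmt"
	"net/url"
)

// exampleBase returns a hypothetical base URL for this sketch.
func exampleBase() url.URL {
	return url.URL{Scheme: "http", Host: "localhost:9000"}
}

// joinRaw mimics the question's snippet: the path text is stored in
// Path unchanged, so String() escapes any '%' it contains once more.
func joinRaw(base url.URL, path, query string) string {
	base.Path = path
	base.RawQuery = query
	return base.String()
}

// joinDecoded decodes the path first, so Path holds the unescaped
// form and String() escapes it exactly once.
func joinDecoded(base url.URL, path, query string) (string, error) {
	decoded, err := url.PathUnescape(path)
	if err != nil {
		return "", err
	}
	base.Path = decoded
	base.RawQuery = query
	return base.String(), nil
}

func main() {
	const path = "/buckets/test%25" // assumed: already escaped input
	const query = "bucket_uuid=7864b0dcdf0a578bd0012c70aef58aca"

	fmt.Println(joinRaw(exampleBase(), path, query)) // path ends in test%2525

	fixed, err := joinDecoded(exampleBase(), path, query)
	if err != nil {
		fmt.Println("bad escape sequence:", err)
		return
	}
	fmt.Println(fixed) // path ends in test%25
}
```

Whether decoding is the right fix depends on where the path comes from: if it is genuinely unescaped text that happens to contain "%25", decoding it would change its meaning.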
See: HTML URL Encoding Reference
Also: RFC 1738 - Uniform Resource Locators (URL)
And also: RFC 3986 - Uniform Resource Identifier (URI): Generic Syntax