Extra escape character in Go URL
Question
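I have the following snippet of code:
    // Work on a copy of the base URL and drop any credentials.
    u := *baseURL
    u.User = nil
    // Split an optional "?query" suffix off the path.
    if q := strings.Index(path, "?"); q > 0 {
        u.Path = path[:q]
        u.RawQuery = path[q+1:]
    } else {
        u.Path = path
    }
    log.Printf(" url %v", u.String())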
I see that when baseURL is set to something like http://localhost:9000/buckets/test%?bucket_uuid=7864b0dcdf0a578bd0012c70aef58aca, the url package seems to add an extra escape character near the % sign. For example, the output of the print statement above is:
2015/03/25 12:02:49 url http://localhost:9000/pools/default/buckets/test%2525?bucket_uuid=7864b0dcdf0a578bd0012c70aef58aca
This only seems to happen when the RawQuery field of the URL is set. Any idea why this is happening? I'm using Go version 1.3.3.
Cheers,
Manik
Answer 1
Score: 5
URLs may only contain characters of the ASCII character set, but it is often necessary to include or transmit characters outside that set. In such cases the URL has to be converted into a valid ASCII format.
If the raw URL contains characters outside of the allowed set, they are escaped: each one is replaced with a '%' followed by two hexadecimal digits. The character '%' is therefore special and also has to be escaped (its escaped form starts with '%' as well, followed by its hexadecimal code, 25).
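As a rough illustration of the escaping rule described above, the sketch below applies Go's path escaping once and then twice to a segment ending in '%'. The helper name escapeN is made up for this example, and url.PathEscape and url.PathUnescape were only added in Go 1.8, so this assumes a toolchain newer than the Go 1.3.3 mentioned in the question.

```go
package main

import (
	"fmt"
	"net/url"
)

// escapeN is a hypothetical helper for this illustration: it applies
// url.PathEscape to s exactly n times.
func escapeN(s string, n int) string {
	for i := 0; i < n; i++ {
		s = url.PathEscape(s)
	}
	return s
}

func main() {
	// Escaping once turns the literal '%' into "%25".
	fmt.Println(escapeN("test%", 1)) // test%25

	// Escaping the already-escaped text escapes its '%' again.
	fmt.Println(escapeN("test%", 2)) // test%2525

	// Unescaping once recovers the raw segment.
	raw, err := url.PathUnescape("test%25")
	if err != nil {
		fmt.Println("unescape failed:", err)
		return
	}
	fmt.Println(raw) // test%
}
```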
Since your raw URL contains the character '%', it will be replaced by "%25".
Back to your example: in the printed form you see "%2525". You could ask: why not just "%25"?
This is because your original URL contains a '%' in its escaped form, which means its raw form contains the escape sequence "%25". If you use/interpret this as raw input, the '%' will be replaced by "%25", which will be followed by the "25" from the input, resulting in "%2525".
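The double escaping can be reproduced with a small sketch, assuming (as this answer suggests) that the path string in the question already holds the escaped form "test%25". The function names naiveURL, decodedURL and splitPath, and the shortened bucket_uuid value, are invented for this example. It relies on url.PathUnescape (Go 1.8+) and on the URL.RawPath field (Go 1.5+), neither of which exists in Go 1.3.3; on that version url.QueryUnescape is the closest decoder, with the caveat that it also turns '+' into a space.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// splitPath separates an optional "?query" suffix from path.
func splitPath(path string) (p, query string) {
	if q := strings.Index(path, "?"); q >= 0 {
		return path[:q], path[q+1:]
	}
	return path, ""
}

// naiveURL mirrors the question: the still-escaped path is stored in
// u.Path, which the url package treats as decoded text and escapes again.
func naiveURL(base, path string) (string, error) {
	u, err := url.Parse(base)
	if err != nil {
		return "", err
	}
	u.User = nil
	u.Path, u.RawQuery = splitPath(path)
	u.RawPath = "" // let String() derive the escaped form from Path
	return u.String(), nil
}

// decodedURL decodes the path first, so String() escapes it exactly once.
func decodedURL(base, path string) (string, error) {
	u, err := url.Parse(base)
	if err != nil {
		return "", err
	}
	u.User = nil
	p, query := splitPath(path)
	decoded, err := url.PathUnescape(p)
	if err != nil {
		return "", err
	}
	u.Path, u.RawPath, u.RawQuery = decoded, "", query
	return u.String(), nil
}

func main() {
	const base = "http://localhost:9000"
	const path = "/pools/default/buckets/test%25?bucket_uuid=abc"

	s, _ := naiveURL(base, path)
	fmt.Println(s) // http://localhost:9000/pools/default/buckets/test%2525?bucket_uuid=abc

	s, _ = decodedURL(base, path)
	fmt.Println(s) // http://localhost:9000/pools/default/buckets/test%25?bucket_uuid=abc
}
```

Another option, if the whole path-plus-query string is already escaped, is to hand it to url.Parse and let the package do the decoding itself.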
See: HTML URL Encoding Reference
Also: RFC 1738 - Uniform Resource Locators (URL)
And also: RFC 3986 - Uniform Resource Identifier (URI): Generic Syntax