How can I translate this IDNA URL to Unicode?

Question

I want to translate an IDNA ASCII URL to Unicode.

    package main

    import (
        "golang.org/x/net/idna"
        "log"
    )

    func main() {
        input := "https://xn---36-mddtcafmzdgfgpbxs0h7c.xn--p1ai"
        idnaProfile := idna.New()
        output, err := idnaProfile.ToUnicode(input)
        if err != nil {
            log.Fatal(err)
        }
        log.Printf("%s", output)
    }

The output is: https://xn---36-mddtcafmzdgfgpbxs0h7c.рф

It seems the IDNA package only converts the TLD. Is there some option that can convert the full URL?

I need to get the same result as when I paste the ASCII URL into Chrome:
https://природный-источник36.рф

Answer 1

Score: 1

You simply need to parse the URL first:

    package main

    import (
        "golang.org/x/net/idna"
        "net/url"
    )

    func main() {
        // Parse first so the scheme is separated from the hostname.
        p, err := url.Parse("https://xn---36-mddtcafmzdgfgpbxs0h7c.xn--p1ai")
        if err != nil {
            panic(err)
        }
        // Hostname() omits any ":port" suffix, so only domain labels are decoded.
        s, err := idna.ToUnicode(p.Hostname())
        if err != nil {
            panic(err)
        }
        println(s == "природный-источник36.рф")
    }

https://golang.org/pkg/net/url#Parse

Answer 2

Score: -1

An IDNA string consists of "labels" separated by dots ("."). A label is encoded if it starts with "xn--" and unencoded otherwise. Split on dots, your string consists of two labels: https://xn---36-mddtcafmzdgfgpbxs0h7c and xn--p1ai. Only the second one starts with "xn--", so only that one is treated as IDNA encoded and converted.

Process only the part of the URL that is IDNA encoded, i.e. the hostname. Passing anything else, such as the scheme, is nonsensical and cannot work.

huangapple
  • Posted on May 26, 2021, 17:11:38
  • When reposting, please keep the link to this article: https://go.coder-hub.com/67701899.html