Golang: Why aren't these two strings equal?

Question

I copied and pasted these two strings (one from a Google Doc and one from terminal) – what the heck is going on? And how can I clean them up so they are the same?

package main

import "fmt"

func main() {
	fmt.Println([]byte("f6f77482e4394a21815b7090bc0185b4"))
	fmt.Println([]byte("f6f77482­e439­4a21­815b­7090bc0185b4"))
}

Returns:

[102 54 102 55 55 52 56 50 101 52 51 57 52 97 50 49 56 49 53 98 55 48 57 48 98 99 48 49 56 53 98 52]
[102 54 102 55 55 52 56 50 194 173 101 52 51 57 194 173 52 97 50 49 194 173 56 49 53 98 194 173 55 48 57 48 98 99 48 49 56 53 98 52]

Which are clearly two different byte arrays for the same string.

https://play.golang.org/p/_zd7tjqCZl

Answer 1

Score: 15

The second one has a number of "soft hyphen" (U+00AD) characters between the visible characters, the first one appearing between "482" and "e4". A soft hyphen is a character that's invisible unless it happens to be at the location of a line-break, then it appears as a hyphen. Did you copy-paste the code from a word processor or some other program that might have applied special text formatting to it?
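
To answer the "how can I clean them up" part of the question, one option is to strip the soft hyphens before comparing. The sketch below is a minimal illustration, not the answerer's code: the helper name `stripSoftHyphens` is made up for this example, and the soft hyphens are written as `\u00ad` escapes so that they are visible in the source.

```go
package main

import (
	"fmt"
	"strings"
)

// stripSoftHyphens is a hypothetical helper that removes every
// soft hyphen (U+00AD) from s and leaves all other characters untouched.
func stripSoftHyphens(s string) string {
	return strings.Replace(s, "\u00ad", "", -1)
}

func main() {
	clean := "f6f77482e4394a21815b7090bc0185b4"
	// Same text as the pasted string, with the soft hyphens spelled out.
	pasted := "f6f77482\u00ade439\u00ad4a21\u00ad815b\u00ad7090bc0185b4"

	fmt.Println(clean == pasted)                   // false
	fmt.Println(clean == stripSoftHyphens(pasted)) // true
}
```

This only removes U+00AD. If the text might carry other invisible characters, a broader filter (for example `strings.Map` combined with `unicode.IsPrint`) may be a better fit, depending on what input is expected.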

Answer 2

Score: 6

The problem is that the second string contains 4 Unicode soft hyphens (U+00AD), which are not printable on the playground.

What you are actually doing is basically similar to...

fmt.Println([]byte("f6f77482e4394a21815b7090bc0185b4"))
fmt.Println([]byte("f6f77482­-e439­-4a21-­815b­-7090bc0185b4"))

This is what it looks like pasted into vim:

[Image: Why aren't these two strings equal?]
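
The hidden characters can also be made visible from Go itself, without pasting into an editor. The following is a small sketch under the assumption that the pasted string matches the one in the question. The helper names `reveal` and `countSoftHyphens` are invented for illustration. The `%+q` verb prints a quoted string with every non-ASCII character escaped.

```go
package main

import (
	"fmt"
	"strings"
)

// reveal is a hypothetical helper that returns s as a Go-quoted string
// with all non-ASCII characters shown as escape sequences.
func reveal(s string) string {
	return fmt.Sprintf("%+q", s)
}

// countSoftHyphens is a hypothetical helper that reports how many
// soft hyphens (U+00AD) s contains.
func countSoftHyphens(s string) int {
	return strings.Count(s, "\u00ad")
}

func main() {
	pasted := "f6f77482\u00ade439\u00ad4a21\u00ad815b\u00ad7090bc0185b4"

	fmt.Println(reveal(pasted))
	// "f6f77482\u00ade439\u00ad4a21\u00ad815b\u00ad7090bc0185b4"

	fmt.Println(countSoftHyphens(pasted)) // 4
}
```

Printing suspicious input with `%+q` instead of `%s` is a quick way to spot this kind of copy-paste problem while debugging.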

huangapple
  • Posted on 2016-03-12 11:34:14
  • Link to this article: https://go.coder-hub.com/35953436.html