Identify a double-byte character in a string and convert it into a single-byte character
Question
In my Go project I am dealing with Asian languages, so there are double-byte characters. In my case, I have a string that contains two words with a space between them.
Example: こんにちは　世界
Now I need to check whether that space is a double-byte space and, if so, convert it into a single-byte space.
I have searched a lot but couldn't find a way to do this, so I'm sorry that I have no code sample to add here.
Do I need to loop through each character, pick out the double-byte space by its code, and replace it? What code should I use to identify the double-byte space?
Answer 1
Score: 2
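Just replace it?

package main

import (
	"fmt"
	"strings"
)

func main() {
	// "\u3000" is the full-width (ideographic) space; -1 means replace every occurrence.
	fmt.Println(strings.Replace("こんにちは\u3000世界", "\u3000", " ", -1))
}

Notice that the second argument to Replace is the full-width space (U+3000), the same character as in your example string; it is written here as the escape "\u3000" so that it stays visible. Replace finds every occurrence of that rune in the original string and replaces it with an ASCII space.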
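Since the question also asks how to check whether the space is a full-width one, here is a minimal sketch that does the check and the replacement together. The helper name normalizeSpaces and the constant ideographicSpace are made up for this illustration; strings.ContainsRune and strings.ReplaceAll are standard library functions (ReplaceAll needs Go 1.12 or later).

```go
package main

import (
	"fmt"
	"strings"
)

// ideographicSpace is U+3000, the full-width space used in CJK text.
const ideographicSpace = '\u3000'

// normalizeSpaces is a hypothetical helper: it reports whether s contains
// a full-width space and returns s with each one replaced by an ASCII space.
func normalizeSpaces(s string) (string, bool) {
	if !strings.ContainsRune(s, ideographicSpace) {
		return s, false // nothing to convert
	}
	return strings.ReplaceAll(s, string(ideographicSpace), " "), true
}

func main() {
	out, found := normalizeSpaces("こんにちは\u3000世界")
	fmt.Printf("%q %v\n", out, found)
}
```

The boolean result is only there to answer the "is it a double-byte space?" part; if you don't need it, calling strings.ReplaceAll directly is enough, because it returns the string unchanged when there is nothing to replace.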
Answer 2
Score: 2
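Go has no notion of a "double-byte character". Strings are sequences of bytes, conventionally UTF-8, and the special type rune, which is an alias for int32, holds a single Unicode code point.
Your special space is code point 12288 (U+3000, the ideographic space, which takes three bytes in UTF-8), and the normal space is code point 32 (U+0020).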
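To see these numbers for yourself, the following small sketch decodes the first rune of a string and reports how many bytes it occupies. The helper name runeInfo is invented for this example; utf8.DecodeRuneInString is from the standard unicode/utf8 package.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// runeInfo is a hypothetical helper: it returns the code point of the
// first rune in s and the number of bytes that rune occupies in UTF-8.
func runeInfo(s string) (rune, int) {
	return utf8.DecodeRuneInString(s)
}

func main() {
	r, n := runeInfo("\u3000") // full-width space
	fmt.Println(r, n)          // prints: 12288 3

	r, n = runeInfo(" ") // ASCII space
	fmt.Println(r, n)    // prints: 32 1
}
```

So the "double-byte" space is really a three-byte sequence once the string is UTF-8, which is why comparing runes is more reliable than counting bytes.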
To iterate over the characters of a string you can use range:
for _, char := range chars {...} // char is of type rune
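As a runnable illustration of the range loop, this sketch collects the byte offset of every full-width space in a string. The function name fullWidthSpaceOffsets is hypothetical; the only thing it relies on is that range over a string yields a byte offset and a rune on each iteration.

```go
package main

import "fmt"

// fullWidthSpaceOffsets is a hypothetical helper: it returns the byte
// offset of every U+3000 rune found in s.
func fullWidthSpaceOffsets(s string) []int {
	var offsets []int
	for i, r := range s { // i is a byte offset, r is a rune
		if r == '\u3000' {
			offsets = append(offsets, i)
		}
	}
	return offsets
}

func main() {
	// Each of the five hiragana takes 3 bytes, so the space starts at byte 15.
	fmt.Println(fullWidthSpaceOffsets("こんにちは\u3000世界")) // prints: [15]
}
```

Note that the offset counts bytes, not characters, so it can be used directly for slicing the string.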
To replace this character you can use strings.Replace, or strings.Map with a function that replaces the unwanted characters:

func converter(r rune) rune {
	if r == 12288 { // U+3000, full-width space
		return 32 // U+0020, ASCII space
	}
	return r
}

// inside a function:
result := strings.Map(converter, "こんにちは\u3000世界")
It is also possible to use character literals instead of numbers:

if r == '\u3000' { // the full-width space; the literal character itself also works
	return ' '
}