How do you distinguish filenames when reading/writing unicode filenames?

Question

I am writing Go code, but I don't believe this is unique to Go, so let's generalize it. Imagine a user, via Go code, creates three files with three distinct Unicode names. Notice that the last letters of the filenames are different.

  • καθέδρα.txt
  • καθέδρᾳ.txt
  • καθέδραι.txt

In Go, these are three different, unique strings. However, if you try to create three files with these three names, you end up with only two files saved to disk. The second and third filenames appear to be treated as the same file, so when the script writes three user-created files, one goes "missing".

If you write καθέδρᾳ.txt then καθέδραι.txt you end up with only the first filename.

If you write καθέδραι.txt then καθέδρᾳ.txt you end up with only the first filename.

How do you guard in Go against this strange macOS filename behavior with Unicode? The file system appears to treat two different strings as one filename.

Answer 1

Score: 0

When you choose a case-insensitive file system on macOS, case-insensitive matching is more complex than our intuition would expect. The rules differ depending on the language and script.

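  • In English, the uppercase of a is A.
  • In some languages (Turkish, for example), the uppercase of i is İ.
  • Under Unicode full case folding, ᾳ (U+1FB3) folds to αι, so a case-insensitive comparison can treat the two as equal.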
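
To make the last bullet concrete, here is a minimal Go sketch that flags names likely to collide before anything is written to disk. It assumes the file system compares names roughly the way Unicode full case folding does, which is a guess about the OS rather than something Go guarantees. The names `specialFolds`, `foldKey` and `findCollisions` are made up for this illustration, and the folding table is hand-written and covers only the characters from the question. Note that `strings.EqualFold` in the standard library uses simple (rune-by-rune) folding, so it does not consider ᾳ and αι equal.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// specialFolds is a hypothetical, deliberately tiny table of case foldings
// that change the number of runes. It covers only the example from the
// question; real code would need the full Unicode CaseFolding data.
var specialFolds = map[rune]string{
	'\u1FB3': "\u03b1\u03b9", // ᾳ (alpha with ypogegrammeni) folds to αι
	'\u0345': "\u03b9",       // combining iota subscript, as in decomposed ᾳ
}

// foldKey returns an approximate case-folded key for name: every rune is
// lowercased, then the special table above is applied.
func foldKey(name string) string {
	var b strings.Builder
	for _, r := range name {
		r = unicode.ToLower(r)
		if s, ok := specialFolds[r]; ok {
			b.WriteString(s)
			continue
		}
		b.WriteRune(r)
	}
	return b.String()
}

// findCollisions returns groups of names that share the same approximate
// key, in order of first appearance. Names with a unique key are omitted.
func findCollisions(names []string) [][]string {
	byKey := make(map[string][]string)
	var order []string
	for _, n := range names {
		k := foldKey(n)
		if _, seen := byKey[k]; !seen {
			order = append(order, k)
		}
		byKey[k] = append(byKey[k], n)
	}
	var groups [][]string
	for _, k := range order {
		if len(byKey[k]) > 1 {
			groups = append(groups, byKey[k])
		}
	}
	return groups
}

func main() {
	names := []string{"καθέδρα.txt", "καθέδρᾳ.txt", "καθέδραι.txt"}
	for _, group := range findCollisions(names) {
		fmt.Println("these names may refer to the same file:", group)
	}
	// Simple folding compares rune by rune, so it misses this case.
	fmt.Println("strings.EqualFold:", strings.EqualFold(names[1], names[2]))
}
```

For anything beyond a demonstration, a complete folding implementation is a better fit than a hand-written table; the `golang.org/x/text/cases` package offers a `Fold` caser that may serve here. Even then, the key only approximates what a given file system does.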

There is no real way to guard against this except to detect the file system type.

The cross-platform way to prevent the problem is to have your software write a file and then try to read it back using a different "case". If the read succeeds, the file system treats the two names as the same file.
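
The probe described above could be sketched in Go as follows. This is only an illustration: `namesCollide` and `probeInTempDir` are hypothetical helper names, and the probe runs in the system temporary directory, which may live on a different volume (with different rules) than the directory you actually write to. In real use, pass the target directory to `namesCollide` directly.

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
)

// namesCollide reports whether the file system holding dir treats the names
// a and b as the same file. It creates a, then checks whether b exists.
func namesCollide(dir, a, b string) (bool, error) {
	pathA := filepath.Join(dir, a)
	// O_EXCL makes the probe fail rather than reuse an existing file.
	f, err := os.OpenFile(pathA, os.O_CREATE|os.O_EXCL|os.O_WRONLY, 0o600)
	if err != nil {
		return false, err
	}
	f.Close()
	defer os.Remove(pathA)

	_, err = os.Lstat(filepath.Join(dir, b))
	switch {
	case err == nil:
		return true, nil // b resolves to the file created as a
	case errors.Is(err, fs.ErrNotExist):
		return false, nil
	default:
		return false, err
	}
}

// probeInTempDir runs namesCollide inside a fresh temporary directory and
// removes the directory afterwards.
func probeInTempDir(a, b string) (bool, error) {
	dir, err := os.MkdirTemp("", "fsprobe")
	if err != nil {
		return false, err
	}
	defer os.RemoveAll(dir)
	return namesCollide(dir, a, b)
}

func main() {
	insensitive, err := probeInTempDir("probe.tmp", "PROBE.TMP")
	if err != nil {
		fmt.Println("probe failed:", err)
		return
	}
	fmt.Println("case-insensitive file system:", insensitive)

	greek, err := probeInTempDir("καθέδρᾳ.txt", "καθέδραι.txt")
	if err != nil {
		fmt.Println("probe failed:", err)
		return
	}
	fmt.Println("the two Greek names collide:", greek)
}
```

Probing the exact pair of names you are about to write, rather than a generic upper/lower pair, avoids having to reproduce the file system's folding rules in your own code.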

huangapple
  • Posted on April 7, 2023, 12:13:07
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75955420.html