Why does Java's StringLatin1.regionMatchesCI method perform toUpperCase() and then toLowerCase() when comparing chars?
Question
I was looking into the String.equalsIgnoreCase method and found that it ultimately invokes the StringLatin1.regionMatchesCI method.
However, the code of this method seems strange to me. Here it is:
public static boolean regionMatchesCI(byte[] value, int toffset,
                                      byte[] other, int ooffset, int len) {
    int last = toffset + len;
    while (toffset < last) {
        char c1 = (char)(value[toffset++] & 0xff);
        char c2 = (char)(other[ooffset++] & 0xff);
        if (c1 == c2) {
            continue;
        }
        char u1 = Character.toUpperCase(c1);
        char u2 = Character.toUpperCase(c2);
        if (u1 == u2) {
            continue;
        }
        if (Character.toLowerCase(u1) == Character.toLowerCase(u2)) {
            continue;
        }
        return false;
    }
    return true;
}
Why check the uppercase forms and then the lowercase forms? Wouldn't the lowercase check always fail if the uppercase check doesn't match? Am I missing something?
Answer 1
Score: 4
The source code I found for this function (somewhere via Google) includes additional explanatory comments:
// try converting both characters to uppercase.
// If the results match, then the comparison scan should
// continue.
char u1 = Character.toUpperCase(c1);
char u2 = Character.toUpperCase(c2);
if (u1 == u2) {
    continue;
}
// Unfortunately, conversion to uppercase does not work properly
// for the Georgian alphabet, which has strange rules about case
// conversion. So we need to make one last check before
// exiting.
if (Character.toLowerCase(u1) == Character.toLowerCase(u2)) {
    continue;
}
So the final lowercase check looks like a workaround for such edge cases. On GitHub you may find other implementations of this function as well.
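To see that the lowercase check can succeed after the uppercase check has failed, here is a minimal sketch. The class and method names (CaseInsensitiveCharDemo, matchesIgnoreCase, matchesUpperOnly) are made up for illustration and are not part of the JDK. The example pair is U+0130 (LATIN CAPITAL LETTER I WITH DOT ABOVE) and plain 'I' rather than a Georgian letter. Note that U+0130 lies outside Latin-1, so a real String containing it would not go through StringLatin1 at all; the sketch only illustrates the same three-step logic applied to a pair of chars.

```java
public class CaseInsensitiveCharDemo {

    // Mirrors the three checks from the loop body quoted above, for one pair of chars.
    static boolean matchesIgnoreCase(char c1, char c2) {
        if (c1 == c2) {
            return true;
        }
        char u1 = Character.toUpperCase(c1);
        char u2 = Character.toUpperCase(c2);
        if (u1 == u2) {
            return true;
        }
        // Last chance: compare the lowercase forms of the uppercased chars.
        return Character.toLowerCase(u1) == Character.toLowerCase(u2);
    }

    // Hypothetical variant without the final lowercase check, for comparison only.
    static boolean matchesUpperOnly(char c1, char c2) {
        return c1 == c2 || Character.toUpperCase(c1) == Character.toUpperCase(c2);
    }

    public static void main(String[] args) {
        char dottedCapitalI = '\u0130'; // LATIN CAPITAL LETTER I WITH DOT ABOVE
        char plainCapitalI = 'I';

        // Both chars are already uppercase and differ, so the uppercase check fails.
        System.out.println("upper only:  " + matchesUpperOnly(dottedCapitalI, plainCapitalI));
        // Both lowercase to 'i', so the final check accepts the pair.
        System.out.println("three steps: " + matchesIgnoreCase(dottedCapitalI, plainCapitalI));
        // The JDK's own comparison, shown for reference.
        System.out.println("JDK:         " + "\u0130".equalsIgnoreCase("I"));
    }
}
```

Both characters are already uppercase and are different code points, so comparing uppercase forms alone rejects the pair, while Character.toLowerCase maps both to 'i'. That asymmetry in the case mappings is the kind of situation the extra check guards against.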