Sorting strings with Turkish alphabetical order in Dart
Question
I want to sort the Turkish names of countries in Dart, but I couldn't find a way to customize the alphabetical order used during sorting, like collations in Java.
For example, when I sort with the standard String compareTo method, I get:
var countries = ["Belgrad", "Belçika", "Irak", "İran", "Sudan", "Şili", "Çek Cumhuriyeti", "Cezayir", "Ukrayna", "Ürdün"];
countries.sort((firstString, secondString) => firstString.compareTo(secondString));
print(countries);
This prints the names in the wrong order, based on the English alphabet, with the Turkish-specific letters pushed to the end:
[Belgrad, Belçika, Cezayir, Irak, Sudan, Ukrayna, Çek Cumhuriyeti, Ürdün, İran, Şili]
In the Turkish alphabet the order of letters is "a", "b", "c", "ç", "d", "e", "f", "g", "ğ", "h", "ı", "i", "j", "k", "l", "m", "n", "o", "ö", "p", "r", "s", "ş", "t", "u", "ü", "v", "y", "z", so the correct order should be as follows
(note that Belçika must come before Belgrad because ç < g in Turkish):
[Belçika, Belgrad, Cezayir, Çek Cumhuriyeti, Irak, İran, Sudan, Şili, Ukrayna, Ürdün]
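For context, here is a minimal sketch of why the default order comes out this way. Dart's String.compareTo orders strings by their UTF-16 code units, so every letter outside ASCII sorts after all ASCII letters. The helper name firstCodeUnits is made up for illustration and assumes non-empty strings.

```dart
// Hypothetical helper: maps each (non-empty) string to the UTF-16 code unit
// of its first character, which is what compareTo looks at first.
Map<String, int> firstCodeUnits(List<String> words) {
  return {for (final w in words) w: w.codeUnitAt(0)};
}
```

Here "g" maps to 103 and "ç" to 231, which is why Belgrad lands before Belçika, and "İ" (304) and "Ş" (350) end up last.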
Answer 1
Score: 3
You can define whatever sort order you want. Taking your example, you can do something like:
void main() {
const List<String> turkishOrder = ["a", "b", "c", "ç", "d", "e", "f", "g", "ğ", "h", "ı", "i", "i̇", "j", "k", "l", "m", "n", "o", "ö", "p", "r", "s", "ş", "t", "u", "ü", "v", "y", "z"];
List<String> countries = ["Irak", "İran", "Sudan", "Şili", "Çek Cumhuriyeti", "Cezayir", "Ukrayna", "Ürdün"];
countries.sort((String a, String b) => turkishOrder.indexOf(a[0].toLowerCase()).compareTo(turkishOrder.indexOf(b[0].toLowerCase())));
print(countries);
}
What I did here:
- I defined the Turkish order in a List,
- then, to compare two countries, I take the first letter (a[0]) and convert it to lower case, since my turkishOrder list contains only lowercase letters,
- finally, I look up the index of that lowercase first letter and compare it with the index for the second country.
Notes:
- I did not check whether the strings in the countries List are non-empty; a[0] or b[0] would throw an exception for an empty string, so don't forget to handle that.
- I also had to add "i̇" to the turkishOrder List in order to have İran in the right place.
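The two notes above could be handled along these lines. This is only a sketch of one alternative, not the answer's method: it guards against empty strings and maps "İ" and "I" explicitly instead of adding "i̇" to the list. The names firstLetterRank and compareByFirstLetter are hypothetical.

```dart
// The Turkish alphabet in order, as given in the question.
const List<String> turkishOrder = [
  "a", "b", "c", "ç", "d", "e", "f", "g", "ğ", "h", "ı", "i", "j", "k", "l",
  "m", "n", "o", "ö", "p", "r", "s", "ş", "t", "u", "ü", "v", "y", "z",
];

// Hypothetical helper: position of the first letter in turkishOrder,
// or -1 for an empty string or a character that is not in the list.
int firstLetterRank(String s) {
  if (s.isEmpty) return -1; // avoids the RangeError that s[0] would throw
  final first = s[0];
  // Apply the Turkish rules for İ and I before falling back to toLowerCase().
  final lower = first == 'İ'
      ? 'i'
      : first == 'I'
          ? 'ı'
          : first.toLowerCase();
  return turkishOrder.indexOf(lower);
}

// Compares two strings by their first letter only.
int compareByFirstLetter(String a, String b) =>
    firstLetterRank(a).compareTo(firstLetterRank(b));
```

It can be passed straight to sort, for example countries.sort(compareByFirstLetter); empty strings then sort first rather than throwing.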
Edit: the solution above only compares the first letter. Here is an updated version of the comparison function, to compare all letters:
countries.sort((String a, String b) {
  int index = 0;
  while (index < a.length && index < b.length) {
    final int comparison = turkishOrder
        .indexOf(a[index].toLowerCase())
        .compareTo(turkishOrder.indexOf(b[index].toLowerCase()));
    if (comparison != 0) {
      // A non-zero result means the letters differ, so the order is decided
      return comparison;
    }
    // 0 means that the letters are equal, go to the next letter
    index++;
  }
  // One string is a prefix of the other (or they are equal): the shorter one sorts first
  return a.length.compareTo(b.length);
});
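As a variation on the full comparison above, the letter positions can be looked up in a Map built once instead of calling indexOf for every character. This is a sketch under my own assumptions: the names _alphabet, _rank, _turkishLower and compareTurkish are hypothetical, and characters outside the alphabet (spaces, digits) are simply ranked before all letters.

```dart
// The 29 letters of the Turkish alphabet in order.
const String _alphabet = 'abcçdefgğhıijklmnoöprsştuüvyz';

// Letter -> position lookup, built once.
final Map<String, int> _rank = {
  for (var i = 0; i < _alphabet.length; i++) _alphabet[i]: i,
};

// Lowercases with the Turkish rules for İ/i and I/ı applied first.
String _turkishLower(String s) =>
    s.replaceAll('İ', 'i').replaceAll('I', 'ı').toLowerCase();

// Compares two strings letter by letter using Turkish alphabetical order.
int compareTurkish(String a, String b) {
  final x = _turkishLower(a);
  final y = _turkishLower(b);
  for (var i = 0; i < x.length && i < y.length; i++) {
    // Characters that are not in the alphabet get rank -1, i.e. they sort first.
    final diff = (_rank[x[i]] ?? -1).compareTo(_rank[y[i]] ?? -1);
    if (diff != 0) return diff;
  }
  // All shared positions are equal: the shorter string sorts first.
  return x.length.compareTo(y.length);
}
```

Usage would be countries.sort(compareTurkish); on the list from the question, which puts Belçika before Belgrad and Irak before İran.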
Answer 2
Score: 1
Although Lucie's answer works, I ended up writing my own String extension for general-purpose Turkish alphabet operations. The Turkish alphabet has a quirk with the capital letter "I": its lowercase form is "ı" (not "i"), and the lowercase letter "i" has the uppercase form "İ". Dart's toLowerCase function does not handle this Turkish-specific detail.
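To make the casing issue concrete, here is a small sketch that checks whether Dart's locale-independent toLowerCase() agrees with the Turkish rule for a given letter. The helper name defaultLowerMatchesTurkish is hypothetical.

```dart
// Hypothetical helper: true when Dart's default toLowerCase() already
// produces the lowercase form that Turkish expects for this letter.
bool defaultLowerMatchesTurkish(String upper, String turkishLower) =>
    upper.toLowerCase() == turkishLower;
```

It returns true for pairs such as ("Ç", "ç") and ("Ğ", "ğ"), but false for ("I", "ı"), because the default mapping lowercases "I" to "i".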
Strings with Turkish characters can be sorted with the snippet below (based on Daniel's answer):
import 'dart:math' show min; // needed for min() in turkishCompareTo

extension TurkishStringOperations on String {
String toTurkishLowerCase() {
return replaceAll("İ", "i")
.replaceAll("Ş", "ş")
.replaceAll("Ç", "ç")
.replaceAll("Ö", "ö")
.replaceAll("I", "ı")
.replaceAll("Ü", "ü")
.replaceAll("Ğ", "ğ")
.toLowerCase();
}
String toTurkishUpperCase() {
return replaceAll("i", "İ")
.replaceAll("ş", "Ş")
.replaceAll("ç", "Ç")
.replaceAll("ö", "Ö")
.replaceAll("ı", "I")
.replaceAll("ü", "Ü")
.replaceAll("ğ", "Ğ")
.toUpperCase();
}
int turkishCompareTo(String other) {
var letters = [
"a", "b", "c", "ç", "d", "e", "f", "g", "ğ", "h", "ı", "i", "j", "k",
"l", "m", "n", "o", "ö", "p", "r", "s", "ş", "t", "u", "ü", "v", "y",
"z", "w", "q", "x",
];
var that = toTurkishLowerCase();
other = other.toTurkishLowerCase();
for (var i = 0; i < min(that.length, other.length); i++) {
var thatValue = letters.indexOf(that[i]);
var otherValue = letters.indexOf(other[i]);
var result = (thatValue - otherValue).sign;
if (result != 0) {
return result;
}
}
return (that.length - other.length).sign;
}
}
The example list below is sorted with the standard List sort() method, first with compareTo and then with our extension method:
var countries = ["Belgrad", "Belçika", "Irak", "İran", "Sudan", "Şili", "Çek Cumhuriyeti", "Cezayir", "Ukrayna", "Ürdün"];
countries.sort((firstString, secondString) => firstString.compareTo(secondString));
print(countries); // prints in a wrong order: [Belgrad, Belçika, Cezayir, Irak, Sudan, Ukrayna, Çek Cumhuriyeti, Ürdün, İran, Şili]
countries.sort((firstString, secondString) => firstString.turkishCompareTo(secondString));
print(countries); // prints in a correct Turkish order: [Belçika, Belgrad, Cezayir, Çek Cumhuriyeti, Irak, İran, Sudan, Şili, Ukrayna, Ürdün]