Re-sort string based on precedence of separator
Question
I have a string that encodes an expression, for example "a,b;X1" or "e&r1". In total there are 3 possible separators between values: "," ";" and "&", where ";" has the lowest precedence.
The strings "a,b;X1" and "b,a;X1" mean the same thing, but to be able to detect that, I want to reorder each string predictably so that the two compare equal. In essence, "b,a;X1" must be "sorted" to become "a,b;X1". This is a rather simple example; the expression can be more complex.
Precedence matters: "a,b;X1" is not the same as "a;b,X1".
In general I would need to split the string into groups by precedence, sort the groups, and merge everything back together, but it is unclear to me how to achieve this.
So far I have:
example = "b,a;X1"
ls = example.split(';')
ls2 = [x.split(",") for x in ls]
ls3 = [[y.split("&") for y in x] for x in ls2]
ls3.sort()
print(ls3)
# [[['X1']], [['b'], ['a']]]
Sorting doesn't work yet, since "a" should come before "b", and I'm not sure how to "stitch" the result back together again.
For clarification:
"," means OR
"&" means AND (high precedence)
";" means AND (low precedence)
"a,b;X1" therefore means (a OR b) AND X1
"b,a;X1" therefore means (b OR a) AND X1, i.e. the same
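To make the stated semantics concrete, here is a minimal sketch that evaluates such a string against a truth assignment. The function name evaluate and the env dictionary are hypothetical and not part of the question; the sketch only assumes the three separators and the precedence described above.

```python
def evaluate(expr, env):
    """Evaluate expr given a dict mapping each name to True or False."""
    return all(                                          # ';' = AND (low precedence)
        any(                                             # ',' = OR
            all(env[name] for name in conj.split("&"))   # '&' = AND (high precedence)
            for conj in group.split(",")
        )
        for group in expr.split(";")
    )


env = {"a": False, "b": True, "X1": True}
print(evaluate("a,b;X1", env))  # True:  (a OR b) AND X1
print(evaluate("b,a;X1", env))  # True:  same meaning, different order
print(evaluate("a;b,X1", env))  # False: a AND (b OR X1)
```

The first two strings agree for every assignment, while the third differs for this one, which is why reordering is only safe within a single precedence level.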
Answer 1
Score: 2
You could use split and sort as you did (combined with join), but it should happen at every level of operator:
def normalize(s):
    return ";".join(sorted(
        ",".join(sorted(
            "&".join(sorted(factor.split("&")))
            for factor in term.split(",")
        ))
        for term in s.split(";")
    ))

example = "b,a&z&x;x1;m&f,q&c"
print(normalize(example))  # a&x&z,b;c&q,f&m;x1
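As a usage sketch, the normalized form can be used to compare the strings from the question. The helper name equivalent is hypothetical; normalize is repeated here only so the snippet runs on its own.

```python
def normalize(s):
    return ";".join(sorted(
        ",".join(sorted(
            "&".join(sorted(factor.split("&")))
            for factor in term.split(",")
        ))
        for term in s.split(";")
    ))


def equivalent(x, y):
    """Hypothetical helper: two strings match if their normal forms match."""
    return normalize(x) == normalize(y)


print(equivalent("b,a;X1", "a,b;X1"))  # True
print(equivalent("a,b;X1", "a;b,X1"))  # False
print(normalize("b,a;X1"))             # X1;a,b
```

Note that the canonical form here is "X1;a,b" rather than "a,b;X1", because Python's default string ordering puts uppercase letters before lowercase ones. That does not affect comparison, since both inputs normalize the same way.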
Answer 2
Score: 1
I would suggest writing a function which recursively sorts each list. Here's an example of how to do that:
# Separators ordered from lowest to highest precedence
delim_precedence = (';', ',', '&')

def recursive_split(s, delim):
    if isinstance(s, str):
        return s.split(delim)
    elif isinstance(s, list):
        return [recursive_split(i, delim) for i in s]
    else:
        raise Exception("unknown type")

def split_by_precedence(s):
    # Each pass adds one level of nesting
    for delim in delim_precedence:
        s = recursive_split(s, delim)
    return s

def recursive_sort(s):
    if isinstance(s, str):
        return s
    elif isinstance(s, list):
        return sorted([recursive_sort(i) for i in s])
    else:
        raise Exception("unknown type")

def rejoin(s, delims=delim_precedence):
    if len(delims) == 0:
        return s
    return delims[0].join(rejoin(i, delims[1:]) for i in s)

def canonicalize(s):
    return rejoin(recursive_sort(split_by_precedence(s)))

example = "b,a;X1"  # the example string from the question
print(canonicalize(example))  # X1;a,b
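The recursive sort relies on the fact that Python compares lists element by element, so nested lists of strings can be passed to sorted directly. The sketch below uses a condensed version of recursive_sort (without the type check) on the nested structure that the question's own attempt produced.

```python
def recursive_sort(s):
    # Condensed variant: strings are leaves, anything else is sorted after
    # its children have been sorted.
    if isinstance(s, str):
        return s
    return sorted(recursive_sort(i) for i in s)


nested = [[["b"], ["a"]], [["X1"]]]   # "b,a;X1" split by ';', then ',', then '&'
print(recursive_sort(nested))         # [[['X1']], [['a'], ['b']]]
print([["X1"]] < [["a"], ["b"]])      # True: lists compare lexicographically
```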
Answer 3
Score: 1
@trincot's answer works but is a maintenance nightmare since it hard-codes the separators for nested splits and joins: if a separator needs to change, both the corresponding split and join must be modified, and if an additional separator is needed, an additional layer of split and join must be nested.
A more general approach would be to use a recursive function to split by and join with one separator at a time, and pass the rest of the separators to the next level of recursive call:
def sort(string, separators):
    sep, *rest = separators
    pieces = string.split(sep)
    return sep.join(sorted((sort(p, rest) for p in pieces) if rest else pieces))
so that (using @trincot's test case):
example = "b,a&z&x;x1;m&f,q&c"
separators = ';,&'
print(sort(example, separators))
would output:
a&x&z,b;c&q,f&m;x1
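To illustrate the maintenance point, the sketch below calls the same function with a different separator list. The extra "|" separator is purely hypothetical (it is not part of the question) and is assumed to have the lowest precedence, so it is listed first.

```python
def sort(string, separators):
    sep, *rest = separators
    pieces = string.split(sep)
    return sep.join(sorted((sort(p, rest) for p in pieces) if rest else pieces))


# Hypothetical fourth separator '|' added without touching the function
print(sort("y;x|b,a", "|;,&"))  # a,b|x;y

# A single separator works as well
print(sort("b&a", "&"))         # a&b
```

Only the separators argument changes between these calls; the order of its characters defines the precedence from lowest to highest.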