Shuffling sequences in a list
Question
I have a list containing sequences of various lengths. The pattern of a sequence is as follows:
x_k, y_k, ..., x_k_i, y_k_i, ... z_k
For example, a list having 4 sequences with lengths: 3, 3, 5, and 7 is as follows:
input_list = ['x_1', 'y_1', 'z_1',
'x_2', 'y_2', 'z_2',
'x_3_1', 'y_3_1', 'x_3_2', 'y_3_2', 'z_3',
'x_4_1', 'y_4_1', 'x_4_2', 'y_4_2', 'x_4_3', 'y_4_3', 'z_4']
I need to shuffle the list so that the order of the sequences is shuffled, but the entries within each sequence are not.
For example, a candidate output would be as follows:
shuffled_list = ['x_3_1', 'y_3_1', 'x_3_2', 'y_3_2', 'z_3',
'x_1', 'y_1', 'z_1',
'x_4_1', 'y_4_1', 'x_4_2', 'y_4_2', 'x_4_3', 'y_4_3', 'z_4',
'x_2', 'y_2', 'z_2']
One way to achieve this would be to store each sequence as a separate list inside a nested list, then repeatedly remove a randomly chosen list (i.e., a sequence) from the nested list and append its elements to the final shuffled list.
Is there a more efficient way to achieve the same result?
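For reference, the approach described above could be sketched roughly as follows. The function name shuffle_by_popping and the nested list are made up for illustration, and the sketch assumes the sequences have already been split into separate lists:

```python
import random

def shuffle_by_popping(sequences, rng=random):
    """Baseline from the question: repeatedly remove a randomly chosen sequence."""
    remaining = [list(seq) for seq in sequences]  # copy so the caller's data is untouched
    shuffled = []
    while remaining:
        # list.pop(i) shifts the following elements, so each removal is O(n)
        seq = remaining.pop(rng.randrange(len(remaining)))
        shuffled.extend(seq)
    return shuffled

# Hypothetical nested list holding three of the sequences
nested = [['x_1', 'y_1', 'z_1'],
          ['x_2', 'y_2', 'z_2'],
          ['x_3_1', 'y_3_1', 'x_3_2', 'y_3_2', 'z_3']]
result = shuffle_by_popping(nested)
print(result)
```

The repeated pop from the middle of a list is the part I suspect is wasteful.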
Answer 1
Score: 1
You can group your items on the first index, with the use of itertools.groupby:
from itertools import groupby
input_list = ['x_1', 'y_1', 'z_1',
'x_2', 'y_2', 'z_2',
'x_3_1', 'y_3_1', 'x_3_2', 'y_3_2', 'z_3',
'x_4_1', 'y_4_1', 'x_4_2', 'y_4_2', 'x_4_3', 'y_4_3', 'z_4']
def key(x):
    # The sequence identifier is the part after the first underscore
    return x.split('_')[1]
group_dic = {k: list(g) for k, g in groupby(input_list, key=key)}
{'1': ['x_1', 'y_1', 'z_1'],
'2': ['x_2', 'y_2', 'z_2'],
'3': ['x_3_1', 'y_3_1', 'x_3_2', 'y_3_2', 'z_3'],
'4': ['x_4_1', 'y_4_1', 'x_4_2', 'y_4_2', 'x_4_3', 'y_4_3', 'z_4']}
Of course you don't HAVE to make a dictionary; you can use the results however you want.
Now on to the shuffling part:
from random import shuffle
order = list(group_dic.keys())
shuffle(order)
output_list = [entry for i in order for entry in group_dic[i]]
Example output:
['x_1', 'y_1', 'z_1', 'x_4_1', 'y_4_1', 'x_4_2', 'y_4_2', 'x_4_3', 'y_4_3', 'z_4', 'x_2', 'y_2', 'z_2', 'x_3_1', 'y_3_1', 'x_3_2', 'y_3_2', 'z_3']
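One caveat worth keeping in mind: groupby only merges adjacent items with equal keys, so the dictionary version assumes each sequence is contiguous in input_list, as it is in the question. A small sketch with a made-up list (scattered is hypothetical) shows what happens otherwise:

```python
from itertools import groupby

def key(x):
    return x.split('_')[1]

# Hypothetical input where sequence 1 is split into two non-adjacent runs
scattered = ['x_1', 'y_1', 'x_2', 'y_2', 'z_2', 'z_1']

# groupby yields three runs here, two of them with key '1'
runs = [(k, list(g)) for k, g in groupby(scattered, key=key)]
print(runs)

# In the dict, the later run overwrites the earlier one, so 'x_1' and 'y_1' are lost
group_dic = {k: list(g) for k, g in groupby(scattered, key=key)}
print(group_dic)
```

If the input might not be contiguous, collecting the items into a dict of lists by hand (or sorting by key first) would avoid the silent data loss.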
Edit: following on @Timus' idea, if you have a list of lengths of the subgroups, you can make a list of corresponding ranges, shuffle it, then build the output_list from the shuffled ranges; for example:
from itertools import accumulate, pairwise
from random import shuffle
lengths = [3, 3, 5, 7]
ranges = [slice(*pair) for pair in pairwise(accumulate(lengths, initial=0))]
shuffle(ranges)
# Loop variable renamed so it does not shadow the built-in range
output_list = sum((input_list[r] for r in ranges), start=[])
# ['x_1', 'y_1', 'z_1', 'x_4_1', 'y_4_1', 'x_4_2', 'y_4_2', 'x_4_3', 'y_4_3', 'z_4', 'x_2', 'y_2', 'z_2', 'x_3_1', 'y_3_1', 'x_3_2', 'y_3_2', 'z_3']
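Note that pairwise requires Python 3.10 or later, and summing lists copies the growing result at every step. A hedged variant that avoids both, assuming the same lengths list (boundaries and slices are names picked here for illustration):

```python
from itertools import accumulate, chain
from random import shuffle

input_list = ['x_1', 'y_1', 'z_1',
              'x_2', 'y_2', 'z_2',
              'x_3_1', 'y_3_1', 'x_3_2', 'y_3_2', 'z_3',
              'x_4_1', 'y_4_1', 'x_4_2', 'y_4_2', 'x_4_3', 'y_4_3', 'z_4']
lengths = [3, 3, 5, 7]

# Cumulative offsets 0, 3, 6, 11, 18 (the initial= argument needs Python 3.8+)
boundaries = list(accumulate(lengths, initial=0))
# Pair each offset with the next one to get one slice per sequence
slices = [slice(start, stop) for start, stop in zip(boundaries, boundaries[1:])]
shuffle(slices)
# chain.from_iterable concatenates the slices in a single pass
output_list = list(chain.from_iterable(input_list[s] for s in slices))
print(output_list)
```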
Answer 2
Score: 0
If the format of the entries is strictly as shown in the example, then, following @Swifty's comment, the list can be shuffled using itertools.groupby:
import itertools
import random
input_list = ['x_1', 'y_1', 'z_1',
'x_2', 'y_2', 'z_2',
'x_3_1', 'y_3_1', 'x_3_2', 'y_3_2', 'z_3',
'x_4_1', 'y_4_1', 'x_4_2', 'y_4_2', 'x_4_3', 'y_4_3', 'z_4']
# Grouping the entries by their sequence
sequences = [list(g) for k, g in itertools.groupby(input_list, lambda x: x.split('_')[1] if '_' in x else x)]
random.shuffle(sequences)
# Flattening the shuffled list
shuffled_list = list(itertools.chain(*sequences))
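If this is needed in more than one place, the same steps could be wrapped in a small helper. This is only a sketch: shuffle_sequences, its rng parameter and the data list are names invented here, and passing a seeded random.Random is just one way to make runs reproducible for testing:

```python
import itertools
import random

def shuffle_sequences(items, key, rng=None):
    """Return a new list with contiguous groups reordered and inner order kept."""
    rng = rng or random.Random()
    groups = [list(g) for _, g in itertools.groupby(items, key)]
    rng.shuffle(groups)
    return list(itertools.chain.from_iterable(groups))

def seq_id(s):
    # Sequence identifier: the part after the first underscore
    return s.split('_')[1]

data = ['x_1', 'y_1', 'z_1',
        'x_2', 'y_2', 'z_2',
        'x_3_1', 'y_3_1', 'x_3_2', 'y_3_2', 'z_3']

# The same seed gives the same order, which makes the helper easy to test
first = shuffle_sequences(data, seq_id, random.Random(42))
second = shuffle_sequences(data, seq_id, random.Random(42))
print(first == second)
```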
Answer 3
Score: 0
You could do it with a sort that maps the sequence identifiers to random values. Because Python's sort is stable, the items with the same sequence identifiers will remain in the same relative order:
input_list = ['x_1', 'y_1', 'z_1',
'x_2', 'y_2', 'z_2',
'x_3_1', 'y_3_1', 'x_3_2', 'y_3_2', 'z_3',
'x_4_1', 'y_4_1', 'x_4_2', 'y_4_2', 'x_4_3', 'y_4_3', 'z_4']
import random
seqId = {s.split("_")[1]:random.random() for s in input_list}
input_list.sort(key=lambda s: seqId[s.split("_")[1]])
print(input_list)
['x_4_1', 'y_4_1', 'x_4_2', 'y_4_2', 'x_4_3', 'y_4_3', 'z_4',
'x_2', 'y_2', 'z_2',
'x_1', 'y_1', 'z_1',
'x_3_1', 'y_3_1', 'x_3_2', 'y_3_2', 'z_3']
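A non-mutating variant of the same idea, as a rough sketch: shuffled_by_stable_sort, weights and data are illustrative names, and the seeded random.Random is only there to make the example repeatable.

```python
import random

def shuffled_by_stable_sort(items, rng=None):
    """Return a new list ordered by one random weight per sequence identifier."""
    rng = rng or random.Random()
    weights = {}
    for s in items:
        seq = s.split('_')[1]
        if seq not in weights:
            weights[seq] = rng.random()  # one random sort key per sequence
    # sorted() is stable, so items sharing a weight keep their relative order
    return sorted(items, key=lambda s: weights[s.split('_')[1]])

data = ['x_1', 'y_1', 'z_1',
        'x_2', 'y_2', 'z_2',
        'x_3_1', 'y_3_1', 'x_3_2', 'y_3_2', 'z_3']
result = shuffled_by_stable_sort(data, random.Random(0))
print(result)
```

Unlike list.sort, this leaves the original list untouched.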
To avoid the overhead of sorting, you could create a list of ranges corresponding to the sequences and rebuild the list after shuffling the range list:
*ends, = {s.split("_")[1]:i for i,s in enumerate(input_list,1)}.values()
ranges = [(s,e) for s,e in zip([0]+ends,ends)]
random.shuffle(ranges)
input_list = [x for s, e in ranges for x in input_list[s:e]]
print(input_list)
['x_2', 'y_2', 'z_2',
'x_1', 'y_1', 'z_1',
'x_3_1', 'y_3_1', 'x_3_2', 'y_3_2', 'z_3',
'x_4_1', 'y_4_1', 'x_4_2', 'y_4_2', 'x_4_3', 'y_4_3', 'z_4']
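To make the dictionary trick less opaque, here is the same computation with the intermediate values printed. The rebuild step at the end is an assumed way of concatenating the slices, and rebuilt is a name chosen for this sketch:

```python
import random

input_list = ['x_1', 'y_1', 'z_1',
              'x_2', 'y_2', 'z_2',
              'x_3_1', 'y_3_1', 'x_3_2', 'y_3_2', 'z_3',
              'x_4_1', 'y_4_1', 'x_4_2', 'y_4_2', 'x_4_3', 'y_4_3', 'z_4']

# Later items overwrite earlier ones in the dict, so each identifier ends up
# mapped to the 1-based position of its last item, which is the slice end.
ends = list({s.split("_")[1]: i for i, s in enumerate(input_list, 1)}.values())
print(ends)    # [3, 6, 11, 18]

# Each sequence starts where the previous one ended
ranges = list(zip([0] + ends, ends))
print(ranges)  # [(0, 3), (3, 6), (6, 11), (11, 18)]

random.shuffle(ranges)
# Assumed rebuild step: concatenate the slices in their shuffled order
rebuilt = [item for start, end in ranges for item in input_list[start:end]]
print(rebuilt)
```

Like groupby, this relies on each sequence being contiguous in the input.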