Shuffling sequences in a list

Question

I have a list containing sequences of various lengths. The pattern of a sequence is as follows:

x_k, y_k, ..., x_k_i, y_k_i, ... z_k

For example, a list having 4 sequences with lengths: 3, 3, 5, and 7 is as follows:

input_list = ['x_1', 'y_1', 'z_1',
              'x_2', 'y_2', 'z_2',
              'x_3_1', 'y_3_1', 'x_3_2', 'y_3_2', 'z_3',
              'x_4_1', 'y_4_1', 'x_4_2', 'y_4_2', 'x_4_3', 'y_4_3', 'z_4']

I need to shuffle the list so that the order of the sequences is shuffled, but the entries within each sequence keep their original order.

For example, a candidate output would be as follows:

shuffled_list = ['x_3_1', 'y_3_1', 'x_3_2', 'y_3_2', 'z_3',
                 'x_1', 'y_1', 'z_1',
                 'x_4_1', 'y_4_1', 'x_4_2', 'y_4_2', 'x_4_3', 'y_4_3', 'z_4',
                 'x_2', 'y_2', 'z_2']

One way to achieve this would be to store each sequence as a separate list inside a nested list, then repeatedly remove a randomly chosen list (i.e., a sequence) from the nested list and append its elements to the final shuffled list.
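For reference, the nested-list approach described above might look like the following minimal sketch. It assumes the sequences are already available as separate lists; the helper name `shuffle_nested` and the `rng` parameter are made up for illustration.

```python
import random

def shuffle_nested(sequences, rng=random):
    """Return a flat list with the sequences in random order (hypothetical helper)."""
    remaining = [list(seq) for seq in sequences]  # copy so the input is not modified
    shuffled = []
    while remaining:
        # Remove one randomly chosen sequence and append its entries in order
        seq = remaining.pop(rng.randrange(len(remaining)))
        shuffled.extend(seq)
    return shuffled

sequences = [['x_1', 'y_1', 'z_1'],
             ['x_2', 'y_2', 'z_2'],
             ['x_3_1', 'y_3_1', 'x_3_2', 'y_3_2', 'z_3']]
result = shuffle_nested(sequences)
print(result)  # order of the sequences varies between runs
```

Each `pop` from the middle of a list shifts the remaining items, so this is probably fine for a handful of sequences but does more work than a single `random.shuffle` over the nested list.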

Is there a more efficient way to achieve the same?

Answer 1

Score: 1

You can group your items by their sequence identifier (the number after the first underscore) using itertools.groupby:

from itertools import groupby

input_list = ['x_1', 'y_1', 'z_1', 
              'x_2', 'y_2', 'z_2', 
              'x_3_1', 'y_3_1', 'x_3_2', 'y_3_2', 'z_3', 
              'x_4_1', 'y_4_1', 'x_4_2', 'y_4_2', 'x_4_3', 'y_4_3', 'z_4']

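def key(x):
    # 'x_3_2' -> '3': the sequence identifier is the field after the first underscore
    return x.split('_')[1]

# groupby only merges consecutive items with equal keys,
# which is fine here because each sequence is contiguous
group_dic = {k: list(g) for k, g in groupby(input_list, key=key)}

# group_dic is now:
# {'1': ['x_1', 'y_1', 'z_1'],
#  '2': ['x_2', 'y_2', 'z_2'],
#  '3': ['x_3_1', 'y_3_1', 'x_3_2', 'y_3_2', 'z_3'],
#  '4': ['x_4_1', 'y_4_1', 'x_4_2', 'y_4_2', 'x_4_3', 'y_4_3', 'z_4']}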

Of course you don't HAVE to make a dictionary; you can use the results however you want.

Now on to the shuffling part:

from random import shuffle

order = list(group_dic.keys())
shuffle(order)

output_list = [entry for i in order for entry in group_dic[i]]

Example output:

['x_1', 'y_1', 'z_1', 'x_4_1', 'y_4_1', 'x_4_2', 'y_4_2', 'x_4_3', 'y_4_3', 'z_4', 'x_2', 'y_2', 'z_2', 'x_3_1', 'y_3_1', 'x_3_2', 'y_3_2', 'z_3']
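Because the output is random, it can be handy to check that a result really is a valid shuffle of whole sequences. The following is only a sketch: `seq_id` and `is_sequence_shuffle` are hypothetical names, and it assumes the identifier is the field after the first underscore and that each sequence is contiguous in the original list.

```python
from itertools import groupby

def seq_id(entry):
    # 'x_3_2' -> '3'
    return entry.split('_')[1]

def is_sequence_shuffle(original, candidate):
    """True if candidate holds the same contiguous sequences as original, in any order."""
    def groups(items):
        return [tuple(g) for _, g in groupby(items, key=seq_id)]
    return sorted(groups(original)) == sorted(groups(candidate))

original = ['x_1', 'y_1', 'z_1', 'x_2', 'y_2', 'z_2']
ok = is_sequence_shuffle(original, ['x_2', 'y_2', 'z_2', 'x_1', 'y_1', 'z_1'])
bad = is_sequence_shuffle(original, ['y_1', 'x_1', 'z_1', 'x_2', 'y_2', 'z_2'])
print(ok)   # True: sequences reordered, entries untouched
print(bad)  # False: entries inside sequence 1 were reordered
```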

Edit: following on @Timus' idea, if you have a list of lengths of the subgroups, you can make a list of corresponding ranges, shuffle it, then build the output_list from the shuffled ranges; for example:

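from itertools import accumulate, chain, pairwise  # pairwise requires Python 3.10+
from random import shuffle

lengths = [3, 3, 5, 7]
# Cumulative offsets 0, 3, 6, 11, 18 are paired into slices [0:3], [3:6], [6:11], [11:18]
slices = [slice(*pair) for pair in pairwise(accumulate(lengths, initial=0))]
shuffle(slices)
# Concatenate the chunks in their shuffled order
output_list = list(chain.from_iterable(input_list[s] for s in slices))

# Example output (varies between runs):
# ['x_1', 'y_1', 'z_1', 'x_4_1', 'y_4_1', 'x_4_2', 'y_4_2', 'x_4_3', 'y_4_3', 'z_4', 'x_2', 'y_2', 'z_2', 'x_3_1', 'y_3_1', 'x_3_2', 'y_3_2', 'z_3']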
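If the list of lengths is not already at hand, one possible way to obtain it is to count the entries in each group. This is a sketch under the same assumption as before, namely that the field after the first underscore identifies the sequence.

```python
from itertools import groupby

input_list = ['x_1', 'y_1', 'z_1',
              'x_2', 'y_2', 'z_2',
              'x_3_1', 'y_3_1', 'x_3_2', 'y_3_2', 'z_3',
              'x_4_1', 'y_4_1', 'x_4_2', 'y_4_2', 'x_4_3', 'y_4_3', 'z_4']

def key(x):
    return x.split('_')[1]

# Count the entries of each consecutive group without building intermediate lists
lengths = [sum(1 for _ in group) for _, group in groupby(input_list, key=key)]
print(lengths)  # [3, 3, 5, 7]
```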

Answer 2

Score: 0

If the format of entries is strictly as shown in the example, then following @Swifty's comment, the list can be shuffled using itertools.groupby.

import itertools
import random

input_list = ['x_1', 'y_1', 'z_1', 
              'x_2', 'y_2', 'z_2', 
              'x_3_1', 'y_3_1', 'x_3_2', 'y_3_2', 'z_3', 
              'x_4_1', 'y_4_1', 'x_4_2', 'y_4_2', 'x_4_3', 'y_4_3', 'z_4']

# Grouping the entries by their sequence
sequences = [list(g) for k, g in itertools.groupby(input_list, lambda x: x.split('_')[1] if '_' in x else x)]

random.shuffle(sequences)

# Flattening the shuffled list
shuffled_list = list(itertools.chain(*sequences))
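One detail worth keeping in mind is that itertools.groupby only groups consecutive entries with equal keys, so the approach above relies on each sequence being contiguous in input_list. A small sketch with made-up data (the helper name `seq_id` is hypothetical) shows the difference:

```python
from itertools import groupby

def seq_id(entry):
    # 'x_1' -> '1'
    return entry.split('_')[1]

contiguous = ['x_1', 'y_1', 'x_2', 'y_2']
interleaved = ['x_1', 'x_2', 'y_1', 'y_2']

grouped_contiguous = [list(g) for _, g in groupby(contiguous, key=seq_id)]
grouped_interleaved = [list(g) for _, g in groupby(interleaved, key=seq_id)]

print(grouped_contiguous)   # [['x_1', 'y_1'], ['x_2', 'y_2']]
print(grouped_interleaved)  # [['x_1'], ['x_2'], ['y_1'], ['y_2']]
```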

Answer 3

Score: 0

You could do it with a sort that maps the sequence identifiers to random values. Because Python's sort is stable, the items with the same sequence identifiers will remain in the same relative order:

input_list = ['x_1', 'y_1', 'z_1', 
              'x_2', 'y_2', 'z_2', 
              'x_3_1', 'y_3_1', 'x_3_2', 'y_3_2', 'z_3', 
              'x_4_1', 'y_4_1', 'x_4_2', 'y_4_2', 'x_4_3', 'y_4_3', 'z_4']

import random

# One random sort key per sequence identifier
seqId = {s.split("_")[1]: random.random() for s in input_list}
input_list.sort(key=lambda s: seqId[s.split("_")[1]])

print(input_list)

['x_4_1', 'y_4_1', 'x_4_2', 'y_4_2', 'x_4_3', 'y_4_3', 'z_4', 'x_2', 'y_2', 'z_2', 'x_1', 'y_1', 'z_1', 'x_3_1', 'y_3_1', 'x_3_2', 'y_3_2', 'z_3']
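To see why stability matters here, the following sketch replaces the random values with fixed, made-up keys so the result is predictable. Entries that share a key keep their original relative order, which is what preserves the order inside each sequence.

```python
items = ['x_1', 'y_1', 'z_1',
         'x_2', 'y_2', 'z_2',
         'x_3_1', 'y_3_1', 'z_3']

# Fixed keys standing in for random.random() values (illustrative only)
fixed_keys = {'1': 0.7, '2': 0.1, '3': 0.4}

result = sorted(items, key=lambda s: fixed_keys[s.split('_')[1]])
print(result)
# ['x_2', 'y_2', 'z_2', 'x_3_1', 'y_3_1', 'z_3', 'x_1', 'y_1', 'z_1']
```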

To avoid the overhead of sorting, you could create a list of ranges corresponding to the sequences and rebuild the list after shuffling the range list:

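# Last 1-based position seen for each sequence identifier, i.e. where each sequence ends
*ends, = {s.split("_")[1]: i for i, s in enumerate(input_list, 1)}.values()
ranges = [(s, e) for s, e in zip([0] + ends, ends)]
random.shuffle(ranges)
# Rebuild the list from the shuffled ranges (this line is truncated in the source; reconstructed from the description above)
input_list = [input_list[i] for s, e in ranges for i in range(s, e)]

print(input_list)

['x_2', 'y_2', 'z_2', 'x_1', 'y_1', 'z_1', 'x_3_1', 'y_3_1', 'x_3_2', 'y_3_2', 'z_3', 'x_4_1', 'y_4_1', 'x_4_2', 'y_4_2', 'x_4_3', 'y_4_3', 'z_4']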
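The range-based idea could also be wrapped in a function that leaves its argument untouched. This is a sketch, not the answer's exact code: the name `shuffle_by_ranges` and the optional `rng` parameter are assumptions added for illustration, and it relies on each sequence being contiguous.

```python
import random

def shuffle_by_ranges(items, rng=random):
    """Return a new list with whole sequences reordered (hypothetical helper)."""
    # Last 1-based position seen for each identifier marks the end of that sequence;
    # dicts preserve insertion order, so the ends come out in ascending order
    ends = list({s.split('_')[1]: i for i, s in enumerate(items, 1)}.values())
    ranges = list(zip([0] + ends, ends))
    rng.shuffle(ranges)
    return [x for start, end in ranges for x in items[start:end]]

input_list = ['x_1', 'y_1', 'z_1',
              'x_2', 'y_2', 'z_2',
              'x_3_1', 'y_3_1', 'x_3_2', 'y_3_2', 'z_3']
result = shuffle_by_ranges(input_list, random.Random(1))
print(result)
```

Passing a seeded `random.Random` instance makes a run repeatable, which can help when testing.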

huangapple
  • Posted on May 7, 2023, 16:56:55
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76192974.html