Pythonic way to group values from a list based on values from another list
Question
I have 2 lists:
List_A = [1, 25, 40]
List_B = [2, 19, 23, 26, 30, 32, 34, 36]
I want to generate a list of lists such that I group values in list B by determining if they are in between values in list A. So in this example, list B would be grouped into:
[[2,19,23], [26,30,32,34,36]]
Is there a clean way in Python to achieve this without multiple nested for loops?
I tried a messy doubly nested loop structure, but was not pleased with how clunky and hard to read it was.
Answer 1
Score: 0
This is the simplest way I can think of to code it.
result = []
for start, end in zip(List_A, List_A[1:]):
    result.append([i for i in List_B if start <= i < end])
This is O(N×M), where N is the length of List_B and M is the length of List_A, so it is not very efficient for large lists.
You could make it more efficient by sorting List_B (I assume List_A is already sorted) and stepping through both of them together, but it will be more complicated.
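As a rough illustration of the "step through both lists together" idea, here is a minimal sketch. The function name group_between is made up for this example, and it assumes List_A is already sorted and that values outside the range of List_A should be dropped, as in the loop above.

```python
def group_between(bounds, values):
    """Group values into half-open intervals [bounds[i], bounds[i + 1])."""
    if len(bounds) < 2:
        return []                     # no intervals to fill
    values = sorted(values)           # bounds is assumed to be sorted already
    result = [[] for _ in range(len(bounds) - 1)]
    bin_index = 0
    for v in values:
        if v < bounds[0]:
            continue                  # below the first boundary: dropped
        # Advance to the interval whose upper boundary is greater than v
        while bin_index < len(result) and v >= bounds[bin_index + 1]:
            bin_index += 1
        if bin_index == len(result):
            break                     # v and everything after it is past the last boundary
        result[bin_index].append(v)
    return result

List_A = [1, 25, 40]
List_B = [2, 19, 23, 26, 30, 32, 34, 36]
print(group_between(List_A, List_B))  # [[2, 19, 23], [26, 30, 32, 34, 36]]
```

The scan itself is linear in the combined length of the two lists; the sort of the values dominates if they are not already in order. Note that each group comes out in sorted order, which may differ from the original order of an unsorted List_B.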
Answer 2
Score: 0
Group the values of List_B according to the index they would have if inserted into List_A. The standard library provides functionality in the bisect module to figure out (using a standard bisection algorithm) where a value would go, and functionality in the itertools module to group adjacent values in an input sequence according to a "key" function.
This looks like:
from bisect import bisect
from itertools import groupby
List_A = [1, 25, 40]
List_B = [2, 19, 23, 26, 30, 32, 34, 36]
groups = groupby(List_B, key=lambda x: bisect(List_A, x))
print([list(group) for key, group in groups])
This prints [[2, 19, 23], [26, 30, 32, 34, 36]], as requested.
bisect.bisect is an alias for bisect.bisect_right; that is, a value in List_B that is equal to a value from List_A will be put at the beginning of the later list. To put it at the end of the previous list instead, use bisect.bisect_left.
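To make the difference concrete, here is a small sketch that uses a hypothetical input containing the boundary value 25 (this input is not from the question):

```python
from bisect import bisect_left, bisect_right
from itertools import groupby

List_A = [1, 25, 40]
values = [2, 19, 25, 26, 30]   # hypothetical input; 25 is also a boundary in List_A

# bisect_right: a value equal to a boundary starts the later group
right = [list(g) for _, g in groupby(values, key=lambda x: bisect_right(List_A, x))]
# bisect_left: a value equal to a boundary ends the earlier group
left = [list(g) for _, g in groupby(values, key=lambda x: bisect_left(List_A, x))]

print(right)  # [[2, 19], [25, 26, 30]]
print(left)   # [[2, 19, 25], [26, 30]]
```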
bisect.bisect also relies on List_A being sorted, naturally.
itertools.groupby only groups adjacent values; it will make separate groups for values that belong in the same "bin" but are separated by values that belong in a different "bin". If this is an issue, sort the input first.
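The adjacency behaviour is easy to see with a hypothetical unsorted input (the list shuffled and the helper bin_of are made up for this sketch):

```python
from bisect import bisect
from itertools import groupby

List_A = [1, 25, 40]
shuffled = [2, 26, 19, 30]   # hypothetical unsorted input

def bin_of(x):
    # Index at which x would be inserted into List_A
    return bisect(List_A, x)

# Without sorting, every change of bin starts a new group
unsorted_groups = [list(g) for _, g in groupby(shuffled, key=bin_of)]
# Sorting first makes all members of a bin adjacent
sorted_groups = [list(g) for _, g in groupby(sorted(shuffled), key=bin_of)]

print(unsorted_groups)  # [[2], [26], [19], [30]]
print(sorted_groups)    # [[2, 19], [26, 30]]
```

Sorting with sorted(shuffled, key=bin_of) instead should also work and, because Python's sort is stable, keeps the original relative order within each bin.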
This will be O(N * lg M), where N is the length of List_B and M is the length of List_A. That is: finding a bin takes logarithmic time in the number of bins, and this work is repeated for each value to be binned.
This will not generate empty lists if there is a bin that should be empty; the actual indices into List_A are ignored by the list comprehension in this example.
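If empty bins do need to appear in the output, one possible variation is to use the bisect index directly instead of groupby. This is only a sketch: the function name group_with_empty_bins is made up, and it assumes that values falling outside the range of List_A should be skipped.

```python
from bisect import bisect

def group_with_empty_bins(bounds, values):
    # One list per interval between consecutive boundaries
    bins = [[] for _ in range(len(bounds) - 1)]
    for v in values:
        i = bisect(bounds, v) - 1      # index of the interval that contains v
        if 0 <= i < len(bins):         # assumption: skip values outside [bounds[0], bounds[-1])
            bins[i].append(v)
    return bins

# Hypothetical input in which nothing falls between 25 and 40
print(group_with_empty_bins([1, 25, 40, 50], [2, 19, 23, 45]))
# [[2, 19, 23], [], [45]]
```

This keeps the same O(N * lg M) cost and does not require the values to be sorted, since each value is placed by index rather than by adjacency.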