How to efficiently generate all convex combinations (meaning they sum to 1.0) in Python / NumPy?
Question
I need to generate all 3-dimensional arrays whose values sum up to 1.0, i.e., they are convex combinations.
Let's assume that each element can be one of `[0.0, 0.2, 0.4, 0.6, 0.8, 1.0]`. Hence, combinations like `[0.0,0.4,0.6]` or `[0.2,0.6,0.2]` are correct, as they sum up to 1.0, but a combination like `[1.0,0.4,0.2]` would be incorrect as it sums up to 1.6.
From [this question and answer](https://stackoverflow.com/questions/1208118/using-numpy-to-build-an-array-of-all-combinations-of-two-arrays), I know how to generate all combinations of given arrays. Hence, I could do:
```python
import numpy as np

ratios = [0.0, 0.2, 0.4, 0.6, 0.8, 1.0]
result = np.stack(np.meshgrid(ratios, ratios, ratios), -1).reshape(-1, 3)
```
and then I could simply filter only those that do sum up to 1.0:
```python
result[np.isclose(np.sum(result, axis=1), 1.0)]
```
However, this quickly becomes computationally intensive for high-dimensional scenarios that may easily have billions of combinations, of which only a tiny portion (less than 1/1000) satisfies the convex condition.
There is also a similar problem of finding all possible combinations of numbers to reach a given sum. However, in that scenario, the dimension is not fixed, hence `[1.0]` or `[0.2, 0.2, 0.2, 0.2, 0.2]` would both be valid solutions.
Is there a more efficient way, which assumes a fixed sum and fixed dimension?
Answer 1
Score: 3
# Branch & Bound
Another (almost academic) solution is to use branch&bound.
Branch&bound is a way to explore all possible combinations recursively, and keep the correct ones. But because it is done recursively, those possible combinations are explored in an implicit tree, giving the opportunity to discard whole subtrees without the need to explore them.
The ideal situation for branch&bound is when you can say "there are 2 kinds of solutions. Those, with xxx, and those without. And among those solutions, there are 2 kinds of solutions. Those with yyy, and those without. Etc.". Here "there are 2 kinds of solutions. Those with 0.0, and those without 0.0. And among the first, there are 2 kinds, those with another 0.0, and those with only one. Among the second, 2 kinds, those with 0.2, and those without. Etc."
In your case, the code could be:
```python
eps = 1e-5
ratios = [0.0, 0.2, 0.4, 0.6, 0.8, 1.0]

def convbb(a, t=1.0, nc=3, p=[]):
    if nc == 0:
        # a full combination has been built: keep it if the target is reached
        if t < eps:
            return [p]
        return []
    if len(a) == 0:
        return []
    sols = []
    # branch 1: solutions that use a[0] as the next value (a[0] may be reused)
    if a[0] <= t + eps:
        sols = convbb(a, t - a[0], nc - 1, p + [a[0]])
    # branch 2: solutions that never use a[0] again
    return sols + convbb(a[1:], t, nc, p)

convbb(ratios[::-1])
# [[1.0, 0.0, 0.0], [0.8, 0.2, 0.0], [0.6, 0.4, 0.0], [0.6, 0.2, 0.2], [0.4, 0.4, 0.2]]
```
The advantage of this over the itertools method is not obvious for your example. On my machine, it is twice as slow.
It explores fewer combinations (the itertools method explores all of them; this one obviously never explores any solution starting with `[1.0, 0.2]`, without even bothering to choose a 3rd value). But because it is pure Python (plus recursion, which is comfortable but not optimal; plus my usage of list concatenation, which should be replaced by a static list; plus other non-optimal coding), it is slower.
Its advantage shows for a bigger number of ratios. For example, for 150 values, this code is twice as fast.
And, more importantly, it shows for a bigger number of values in the result. With 3, the advantage of cutting the exploration short is never obvious, since I need at least 2 values in a partial combination before I can see that there is no need to choose a 3rd. So, all I can cut is the choice of the 3rd value.
But if you use this, for example, to find combinations of 6 values, then the itertools method becomes explosively expensive, while this one remains manageable.
On my computer, it is 5 seconds for this solution vs 900 seconds for the itertools one. For 7 values, this solution takes less than 10 seconds, while the itertools one stands no chance of finishing before the end of the day (whatever timezone you live in :D).
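For reference, the 6-value run above corresponds to using the function's `nc` parameter; the exact call is not shown in the original, so this is my reading of the signature:

```python
# hypothetical call: all 6-element combinations of the ratios that sum to 1.0
sols6 = convbb(ratios[::-1], t=1.0, nc=6)
```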
# Evenly spaced ratios
But while I was typing that, you said in a comment that the ratios are evenly spaced. That changes everything!
Because if they are evenly spaced, from 0 to 1, then we know all the working solutions in advance.
They are:
```python
[(ratios[i], ratios[j], ratios[-i-j-1]) for i in range(len(ratios)) for j in range(i, (len(ratios)-i+1)//2)]
```
Because in that case, your problem is equivalent to finding 3 integers in `range(len(ratios))` whose total is `len(ratios)-1`.
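As a quick sanity check (my addition, not part of the original answer), the comprehension above reproduces exactly the five solutions found earlier, assuming `ratios` is the evenly spaced list from the question:

```python
ratios = [0.0, 0.2, 0.4, 0.6, 0.8, 1.0]
sols = [(ratios[i], ratios[j], ratios[-i-j-1])
        for i in range(len(ratios))
        for j in range(i, (len(ratios) - i + 1) // 2)]
print(sols)
# [(0.0, 0.0, 1.0), (0.0, 0.2, 0.8), (0.0, 0.4, 0.6), (0.2, 0.2, 0.6), (0.2, 0.4, 0.4)]
```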
For more than nc=3 numbers, the same logic applies, just with more recursion.
For an unknown nc, a recursive function can also be written, but far simpler than my previous one. It is just about finding integers a₀ ≤ a₁ ≤ … ≤ aₙ such that a₀ + a₁ + … + aₙ = N.
```python
def recursFind(N, nc=3, i=0, t=0, p=[]):
    if nc == 1:
        # No need to explore: the last value is forced to be N-t
        if N - t >= i:
            yield p + [N - t]
        else:
            pass  # p+[N-t] is a solution, but it has already been given in another order
    elif i * nc + t > N:
        return  # impossible to find nc values >= i (values < i exist, but would only lead to already-given solutions)
    else:
        for j in range(i, N):
            yield from recursFind(N, nc - 1, j, t + j, p + [j])

nc = 3
[[ratios[i] for i in idx] for idx in recursFind(len(ratios) - 1, nc)]
# [[0.0, 0.0, 1.0], [0.0, 0.2, 0.8], [0.0, 0.4, 0.6], [0.2, 0.2, 0.6], [0.2, 0.4, 0.4]]
```
That's one possible implementation. It is really fast compared to the others (again, because it is not the same problem, once you said the ratios are evenly spaced). And it could be faster, because I was lazy (I could have computed an upper bound instead of `N` for the line `for j in range(i, N)`, as I did for the case nc=3, removing the need for the cut `i*nc+t > N`). But it is fast enough to make my point.
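As an aside, the tighter bound the author alludes to might look like the following (my sketch, not the author's code): with `nc` values still to place, each at least `j`, we need `j * nc <= N - t`, so the loop can stop at `(N - t) // nc`, which also makes the `i*nc+t > N` cut redundant:

```python
def recursFindBounded(N, nc=3, i=0, t=0, p=[]):
    # hypothetical variant of recursFind with a tighter loop bound (my sketch)
    if nc == 1:
        if N - t >= i:  # last value is forced to N-t; keep only ordered solutions
            yield p + [N - t]
        return
    # each of the nc remaining values is >= j, so j can be at most (N-t)//nc
    for j in range(i, (N - t) // nc + 1):
        yield from recursFindBounded(N, nc - 1, j, t + j, p + [j])

assert list(recursFindBounded(5)) == [[0, 0, 5], [0, 1, 4], [0, 2, 3], [1, 1, 3], [1, 2, 2]]
```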
# tl;dr
If you have just a reasonable set of ratios, and are searching for no more than 3 of them, then the itertools version is the way to go.
If you have more ratios, and, more importantly, may want to find combinations of more than 3 of them, then the branch & bound solution is the way to go.
Unless, as you implied in a comment, the ratios are evenly spaced from 0 to 1. In which case, it is a much simpler problem, and my last solution (or even better ones) is what you are looking for.
Answer 2
Score: 1
Have you taken a look at the itertools module? Something like:
```python
>>> import itertools, math
>>> [c for c in itertools.combinations_with_replacement([0.0, 0.2, 0.4, 0.6, 0.8, 1.0], r=3) if math.isclose(sum(c), 1.0)]
[(0.0, 0.0, 1.0), (0.0, 0.2, 0.8), (0.0, 0.4, 0.6), (0.2, 0.2, 0.6), (0.2, 0.4, 0.4)]
```
would also work.
From quick tests with `timeit`, it looks like using `itertools` is roughly 8 times faster:
```
python3 -m timeit -s 'import itertools' '[c for c in itertools.combinations_with_replacement([0.0, 0.2, 0.4, 0.6, 0.8, 1.0], r=3) if sum(c) == 1.0]'
50000 loops, best of 5: 5.89 usec per loop
```
than `numpy`:
```
python3 -m timeit -s 'import numpy as np; ratios = [0.0, 0.2, 0.4, 0.6, 0.8, 1.0]' 'result = np.stack(np.meshgrid(ratios, ratios, ratios), -1).reshape(-1, 3); result[np.sum(result, axis=1) == 1.0]'
5000 loops, best of 5: 48 usec per loop
```
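One caveat worth adding (my note, not part of the original answer): the timed one-liner uses exact comparison `sum(c) == 1.0` instead of `math.isclose`, and exact float comparison silently drops one valid combination here:

```python
import itertools, math

ratios = [0.0, 0.2, 0.4, 0.6, 0.8, 1.0]
for c in itertools.combinations_with_replacement(ratios, r=3):
    if math.isclose(sum(c), 1.0) and sum(c) != 1.0:
        # (0.2, 0.4, 0.4) sums to 1.0000000000000002 in binary floating point,
        # so the == 1.0 test misses it
        print(c, repr(sum(c)))
```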
Answer 3
Score: 1
Since the values apparently are evenly spaced and start at 0, this is equivalent to a problem in integers: we want to find three ordered values from `range(6)` which sum to `5`. Aside from avoiding floating-point inaccuracy, this offers a significant performance improvement in my testing (even without considering the overhead of `math.isclose`). I will show code for this problem, and trust that it can be adapted as needed.
For reference, integer implementations of the Numpy and `itertools` approaches:
```python
import numpy as np
from itertools import combinations_with_replacement

def convex_itertools(count, total):
    # all non-decreasing candidate tuples, then filter by sum
    candidates = combinations_with_replacement(range(total + 1), r=count)
    return [c for c in candidates if sum(c) == total]

def convex_numpy(count, total):
    # full Cartesian product via meshgrid, then filter by sum
    mesh = np.meshgrid(*([range(total + 1)] * count))
    candidates = np.stack(mesh, -1).reshape(-1, count)
    return candidates[np.sum(candidates, axis=1) == total]
```
On my machine:
```python
>>> from timeit import timeit
>>> timeit('convex_itertools(3, 5)', globals=globals(), number=10000)
0.08309134596493095
>>> timeit('convex_numpy(3, 5)', globals=globals(), number=10000)
0.8170730160200037
```
The primary problem here is algorithmic. Using `meshgrid` produces unordered results, and as such produces many more results than desired. But even using `itertools.combinations_with_replacement`, there is a lot of filtering to do. For the sample problem size, already 56 ordered candidates (216 unordered) are generated, which will filter down to only 5 (21, respectively).
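The candidate counts quoted above can be verified with a little combinatorics (my addition):

```python
from math import comb
from itertools import combinations_with_replacement

print(comb(6 + 3 - 1, 3))      # 56 non-decreasing ("ordered") candidate triples
print(6 ** 3)                  # 216 arbitrary ("unordered") candidate triples
print(comb(5 + 3 - 1, 3 - 1))  # 21 arbitrary triples summing to 5 (weak compositions)
print(sum(sum(c) == 5 for c in combinations_with_replacement(range(6), 3)))  # 5 non-decreasing solutions
```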
Thus, it would be preferable to generate the desired results directly, rather than via filtering. For this, I will use a recursive generator:
```python
def convex_manual(count, total):
    if count == 1:
        # base case: the single remaining value must be the whole total
        if total >= 0:
            yield [total]
        return
    # the smallest value cannot exceed the average, total // count
    for first in range((total // count) + 1):
        # distribute the excess over the remaining count-1 values,
        # each of which must be at least `first`
        remainder = total - (first * count)
        for result in convex_manual(count - 1, remainder):
            yield [first] + [first + r for r in result]
```
Our base case is when the combinations consist of a single value: there is either a single solution (just use the full `total`, when it's non-negative and thus a legal value), or no solutions (previously chosen values in the recursion already add up to more than the desired total - though this should be impossible from how the recursion is constructed, except with bad initial values). For combinations of more than one value, we consider the possible first values to use (it cannot exceed the average value), recursively find the ways to distribute the excess within our limit (after accounting for the fact that each number must be at least as large as the first, or else the output will not be ordered), and translate those recursive findings into results for the current step.
Let's test it:
```python
>>> list(convex_manual(3, 5))
[[0, 0, 5], [0, 1, 4], [0, 2, 3], [1, 1, 3], [1, 2, 2]]
>>> list(convex_manual(3, 10))
[[0, 0, 10], [0, 1, 9], [0, 2, 8], [0, 3, 7], [0, 4, 6], [0, 5, 5], [1, 1, 8], [1, 2, 7], [1, 3, 6], [1, 4, 5], [2, 2, 6], [2, 3, 5], [2, 4, 4], [3, 3, 4]]
```
Even for the small input example, and despite the overhead of assembling many lists and using a recursive generator, this is faster:
```python
>>> timeit('list(convex_manual(3, 5))', globals=globals(), number=10000)
0.07173079502535984
```
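Mapping the integer solutions back to the original ratio space is a step the answer leaves to the reader; since the ratios are evenly spaced, plain indexing works (my sketch):

```python
ratios = [0.0, 0.2, 0.4, 0.6, 0.8, 1.0]
combos = [[ratios[i] for i in sol] for sol in convex_manual(3, len(ratios) - 1)]
# [[0.0, 0.0, 1.0], [0.0, 0.2, 0.8], [0.0, 0.4, 0.6], [0.2, 0.2, 0.6], [0.2, 0.4, 0.4]]
```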
Of course, considering how the recursion works gives some insight into improvements for the other approaches. In particular: there is only ever one possibility for the last value, so we should only generate candidates for the other values and then see if they lead to a legal combination. Thus:
```python
import numpy as np
from itertools import combinations_with_replacement

def convex_itertools_opt(count, total):
    # generate only the first count-1 values; the last one is forced
    candidates = combinations_with_replacement(range(total + 1), r=count-1)
    return [
        c + (remainder,) for c in candidates
        if (remainder := total - sum(c)) >= c[-1]
    ]

def convex_numpy_opt(count, total):
    mesh = np.meshgrid(*([range(total + 1)] * (count-1)))
    candidates = np.stack(mesh, -1).reshape(-1, (count-1))
    # Reference for the technique here:
    # https://stackoverflow.com/questions/8486294
    candidates = np.c_[candidates, total - np.sum(candidates, axis=1)]
    return candidates[candidates[:, -1] >= 0]
```
Which does get better results for the `itertools` approach:
```python
>>> timeit('convex_itertools_opt(3, 5)', globals=globals(), number=10000)
0.053549989010207355
```
but of course will break down for larger problem sizes:
```python
>>> timeit('convex_itertools_opt(5, 10)', globals=globals(), number=10000)
1.7618601340218447
>>> timeit('list(convex_manual(5, 10))', globals=globals(), number=10000)
0.6665365709923208
```
and incurs too much overhead in Numpy:
```python
>>> timeit('convex_numpy_opt(3, 5)', globals=globals(), number=10000)
0.9056216909666546
```
Contrary to what one might expect, trying to use lists and list comprehensions at every step (rather than using a recursive generator and explicitly creating a list from the top-level result) seems to have little effect on performance, regardless of the problem size. I tried using Numpy tools for gathering the results and performance got much, much worse and the code was also much harder to understand.