Python replace nested for loop
Question
I have the following code:
import itertools
import numpy as np

t = np.random.rand(3, 3, 3)

def foo(T):
    res = np.zeros((3, 3, 3))
    for a in range(3):
        for b in range(3):
            for c in range(3):
                idx = [a, b, c]
                combinations = list(itertools.permutations(idx, len(idx)))
                idx_arrays = tuple(np.array(idx) for idx in zip(*combinations))
                res[a, b, c] = (1.0 / len(combinations)) * np.sum(T[idx_arrays])
    return res

sol = foo(t)
The code above should be equivalent to:

res2 = np.zeros((3, 3, 3))
for a in range(3):
    for b in range(3):
        for c in range(3):
            res2[a, b, c] = (1.0 / 6) * (
                t[a, b, c]
                + t[a, c, b]
                + t[b, a, c]
                + t[b, c, a]
                + t[c, a, b]
                + t[c, b, a]
            )
To make the code faster, I would like to replace the outer for loops. Can this be done? Ideally, this should be achieved without using NumPy's einsum method.
Answer 1
Score: 1
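You can start by not repeating computations. The result is symmetric in a, b and c, so it is enough to compute each value once for a <= b <= c and write it to all six permuted positions:

def foo(t):
    # Assumes t is cubic (all three axes have the same length).
    res = np.zeros_like(t)
    A, B, C = t.shape
    for a in range(A):
        for b in range(a, B):
            for c in range(b, C):
                r = (1.0 / 6) * (
                    t[a, b, c]
                    + t[a, c, b]
                    + t[b, a, c]
                    + t[b, c, a]
                    + t[c, a, b]
                    + t[c, b, a]
                )
                # The same average belongs at every permutation of (a, b, c).
                res[a, b, c] = r
                res[a, c, b] = r
                res[b, a, c] = r
                res[b, c, a] = r
                res[c, a, b] = r
                res[c, b, a] = r
    return res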
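The three nested loops over a <= b <= c can also be written as a single loop. The following is a minimal sketch, not part of the original answer: the function name symmetrize_unique is hypothetical, and it assumes a cubic 3-D array. It uses itertools.combinations_with_replacement to enumerate the sorted index triples.

```python
import itertools
import numpy as np

def symmetrize_unique(t):
    """Hypothetical helper: average t over all index permutations,
    computing each value once per sorted triple (a <= b <= c)."""
    n = t.shape[0]  # assumes t.shape == (n, n, n)
    res = np.zeros_like(t)
    for a, b, c in itertools.combinations_with_replacement(range(n), 3):
        # All six orderings of (a, b, c); repeats are harmless here.
        perms = list(itertools.permutations((a, b, c)))
        r = sum(t[p] for p in perms) / len(perms)
        for p in perms:
            res[p] = r
    return res

rng = np.random.default_rng(0)
t = rng.random((3, 3, 3))
out = symmetrize_unique(t)
```

This is still a Python-level loop, so it mainly reduces the number of iterations rather than vectorizing the work.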
Answer 2
Score: 1
A one-liner, using np.transpose to vectorize:

np.stack([t.transpose(i) for i in itertools.permutations(np.arange(3), 3)]).mean(0)

The 3s can be changed, or replaced by parameters, if you want more than 3 dimensions:

import numpy as np
from itertools import permutations as perm

def foo(T):
    dims = len(T.shape)
    # All axes must have the same length for the permuted views to stack.
    assert np.all(np.array(T.shape) == T.shape[0])
    all_perms = np.stack([T.transpose(i) for i in perm(np.arange(dims), dims)])
    return all_perms.mean(0)
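To see that averaging over all axis permutations matches the explicit six-term formula from the question, the two can be compared directly. This is a sketch under stated assumptions: the names symmetrize and symmetrize_loops are made up for this comparison, and the input is assumed to be a cubic 3-D array.

```python
import itertools
import numpy as np

def symmetrize(T):
    """Average T over every permutation of its axes (T must be cubic)."""
    perms = list(itertools.permutations(range(T.ndim)))
    return sum(T.transpose(p) for p in perms) / len(perms)

def symmetrize_loops(T):
    """Reference version: the explicit six-term average from the question."""
    n = T.shape[0]
    res = np.zeros_like(T)
    for a, b, c in itertools.product(range(n), repeat=3):
        res[a, b, c] = (
            T[a, b, c] + T[a, c, b] + T[b, a, c]
            + T[b, c, a] + T[c, a, b] + T[c, b, a]
        ) / 6.0
    return res

rng = np.random.default_rng(0)
t = rng.random((3, 3, 3))
fast = symmetrize(t)
slow = symmetrize_loops(t)
```

Summing the transposed views instead of stacking them avoids building the intermediate array with one slice per permutation, which may matter for larger inputs.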
Answer 3
Score: 0
I don't know whether this is faster than three nested for loops. Anyway, you can try this:

import itertools
import numpy as np

t = np.random.rand(3, 3, 3)

def foo(T):
    res = np.zeros((3, 3, 3))
    # One loop over all 27 unique (a, b, c) index triples.
    for a, b, c in set(itertools.permutations((0, 0, 0, 1, 1, 1, 2, 2, 2), 3)):
        idx = [a, b, c]
        combinations = list(itertools.permutations(idx, len(idx)))
        idx_arrays = tuple(np.array(idx) for idx in zip(*combinations))
        # Index the argument T, not a freshly generated random array.
        res[a, b, c] = (1.0 / len(combinations)) * np.sum(T[idx_arrays])
    return res

sol = foo(t)
Here, instead of using 3 nested for loops, I am creating the unique permutations of 3 elements from the tuple (0, 0, 0, 1, 1, 1, 2, 2, 2).
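The set of unique 3-element permutations of that tuple is the same set of index triples that itertools.product yields. A small check, offered as a sketch rather than as part of the original answer (the variable names are illustrative):

```python
import itertools

digits = (0, 0, 0, 1, 1, 1, 2, 2, 2)

# Unique ordered triples drawn from the repeated digits.
via_permutations = set(itertools.permutations(digits, 3))

# The Cartesian product {0, 1, 2}^3, generated directly.
via_product = set(itertools.product(range(3), repeat=3))

same = via_permutations == via_product
```

Using product avoids generating and de-duplicating the 504 raw permutations, but either way the loop body still runs 27 times in Python.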
Answer 4
Score: 0
What about this solution?

import itertools
import numpy as np

def foo2(T):
    # Flattened index grids: one row per axis, one column per element.
    indices = np.indices((3, 3, 3)).reshape(3, -1)
    # Every ordering of the three index rows.
    all_permutations = np.array(list(itertools.permutations(indices, len(indices))))
    idx_arrays = tuple(np.array(idx) for idx in zip(*all_permutations))
    summed = np.sum(T[tuple(idx_arrays)], axis=0)
    res = (1.0 / len(all_permutations)) * summed
    return res.reshape(3, 3, 3)
In short, you reshape the output of np.indices to generate the idx_arrays. Then you use advanced indexing to gather the values from the original array, sum them, and divide by the number of permutations. Finally, you reshape the result back to a 3x3x3 array.
This produces the same result as the original code but more efficiently, because the per-element Python loop is replaced by a single vectorized indexing operation.
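The intermediate shapes make the indexing step easier to follow. The sketch below retraces the same steps on a small deterministic array; the variable names and the use of transpose in place of zip are choices made for this illustration, not the answer's own code.

```python
import itertools
import numpy as np

T = np.arange(27, dtype=float).reshape(3, 3, 3)

# One row of flat indices per axis: shape (3, 27).
indices = np.indices(T.shape).reshape(T.ndim, -1)

# All 6 orderings of the 3 index rows: shape (6, 3, 27).
perms = np.array(list(itertools.permutations(indices)))

# Regroup by axis: a tuple of three (6, 27) index arrays.
idx_arrays = tuple(perms.transpose(1, 0, 2))

# Advanced indexing gathers 6 permuted values for each of the 27 elements.
gathered = T[idx_arrays]

# Average over the permutations and restore the cube.
res = gathered.mean(axis=0).reshape(T.shape)
```

Each column of gathered holds the six values that the question's explicit formula adds up for one (a, b, c) position.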
On my machine:

%timeit foo(t)
261 µs ± 4.36 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)

%timeit foo2(t)
36.7 µs ± 296 ns per loop (mean ± std. dev. of 7 runs, 10,000 loops each)