SciPy equivalent of np.minimum.outer
I need to compute the outer product of two (large) sparse vectors, but taking the minimum of the values instead of multiplying them. I need an efficient way to do this.
The built-in point-wise multiplication takes 30 ms, while my (naive) code takes 5 minutes:
%%timeit -r 1
# c1, c2 are sparse row vectors (shape 1 x n); iterate over their nonzero columns
minl = lil_matrix((c1.shape[1], c2.shape[1]))
for i1 in c1.nonzero()[1]:
    for i2 in c2.nonzero()[1]:
        minl[i1, i2] = min(c1[0, i1], c2[0, i2])
Answer 1
Score: 1
Maybe I should insist on a [mcve], but let's guess at a reasonable sample:
In [107]: from scipy import sparse
In [108]: c1 = sparse.random(1,100,.2,'csr')
In [109]: c2 = sparse.random(1,100,.2,'csr')
In [110]: def foo(c1,c2):
     ...:     minl = sparse.lil_matrix((c1.shape[1], c2.shape[1]))
     ...:     for i1 in c1.nonzero()[1]:
     ...:         for i2 in c2.nonzero()[1]:
     ...:             minl[i1, i2] = min(c1[0, i1], c2[0, i2])
     ...:     return minl
     ...:
In [111]: LL1 = foo(c1,c2)
In [112]: LL1
Out[112]:
<100x100 sparse matrix of type '<class 'numpy.float64'>'
with 400 stored elements in List of Lists format>
dense with broadcasting
For dense arrays a simple element-wise minimum for matching array lengths is:
In [118]: np.minimum(c1.A,c2.A).shape
Out[118]: (1, 100)
and with broadcasting we can do an 'outer' minimum:
In [119]: np.minimum(c1.T.A,c2.A).shape
Out[119]: (100, 100)
and that matches your iterative answer:
In [122]: np.allclose(np.minimum(c1.T.A,c2.A),LL1.A)
Out[122]: True
And comparative times:
In [123]: timeit LL1 = foo(c1,c2)
37.8 ms ± 240 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
In [124]: timeit np.minimum(c1.T.A,c2.A)
257 µs ± 3.21 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
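For context, the np.minimum.outer in the title is just the ufunc outer method, and for dense arrays it is exactly this broadcast (a small sketch with made-up values):

```python
import numpy as np

a = np.array([0.2, 0.0, 0.7])
b = np.array([0.5, 0.1])
# The ufunc's outer method pairs every element of a with every element of b,
# which is the same as broadcasting a column against a row.
out = np.minimum.outer(a, b)
assert out.shape == (3, 2)
assert np.allclose(out, np.minimum(a[:, None], b[None, :]))
```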
But sparse doesn't do broadcasting.
I suggested building a coo matrix in the loop. Another possibility is to 'replicate' c1.T and c2 so we can do the minimum.
expanded sparse matrices
I can expand the matrices by multiplying by appropriate sparse 'ones' matrices:
In [135]: (c1.T*sparse.csr_matrix(np.ones((1,100)))).minimum(sparse.csr_matrix(np.ones((100,1))*c2))
Out[135]:
<100x100 sparse matrix of type '<class 'numpy.float64'>'
with 400 stored elements in Compressed Sparse Column format>
In [136]: np.allclose(_.A,LL1.A)
Out[136]: True
That times at
1.91 ms ± 15.2 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
and packaged neatly:
def foo1(c1,c2):
    On1 = sparse.csr_matrix(np.ones((1,c2.shape[1])))
    On2 = sparse.csr_matrix(np.ones((c1.shape[1],1)))
    cc1 = c1.T*On1
    cc2 = On2*c2
    return cc1.minimum(cc2)
List iteration
Here's a "pure Python" iterative version:
def foo0(c1,c2):
    data, row, col = [],[],[]
    cc = c1.tocoo(); d1 = cc.data.tolist(); l1 = cc.col.tolist()
    cc = c2.tocoo(); d2 = cc.data.tolist(); l2 = cc.col.tolist()
    for i,i1 in enumerate(l1):
        for j,i2 in enumerate(l2):
            data.append(min(d1[i],d2[j]))
            row.append(i1)
            col.append(i2)
    minl = sparse.coo_matrix((data, (row,col)), shape=(c1.shape[1],c2.shape[1]))
    return minl
In [170]: foo0(c1,c2)
Out[170]:
<100x100 sparse matrix of type '<class 'numpy.float64'>'
with 400 stored elements in COOrdinate format>
In [171]: np.allclose(_.A,LL1.A)
Out[171]: True
In [172]: timeit foo0(c1,c2)
576 µs ± 11 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
We don't have to iterate on c2; instead we can use array methods:
def foo02(c1,c2):
    data, row, col = [],[],[]
    cc = c1.tocoo(); d1 = cc.data.tolist(); l1 = cc.col.tolist()
    cc = c2.tocoo(); d2 = cc.data; l2 = cc.col
    for i,i1 in enumerate(l1):
        data.append(np.minimum(d1[i],d2))
        row.append([i1]*l2.shape[0])
        col.append(l2)
    data = np.hstack(data)
    row = np.hstack(row)
    col = np.hstack(col)
    minl = sparse.coo_matrix((data, (row,col)), shape=(c1.shape[1],c2.shape[1]))
    return minl
This has a modest time improvement on foo0, and may scale better.
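Going one step further (my sketch, not part of the original answer): the remaining Python loop can be eliminated entirely with np.repeat and np.tile on the COO data; foo03 is a hypothetical name:

```python
import numpy as np
from scipy import sparse

def foo03(c1, c2):
    # Pair every stored value of c1 with every stored value of c2:
    # repeat each c1 entry nnz(c2) times, tile c2's entries nnz(c1) times,
    # then take the element-wise minimum of the paired data.
    a, b = c1.tocoo(), c2.tocoo()
    data = np.minimum(np.repeat(a.data, b.nnz), np.tile(b.data, a.nnz))
    row = np.repeat(a.col, b.nnz)
    col = np.tile(b.col, a.nnz)
    return sparse.coo_matrix((data, (row, col)),
                             shape=(c1.shape[1], c2.shape[1]))
```

For nonnegative data this agrees with the dense broadcast np.minimum(c1.T.A, c2.A) above.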
Your suggested change
def foo11(c1,c2):
    On1 = c2.copy(); On1.data[:] = 1
    On2 = c1.T.copy(); On2.data[:] = 1
    cc1 = c1.T*On1
    cc2 = On2*c2
    return cc1.minimum(cc2)
1.21 ms ± 14 µs per loop
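As a sanity check (my addition, not part of the original answer; it assumes nonnegative data such as sparse.random produces), here is foo11 restated self-contained and compared against the dense broadcast result:

```python
import numpy as np
from scipy import sparse

def foo11(c1, c2):
    # The 'ones' copies carry only the sparsity patterns, so the two
    # products expand each vector onto the joint nonzero grid,
    # after which minimum() works element-wise.
    On1 = c2.copy(); On1.data[:] = 1
    On2 = c1.T.copy(); On2.data[:] = 1
    return (c1.T * On1).minimum(On2 * c2)

c1 = sparse.random(1, 100, 0.2, 'csr', random_state=0)
c2 = sparse.random(1, 100, 0.2, 'csr', random_state=1)
# With nonnegative data, min(x, 0) == 0 wherever either vector is zero,
# so the sparse result matches dense broadcasting.
assert np.allclose(foo11(c1, c2).toarray(),
                   np.minimum(c1.T.toarray(), c2.toarray()))
```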