NumPy: subtracting rows of a 2D array from another 2D array without a for loop
Question
I know that subtracting a row vector v of shape (1, 3072) from a 2D array A of shape (5000, 3072) works: when A and v have the same number of columns, v is broadcast across the rows of A. What does not work directly is subtracting a stack of row vectors V, where each row of V has to be subtracted from the whole of A.
I can't figure out how to subtract V's rows one by one from A without using a for loop.
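import numpy as np

def compute_distances_one_loop(V):
    # A is assumed to be a global array with the same number of columns as V
    num_test = V.shape[0]
    num_train = A.shape[0]
    dists = np.zeros((num_test, num_train))
    for i in range(num_test):
        # sqrt(square(x)) is |x| elementwise, so each entry is a sum of absolute differences
        initial = np.sqrt(np.square(A - V[i, :]))
        dists[i, :] = initial.sum(axis=1)
    return dists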
The code above is the semi-vectorized form of this problem. How do I get rid of that for loop?
For example, given
matrix_e = np.ones((3, 3)) * 6
matrix_f = np.array([[1, 2, 3], [4, 5, 6]])
how do I get matrix_g, of shape (6, 3), without using a for loop?
matrix_g = np.array([[5, 4, 3], [5, 4, 3], [5, 4, 3], [2, 1, 0], [2, 1, 0], [2, 1, 0]])
Answer 1
Score: 1
If I understood your question correctly, you can build a big array (with the same shape as A) by concatenating copies of V:
height_A, height_V = A.shape[0], V.shape[0]
occurrences, remainder = divmod(height_A, height_V)
mask = [V for i in range(occurrences)] + [V[:remainder]]
big_V = np.concatenate(mask)
Now you can safely compute A - big_V.
(I separated the steps to make them clearer, but you can easily combine them into a single statement:
big_V = np.concatenate([V for i in range(A.shape[0] // V.shape[0])] + [V[:A.shape[0] % V.shape[0]]])
)
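As a hedged alternative for this tiling step, `np.resize` repeats an array's contents cyclically to fill a requested shape, which should give the same `big_V` whenever `A` and `V` have the same number of columns. The small arrays below are made up for illustration. Note that this pairs row i of `A` with row i modulo `len(V)` of `V`; it does not subtract every row of `V` from every row of `A`.

```python
import numpy as np

# Illustrative data (not from the original post): 5 rows in A, 2 rows in V
A = np.ones((5, 3)) * 6
V = np.array([[1, 2, 3], [4, 5, 6]])

# Explicit construction, as in the answer
occurrences, remainder = divmod(A.shape[0], V.shape[0])
big_V_concat = np.concatenate([V] * occurrences + [V[:remainder]])

# np.resize cycles through V's elements until the target shape is filled
big_V = np.resize(V, A.shape)

print(np.array_equal(big_V, big_V_concat))
print(A - big_V)
```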
Edit: I now understand better what you need: subtract EACH row of V from the whole of A. This is possible by adding a third dimension to both arrays so that broadcasting applies, as in the following picture, where A2 - V2 is represented by the array of green panes.
A2 = np.expand_dims(A, axis=0)  # from shape (5000, 3072) to (1, 5000, 3072)
V2 = np.expand_dims(V, axis=1)  # from shape (500, 3072) to (500, 1, 3072)
print(A2 - V2)  # broadcasting makes the resulting shape (500, 5000, 3072)
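For example, with:
A = np.ones((3, 3)) * 6
V = np.array([[1, 2, 3], [4, 5, 6]])
A2 = np.expand_dims(A, axis=0)  # shape (1, 3, 3)
V2 = np.expand_dims(V, axis=1)  # shape (2, 1, 3)
print(A2 - V2)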
# array([[[5., 4., 3.],
# [5., 4., 3.],
# [5., 4., 3.]],
#
# [[2., 1., 0.],
# [2., 1., 0.],
# [2., 1., 0.]]])
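If the goal is the flat (6, 3) `matrix_g` from the question rather than a 3-D stack, one possible sketch is to broadcast in the same way and then collapse the first two axes with `reshape`. The names `matrix_e`, `matrix_f`, and `matrix_g` come from the question; indexing with `None` is used here as an equivalent of `np.expand_dims`, and `diff` is just an illustrative name.

```python
import numpy as np

matrix_e = np.ones((3, 3)) * 6
matrix_f = np.array([[1, 2, 3], [4, 5, 6]])

# Shapes (1, 3, 3) and (2, 1, 3) broadcast to (2, 3, 3):
# diff[i] is matrix_e minus row i of matrix_f
diff = matrix_e[None, :, :] - matrix_f[:, None, :]

# Collapse the first two axes: (2, 3, 3) -> (6, 3)
matrix_g = diff.reshape(-1, matrix_e.shape[1])
print(matrix_g.shape)
print(matrix_g)
```

The rows come out grouped by row of `matrix_f`: first all of `matrix_e` minus `[1, 2, 3]`, then all of `matrix_e` minus `[4, 5, 6]`.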
You can then compute the array of Euclidean distances between the rows of V and the rows of A:
D = np.sqrt(np.square(A2 - V2).sum(axis=2))
# array([[7.07106781, 7.07106781, 7.07106781],
# [2.23606798, 2.23606798, 2.23606798]])
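One caveat on size: at the shapes mentioned above, the broadcast difference has 500 × 5000 × 3072 elements, roughly 61 GB in float64, which may not fit in memory. A minimal sketch of a lighter alternative, assuming Euclidean distance is what is wanted, uses the identity ||a - v||² = ||a||² + ||v||² - 2 a·v so that only the (num_test, num_train) result is materialized. The helper name `euclidean_distances` is hypothetical. The loop in the question actually sums absolute differences; `np.abs(A2 - V2).sum(axis=2)` would reproduce that, but it still builds the large intermediate.

```python
import numpy as np

def euclidean_distances(V, A):
    """Hypothetical helper: pairwise Euclidean distances, shape (len(V), len(A))."""
    v_sq = np.square(V).sum(axis=1)[:, None]  # squared row norms of V, shape (num_test, 1)
    a_sq = np.square(A).sum(axis=1)[None, :]  # squared row norms of A, shape (1, num_train)
    cross = V @ A.T                           # dot products, shape (num_test, num_train)
    sq = v_sq + a_sq - 2.0 * cross
    # Rounding error can leave tiny negative values; clip before the square root
    return np.sqrt(np.maximum(sq, 0.0))

A = np.ones((3, 3)) * 6
V = np.array([[1, 2, 3], [4, 5, 6]])
D = euclidean_distances(V, A)
print(D)
```

On the small example this should agree with the broadcasting version up to floating-point rounding.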