NumPy mask: dimension loss
Question
I'm struggling with masks in NumPy. I have a 3D tensor and I'm trying to select only some specific cells.
More precisely, I create a 3D tensor:
import numpy as np
array = np.repeat(np.arange(15).reshape(3, 5)[None, :], 3, axis=0)
# array([[[ 0,  1,  2,  3,  4],
#         [ 5,  6,  7,  8,  9],
#         [10, 11, 12, 13, 14]],
#        [[ 0,  1,  2,  3,  4],
#         [ 5,  6,  7,  8,  9],
#         [10, 11, 12, 13, 14]],
#        [[ 0,  1,  2,  3,  4],
#         [ 5,  6,  7,  8,  9],
#         [10, 11, 12, 13, 14]]])
I want to remove the first row of the first matrix, the second row of the second matrix, and so on, in order to get:
# array([[[ 5,  6,  7,  8,  9],
#         [10, 11, 12, 13, 14]],
#        [[ 0,  1,  2,  3,  4],
#         [10, 11, 12, 13, 14]],
#        [[ 0,  1,  2,  3,  4],
#         [ 5,  6,  7,  8,  9]]])
To be precise, I want to do this with code of the form:
array[mask]
(For performance reasons, I'd like to avoid reshaping or selecting indexes and mask.array.)
I created a mask:
mask = [[False, True, True],
        [True, False, True],
        [True, True, False]]
But indexing with it loses a dimension:
array[mask]
# array([[ 5,  6,  7,  8,  9],
#        [10, 11, 12, 13, 14],
#        [ 0,  1,  2,  3,  4],
#        [10, 11, 12, 13, 14],
#        [ 0,  1,  2,  3,  4],
#        [ 5,  6,  7,  8,  9]])
Thank you very much for the help!
Answer 1
Score: 1
Unfortunately, I don't think it's possible to achieve what you want using boolean masks. The behavior you're seeing is described in NumPy's indexing guide. In particular:
> In general if an index includes a Boolean array, the result will be identical to inserting obj.nonzero() into the same position and using the integer array indexing mechanism described above. x[ind_1, boolean_array, ind_2] is equivalent to x[(ind_1,) + boolean_array.nonzero() + (ind_2,)].
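In your case, np.nonzero(mask) produces two 1-D arrays (the row and column indices of the True entries). Your mask therefore indexes the first two dimensions of array, and those two dimensions are replaced by a single one of length 6, because the two np.nonzero(mask) results broadcast to a 1-D shape. That is why the result has shape (6, 5) instead of (3, 2, 5).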
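The equivalence quoted above can be checked directly. The snippet below is only a minimal sketch: it assumes the intended mask is False on the diagonal (drop row i of matrix i) and builds it with np.eye, which is a choice made for this illustration and not part of the question.

```python
import numpy as np

array = np.repeat(np.arange(15).reshape(3, 5)[None, :], 3, axis=0)

# Assumed mask: False on the diagonal, i.e. drop row i of matrix i.
mask = ~np.eye(3, dtype=bool)

# np.nonzero returns one 1-D integer array per dimension of the mask.
rows, cols = np.nonzero(mask)

via_mask = array[mask]           # boolean indexing
via_nonzero = array[rows, cols]  # equivalent integer array indexing

print(rows.shape, cols.shape)    # one entry per True value in the mask
print(via_mask.shape)            # first two axes collapsed into one
print(np.array_equal(via_mask, via_nonzero))
```

Both index arrays are one-dimensional, so the two leading axes of array collapse into a single axis whose length is the number of True entries.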
You could achieve what you want by indexing the first two dimensions of array with two 2-D index arrays, but I'm not sure whether that falls under your definition of not wanting to "select indexes". Here is an example:
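import numpy as np
array = np.repeat(np.arange(15).reshape(3, 5)[None, :], 3, axis=0)
# Tuple of two index arrays with shapes (3, 1) and (3, 2).
# They broadcast to a (3, 2) index grid: the first picks the matrix,
# the second picks which rows of that matrix to keep.
idx = [[0], [1], [2]], [[1, 2], [0, 2], [0, 1]]
out = array[idx]  # shape (3, 2, 5)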
out:
array([[[ 5,  6,  7,  8,  9],
        [10, 11, 12, 13, 14]],
       [[ 0,  1,  2,  3,  4],
        [10, 11, 12, 13, 14]],
       [[ 0,  1,  2,  3,  4],
        [ 5,  6,  7,  8,  9]]])
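If you would rather start from a boolean mask than write the indices by hand, the index arrays can probably be derived from it. This is a rough sketch, not a NumPy feature: index_from_mask is a hypothetical helper named here for illustration, the mask is again assumed to be False on the diagonal, and the approach only works when every row of the mask keeps the same number of entries.

```python
import numpy as np

array = np.repeat(np.arange(15).reshape(3, 5)[None, :], 3, axis=0)
mask = ~np.eye(3, dtype=bool)  # assumed mask: drop row i of matrix i


def index_from_mask(mask):
    """Hypothetical helper: convert a 2-D boolean mask into two
    broadcastable integer index arrays for the first two axes."""
    mask = np.asarray(mask, dtype=bool)
    counts = mask.sum(axis=1)
    if not (counts == counts[0]).all():
        raise ValueError("every row of the mask must keep the same number of entries")
    n = mask.shape[0]
    # Column indices of the True entries, in row-major order, one row per matrix.
    kept = np.nonzero(mask)[1].reshape(n, -1)  # shape (n, k)
    first = np.arange(n)[:, None]              # shape (n, 1)
    return first, kept


idx = index_from_mask(mask)
out = array[idx]
print(out.shape)
```

Only the small index array is reshaped here, not the data itself. The result should match what array[mask].reshape(3, 2, 5) would give for the same mask.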