NumPy mask: dimension loss

Question

I am struggling with boolean masks in NumPy. I have a 3D tensor and I am trying to select only some specific cells.

More precisely, I create a 3D tensor:

import numpy as np

array = np.repeat(np.arange(15).reshape(3, 5)[None, :], 3, axis=0)
# array([[[ 0,  1,  2,  3,  4],
#         [ 5,  6,  7,  8,  9],
#         [10, 11, 12, 13, 14]],
#
#        [[ 0,  1,  2,  3,  4],
#         [ 5,  6,  7,  8,  9],
#         [10, 11, 12, 13, 14]],
#
#        [[ 0,  1,  2,  3,  4],
#         [ 5,  6,  7,  8,  9],
#         [10, 11, 12, 13, 14]]])

I want to remove the first row of the first matrix, the second row of the second matrix, and so on, so that I get:

# array([[[ 5,  6,  7,  8,  9],
#         [10, 11, 12, 13, 14]],
#
#        [[ 0,  1,  2,  3,  4],
#         [10, 11, 12, 13, 14]],
#
#        [[ 0,  1,  2,  3,  4],
#         [ 5,  6,  7,  8,  9]]])

To be precise, I want to do it with code of this form:

array[mask]

(For performance reasons, I don't want to reshape, select indexes, or use mask.array.)

I created a mask:

mask = [[False, True, True],
        [True, False, True],
        [True, True, False]]

But the result loses a dimension:

array[mask]
# array([[ 5,  6,  7,  8,  9],
#        [10, 11, 12, 13, 14],
#        [ 0,  1,  2,  3,  4],
#        [10, 11, 12, 13, 14],
#        [ 0,  1,  2,  3,  4],
#        [ 5,  6,  7,  8,  9]])

Thank you very much for the help!

Answer 1

Score: 1

Unfortunately, I don't think it's possible to achieve what you want using boolean masks. The behavior you're seeing is described in numpy's indexing guide. In particular:

> In general if an index includes a Boolean array, the result will be identical to inserting obj.nonzero() into the same position and using the integer array indexing mechanism described above. x[ind_1, boolean_array, ind_2] is equivalent to x[(ind_1,) + boolean_array.nonzero() + (ind_2,)].

In your case, np.nonzero(mask) produces two 1-D arrays (the row and column positions of the True entries). This means your mask indexes the first two dimensions of array, and those two dimensions are replaced by a single one, since the two np.nonzero(mask) results broadcast to a 1-D shape.
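
The equivalence quoted above can be checked directly. The following is a minimal sketch, assuming the intended mask is the one that drops row i of matrix i (built here with np.eye purely for illustration; the variable names rows, cols, via_mask and via_nonzero are not from the original post):

```python
import numpy as np

array = np.repeat(np.arange(15).reshape(3, 5)[None, :], 3, axis=0)

# Assumed mask: True everywhere except on the diagonal,
# i.e. drop row i of matrix i.
mask = ~np.eye(3, dtype=bool)

# np.nonzero returns one 1-D index array per dimension of the mask.
rows, cols = np.nonzero(mask)

via_mask = array[mask]           # boolean indexing
via_nonzero = array[rows, cols]  # equivalent integer-array indexing

print(rows.shape, cols.shape)                 # (6,) (6,)
print(via_mask.shape)                         # (6, 5)
print(np.array_equal(via_mask, via_nonzero))  # True
```

Because both index arrays are 1-D with six entries, the two leading axes of shape (3, 3) collapse into one axis of length 6, which is the dimension loss described in the question.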

You could achieve what you want by indexing the first two dimensions of the data array with two 2-d arrays, but I'm not sure whether that falls under your definition of not wanting to "select indexes". Here is an example:

import numpy as np

array = np.repeat(np.arange(15).reshape(3,5)[None,:],3, axis = 0)

# Tuple of two index lists with shapes (3, 1) and (3, 2), which broadcast to a (3, 2) subspace
idx = [[0], [1], [2]], [[1, 2], [0, 2], [0, 1]]

out = array[idx]

out:

array([[[ 5,  6,  7,  8,  9],
        [10, 11, 12, 13, 14]],

       [[ 0,  1,  2,  3,  4],
        [10, 11, 12, 13, 14]],

       [[ 0,  1,  2,  3,  4],
        [ 5,  6,  7,  8,  9]]])
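
The hard-coded index lists above can also be derived from a boolean mask, so the approach generalizes to other sizes. This is only a sketch under the assumption that every row of the mask keeps the same number of entries (otherwise the result would be ragged and the reshape would fail); the names n, keep, row_idx and col_idx are introduced here for illustration. The last line shows the reshape alternative, which the question wanted to avoid but which gives the same result:

```python
import numpy as np

n = 3
array = np.repeat(np.arange(15).reshape(n, 5)[None, :], n, axis=0)

# keep[i, j] is False where row j of matrix i should be dropped.
keep = ~np.eye(n, dtype=bool)

# Each row of `keep` has exactly n - 1 True entries, so the column
# positions of the True entries reshape cleanly to (n, n - 1).
col_idx = np.nonzero(keep)[1].reshape(n, n - 1)
row_idx = np.arange(n)[:, None]  # shape (n, 1), broadcasts against col_idx

out = array[row_idx, col_idx]
print(out.shape)  # (3, 2, 5)

# Alternative: apply the mask, then restore the collapsed dimension.
out_reshaped = array[keep].reshape(n, n - 1, -1)
print(np.array_equal(out, out_reshaped))  # True
```

Reshaping a freshly indexed, contiguous result does not copy data, so the second form is typically cheap; whether it is acceptable depends on the constraint stated in the question.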

huangapple
  • Posted on March 3, 2023 at 22:12:19
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75628161.html