Filtering NumPy arrays based on the index of a certain value

Question

I have a NumPy array like this:

array([[ 1, 17, 33, ..., 28,  9, 22],
       [ 3, 11,  1, ..., 25, 45, 14],
       [ 3, 11,  1, ..., 21, 23,  5],
       ...,
       [20,  6, 27, ..., 43, 15, 14],
       [27,  6, 39, ..., 37, 17,  2],
       [ 3, 11,  8, ..., 27, 35, 32]], dtype=int32)

From here, I would like to filter out rows where the value 4 occurs at index 10 or earlier.

e.g.
[1, 2, 3, 4, ..., 34, 35] - filter out, because the value 4 occurs at index 3, which is before index 10.

e.g.
[35, 34, 33, 32, ..., 4, 3, 2, 1] - keep, because the value 4 occurs after index 10.

How can this filtering be done with NumPy masking?
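To make the rule concrete, here is a small sketch that finds the first index of the value 4 in the two example rows. The full row contents (1 to 35 ascending and descending) are an assumption, since the question elides the middle of each row, and `first_index` is a hypothetical helper, not a NumPy function.

```python
import numpy as np

V = 4           # value to look for (from the question)
MAX_INDEX = 10  # "index 10 or before" (from the question)

# The two example rows, written out in full (assumed contents).
ascending = np.arange(1, 36)       # 1, 2, 3, 4, ..., 34, 35
descending = np.arange(35, 0, -1)  # 35, 34, 33, 32, ..., 4, 3, 2, 1


def first_index(row, value):
    """Return the first index where `value` occurs in `row`, or -1 if absent."""
    hits = np.flatnonzero(row == value)
    return int(hits[0]) if hits.size else -1


for name, row in [("ascending", ascending), ("descending", descending)]:
    idx = first_index(row, V)
    # A row is filtered out only if V is present at index <= MAX_INDEX.
    drop = 0 <= idx <= MAX_INDEX
    print(name, "first index:", idx, "->", "filter" if drop else "keep")
```

Under these assumptions the ascending row is filtered (4 sits at index 3) and the descending row is kept (4 sits at index 31).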

Answer 1

Score: 2

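You can try this:

import numpy as np

arr = np.array(
    [[1, 2, 3, 4, 34, 35, 2, 1],
     [1, 2, 3, 5, 6, 7, 8, 9],
     [1, 2, 4, 5, 6, 7, 8, 9],
     [4, 2, 3, 5, 6, 7, 8, 9],
     [35, 34, 33, 32, 4, 3, 2, 1]]
)

# V is the value to look for; T is the last column index (inclusive) to check.
V, T = 4, 3  # <-- set T = 10 for the threshold in the question

# True for every row that contains V anywhere in columns 0..T
m = np.any(arr[:, :T+1] == V, axis=1)

# Keep only the rows where V does not appear that early
out = arr[~m]

Output:

print(out)

[[ 1  2  3  5  6  7  8  9]
 [35 34 33 32  4  3  2  1]]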

Intermediates:

>>> arr[:, :T+1]
array([[ 1,  2,  3,  4],
       [ 1,  2,  3,  5],
       [ 1,  2,  4,  5],
       [ 4,  2,  3,  5],
       [35, 34, 33, 32]])

>>> arr[:, :T+1] == V
array([[False, False, False,  True],
       [False, False, False, False],
       [False, False,  True, False],
       [ True, False, False, False],
       [False, False, False, False]])

>>> np.any(arr[:, :T+1] == V, axis=1)
array([ True, False,  True,  True, False])
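The same masking idea can be wrapped in a small function and run with the question's threshold of 10. This is only a sketch: `drop_rows_with_early_value` is a hypothetical helper name, and the three test rows are made up for illustration.

```python
import numpy as np


def drop_rows_with_early_value(arr, value, max_index):
    """Drop rows of a 2D array where `value` occurs at a column index <= max_index.

    Hypothetical helper for illustration; not part of NumPy.
    """
    arr = np.asarray(arr)
    # Columns 0..max_index inclusive; slicing past the array width is safe.
    early = arr[:, : max_index + 1] == value
    # Keep the rows that have no match in the early columns.
    return arr[~early.any(axis=1)]


rows = np.array([
    np.arange(1, 36),                                # 4 at index 3  -> dropped
    np.arange(35, 0, -1),                            # 4 at index 31 -> kept
    np.r_[np.arange(10, 20), 4, np.arange(20, 44)],  # 4 at index 10 -> dropped
])

out = drop_rows_with_early_value(rows, value=4, max_index=10)
print(out.shape)
```

The `max_index + 1` in the slice is what makes index 10 itself count as "index 10 or before".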

Answer 2

Score: -1

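Here's one way to do it. The key bit is [4 not in i for i in vals[:,:10]], which builds a boolean mask by iterating over a view of the first 10 elements (indices 0 to 9) of each row. To also check index 10, as the question asks, slice with vals[:,:11] instead.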

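import numpy as np

np.random.seed(0)  # fixed seed so the example is reproducible

# 10 rows of 100 random integers drawn from 0..49
vals = np.random.choice(50, size=(10, 100))

print(vals[:,:10])  # first 10 columns before filtering
print(vals.shape)
# Keep only the rows whose first 10 elements do not contain 4
vals = vals[[4 not in i for i in vals[:,:10]]]
print(vals.shape)
print(vals[:,:10])  # first 10 columns after filtering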

This should yield:

[[44 47  0  3  3 39  9 19 21 36]
 [ 5 41 35  0 31  5 30  0 49 36]
 [24 15 41 18 40 15 11 38 47 29]
 [ 2  5 37 12 44  2 47 27 21 39]
 [42 48 30 16 26 35 49 42  9 44]
 [ 3 34 40 33 28  4 26 32 45  9]
 [41 38 43 18  7 28  1 41  2 28]
 [36 49 24 33 18 33 14 49  7 43]
 [ 6 27 35  6 19 34 38 20 43  0]
 [ 9 20 37 48 17  9 44 15 38 14]]
(10, 100)
(9, 100)
[[44 47  0  3  3 39  9 19 21 36]
 [ 5 41 35  0 31  5 30  0 49 36]
 [24 15 41 18 40 15 11 38 47 29]
 [ 2  5 37 12 44  2 47 27 21 39]
 [42 48 30 16 26 35 49 42  9 44]
 [41 38 43 18  7 28  1 41  2 28]
 [36 49 24 33 18 33 14 49  7 43]
 [ 6 27 35  6 19 34 38 20 43  0]
 [ 9 20 37 48 17  9 44 15 38 14]]
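The list comprehension runs a Python-level loop over the rows. As a rough sketch, the same mask can probably be built without the loop by comparing the sliced columns to 4 and reducing with `any`. The names `loop_mask` and `vec_mask` are introduced here only for illustration.

```python
import numpy as np

np.random.seed(0)
vals = np.random.choice(50, size=(10, 100))

# Mask from the answer above: one Python-level membership test per row.
loop_mask = np.array([4 not in row for row in vals[:, :10]])

# Vectorized equivalent: True where no element in columns 0..9 equals 4.
vec_mask = ~(vals[:, :10] == 4).any(axis=1)

# Both masks should select exactly the same rows.
print(np.array_equal(loop_mask, vec_mask))

kept = vals[vec_mask]
print(kept.shape)
```

Changing the slice to `vals[:, :11]` in either form would also cover index 10.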

huangapple
  • Posted on June 19, 2023 at 00:56:44
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76501649.html