Filtering numpy arrays based on the index of certain value

Question

I have a numpy array like:

    array([[ 1, 17, 33, ..., 28,  9, 22],
           [ 3, 11,  1, ..., 25, 45, 14],
           [ 3, 11,  1, ..., 21, 23,  5],
           ...,
           [20,  6, 27, ..., 43, 15, 14],
           [27,  6, 39, ..., 37, 17,  2],
           [ 3, 11,  8, ..., 27, 35, 32]], dtype=int32)

From here, I would like to filter out rows where the value 4 occurs at index 10 or before.

For example:
[1, 2, 3, 4, ..., 34, 35] - filter out, because the value 4 occurs at index 3, which is before index 10.

For example:
[35, 34, 33, 32, ..., 4, 3, 2, 1] - keep, because the value 4 occurs after index 10.

How can this filtering be done with numpy masking?

Answer 1

Score: 2

You can try this:

    import numpy as np

    arr = np.array(
        [[1, 2, 3, 4, 34, 35, 2, 1],
         [1, 2, 3, 5, 6, 7, 8, 9],
         [1, 2, 4, 5, 6, 7, 8, 9],
         [4, 2, 3, 5, 6, 7, 8, 9],
         [35, 34, 33, 32, 4, 3, 2, 1]]
    )
    V, T = 4, 3  # <-- change the threshold to 10
    # True for rows where V appears anywhere in columns 0..T (inclusive)
    m = np.any(arr[:, :T+1] == V, axis=1)
    # Keep only the rows that are not flagged
    out = arr[~m]

Output:

    >>> out
    array([[ 1,  2,  3,  5,  6,  7,  8,  9],
           [35, 34, 33, 32,  4,  3,  2,  1]])

Intermediates:

    >>> arr[:, :T+1]
    array([[ 1,  2,  3,  4],
           [ 1,  2,  3,  5],
           [ 1,  2,  4,  5],
           [ 4,  2,  3,  5],
           [35, 34, 33, 32]])
    >>> arr[:, :T+1] == V
    array([[False, False, False,  True],
           [False, False, False, False],
           [False, False,  True, False],
           [ True, False, False, False],
           [False, False, False, False]])
    >>> np.any(arr[:, :T+1] == V, axis=1)
    array([ True, False,  True,  True, False])
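
For the threshold the question actually asks about (index 10), the same masking idea could be wrapped in a small helper. This is only a sketch: the function name drop_rows_with_early_value and the two sample rows are made up for illustration, modeled on the examples in the question.

```python
import numpy as np

def drop_rows_with_early_value(arr, value, max_index):
    """Return the rows of arr where value does not occur at column index <= max_index.

    Hypothetical helper, not part of numpy.
    """
    # The slice end is exclusive, so max_index + 1 includes column max_index itself.
    hit = np.any(arr[:, :max_index + 1] == value, axis=1)
    return arr[~hit]

rows = np.array([
    np.arange(1, 36),      # 1, 2, 3, 4, ..., 35: the value 4 is at index 3
    np.arange(35, 0, -1),  # 35, 34, ..., 4, 3, 2, 1: the value 4 is at index 31
])
kept = drop_rows_with_early_value(rows, value=4, max_index=10)
print(kept.shape)
```

Only the descending row should survive, since its 4 appears well after index 10.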

Answer 2

Score: -1

Here's one way to do it. The key part is [4 not in i for i in vals[:,:10]], which builds a boolean mask by iterating over a view of the first 10 elements of each row. Note that :10 covers indices 0 to 9; use :11 if index 10 itself should be included.

    import numpy as np
    np.random.seed(0)
    vals = np.random.choice(50, size=(10, 100))
    print(vals[:,:10])
    print(vals.shape)
    vals = vals[[4 not in i for i in vals[:,:10]]]
    print(vals.shape)
    print(vals[:,:10])

which should yield:

    [[44 47  0  3  3 39  9 19 21 36]
     [ 5 41 35  0 31  5 30  0 49 36]
     [24 15 41 18 40 15 11 38 47 29]
     [ 2  5 37 12 44  2 47 27 21 39]
     [42 48 30 16 26 35 49 42  9 44]
     [ 3 34 40 33 28  4 26 32 45  9]
     [41 38 43 18  7 28  1 41  2 28]
     [36 49 24 33 18 33 14 49  7 43]
     [ 6 27 35  6 19 34 38 20 43  0]
     [ 9 20 37 48 17  9 44 15 38 14]]
    (10, 100)
    (9, 100)
    [[44 47  0  3  3 39  9 19 21 36]
     [ 5 41 35  0 31  5 30  0 49 36]
     [24 15 41 18 40 15 11 38 47 29]
     [ 2  5 37 12 44  2 47 27 21 39]
     [42 48 30 16 26 35 49 42  9 44]
     [41 38 43 18  7 28  1 41  2 28]
     [36 49 24 33 18 33 14 49  7 43]
     [ 6 27 35  6 19 34 38 20 43  0]
     [ 9 20 37 48 17  9 44 15 38 14]]
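
The list comprehension loops over rows in Python. As a rough sketch, the same mask can probably be built with array operations alone, which is closer to the numpy masking the question asks for. The names loop_mask and vec_mask are illustrative only; the data setup is copied from the answer above.

```python
import numpy as np

np.random.seed(0)
vals = np.random.choice(50, size=(10, 100))

# Mask from the list comprehension in the answer (Python-level loop over rows).
loop_mask = np.array([4 not in row for row in vals[:, :10]])

# Same mask computed with array operations only:
# compare the first 10 columns to 4, reduce per row, then negate.
vec_mask = ~(vals[:, :10] == 4).any(axis=1)

same = np.array_equal(loop_mask, vec_mask)
filtered = vals[vec_mask]
print(same, filtered.shape)
```

Both masks select the same rows; the vectorized form avoids the per-row Python loop, which tends to matter once the array has many rows.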

huangapple
  • Posted on June 19, 2023, 00:56:44
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76501649.html