Conditional replacement of column in numpy array

Question

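You can do this flexibly with NumPy functions. The following example filters the array along a given axis:

    import numpy as np

    def filter_array(arr, axis=2):
        # Count nonzero entries along `axis`, keeping that axis with length 1
        nonzero_counts = np.count_nonzero(arr, axis=axis, keepdims=True)
        mask = nonzero_counts <= 1
        return arr * mask

    # Filter columns (arr is the 3-D array given in the question)
    filtered_columns = filter_array(arr, axis=1)
    print(filtered_columns)

    # Filter rows
    filtered_rows = filter_array(arr, axis=2)
    print(filtered_rows)

The function filters the rows or the columns of the array depending on the axis argument, which makes the code more flexible and reusable.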


I'm currently stuck writing a NumPy script whose main goal is efficiency (so vectorization is mandatory).

Let's assume a 3-D array:

    arr = [[[0, 0, 0, 0],
            [0, 0, 3, 4],
            [0, 0, 3, 0],
            [0, 2, 3, 0]],
           [[0, 0, 3, 0],
            [0, 0, 0, 0],
            [1, 0, 3, 0],
            [0, 0, 0, 0]],
           [[0, 2, 3, 4],
            [0, 0, 0, 0],
            [0, 0, 3, 4],
            [0, 0, 3, 0]],
           [[0, 0, 3, 4],
            [0, 0, 3, 4],
            [0, 0, 0, 0],
            [0, 0, 0, 0]]]

My goal is to zero out every column that has more than one nonzero entry. Given the array above, the result should be something like:

    filtered = [[[0, 0, 0, 0],
                 [0, 0, 0, 4],
                 [0, 0, 0, 0],
                 [0, 2, 0, 0]],
                [[0, 0, 0, 0],
                 [0, 0, 0, 0],
                 [1, 0, 0, 0],
                 [0, 0, 0, 0]],
                [[0, 2, 0, 0],
                 [0, 0, 0, 0],
                 [0, 0, 0, 0],
                 [0, 0, 0, 0]],
                [[0, 0, 0, 0],
                 [0, 0, 0, 0],
                 [0, 0, 0, 0],
                 [0, 0, 0, 0]]]

I've managed to work around this with a combination of np.count_nonzero, np.repeat and reshape:

    # Per-column test (at most one nonzero), repeated so it lines up with arr
    indices = np.repeat(np.count_nonzero(a=arr, axis=1) <= 1, repeats=4, axis=0).reshape(4, 4, 4)
    result = indices * arr

This produces good results but seems to miss the point: it takes a lot of cryptic shape manipulation just to line the counts up with the array. Furthermore, I'd like the function to be flexible enough to work along other axes too (e.g. for rows), giving:

    rows_fil = [[[0, 0, 0, 0],
                 [0, 0, 0, 0],
                 [0, 0, 3, 0],
                 [0, 0, 0, 0]],
                [[0, 0, 3, 0],
                 [0, 0, 0, 0],
                 [0, 0, 0, 0],
                 [0, 0, 0, 0]],
                [[0, 0, 0, 0],
                 [0, 0, 0, 0],
                 [0, 0, 0, 0],
                 [0, 0, 3, 0]],
                [[0, 0, 0, 0],
                 [0, 0, 0, 0],
                 [0, 0, 0, 0],
                 [0, 0, 0, 0]]]

Is there a "NumPy way" to write such a flexible function?

Answer 1

Score: 1


Here's a solution that takes a generic axis parameter:

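    import numpy as np

    def mask_nnzcount(a, axis):
        # a is the input array (nested lists are converted to an ndarray)
        a = np.asarray(a)
        # True wherever the slice along `axis` holds more than one nonzero
        mask = (a != 0).sum(axis=axis, keepdims=True) > 1
        return np.where(mask, 0, a)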

The trick is keepdims=True: it keeps the reduced axis as a length-1 dimension, so the mask broadcasts against a whichever axis is chosen. That is what makes the solution generic.
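To make the keepdims point concrete, here is a minimal sketch that compares the shape of the nonzero count with and without keepdims=True. The array `demo` and its shape (2, 3, 4) are made up for illustration and are not from the question.

```python
import numpy as np

# Hypothetical array, used only to show the shapes involved.
demo = np.arange(24).reshape(2, 3, 4)

# Without keepdims the reduced axis disappears from the result.
counts_dropped = (demo != 0).sum(axis=1)
# With keepdims=True the reduced axis stays as a length-1 dimension.
counts_kept = (demo != 0).sum(axis=1, keepdims=True)

print(counts_dropped.shape)  # (2, 4)
print(counts_kept.shape)     # (2, 1, 4)

# The kept version broadcasts against the original array ...
print(np.broadcast(counts_kept, demo).shape)  # (2, 3, 4)

# ... while the dropped version does not line up for axis=1.
try:
    np.broadcast(counts_dropped, demo)
    lines_up = True
except ValueError:
    lines_up = False
print(lines_up)  # False
```

This is why the manual repeat and reshape steps from the question should not be needed: the length-1 axis lets broadcasting do that work.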

With a 3D array, your column-fill uses axis=1 and the row-fill uses axis=2.
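As a quick check, the sketch below applies the function to the array from the question with both axes. The function body is repeated so the block runs on its own; the names `cols` and `rows` are chosen here for illustration.

```python
import numpy as np

def mask_nnzcount(a, axis):
    # Zero every slice along `axis` that holds more than one nonzero entry.
    mask = (a != 0).sum(axis=axis, keepdims=True) > 1
    return np.where(mask, 0, a)

# The 3-D array from the question, as an ndarray.
arr = np.array([[[0, 0, 0, 0], [0, 0, 3, 4], [0, 0, 3, 0], [0, 2, 3, 0]],
                [[0, 0, 3, 0], [0, 0, 0, 0], [1, 0, 3, 0], [0, 0, 0, 0]],
                [[0, 2, 3, 4], [0, 0, 0, 0], [0, 0, 3, 4], [0, 0, 3, 0]],
                [[0, 0, 3, 4], [0, 0, 3, 4], [0, 0, 0, 0], [0, 0, 0, 0]]])

cols = mask_nnzcount(arr, axis=1)  # column-fill
rows = mask_nnzcount(arr, axis=2)  # row-fill

print(cols[0])
# [[0 0 0 0]
#  [0 0 0 4]
#  [0 0 0 0]
#  [0 2 0 0]]
print(rows[0])
# [[0 0 0 0]
#  [0 0 0 0]
#  [0 0 3 0]
#  [0 0 0 0]]
```

Both results match the `filtered` and `rows_fil` arrays given in the question.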

For a generic ndarray, you might want to use axis=-2 for column-fill and axis=-1 for row-fill.
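The negative-axis form can be sketched on a higher-dimensional input. The 4-D `batch` below is a made-up example (not from the question), built by stacking a small block and a scaled copy of it; the function body is repeated so the block is self-contained.

```python
import numpy as np

def mask_nnzcount(a, axis):
    # Zero every slice along `axis` that holds more than one nonzero entry.
    mask = (a != 0).sum(axis=axis, keepdims=True) > 1
    return np.where(mask, 0, a)

# Hypothetical data: one 3x3 matrix inside a (1, 3, 3) block.
block = np.array([[[0, 2, 3],
                   [0, 0, 3],
                   [1, 0, 0]]])
batch = np.stack([block, 2 * block])  # shape (2, 1, 3, 3)

cols = mask_nnzcount(batch, axis=-2)  # column-fill on the last two dimensions
rows = mask_nnzcount(batch, axis=-1)  # row-fill on the last two dimensions

print(cols[0, 0])
# [[0 2 0]
#  [0 0 0]
#  [1 0 0]]
print(rows[0, 0])
# [[0 0 0]
#  [0 0 3]
#  [1 0 0]]
```

Counting from the end means the same call works whether the matrices sit in a 2-D, 3-D or 4-D array, as long as they occupy the last two dimensions.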

Alternatively, we could use element-wise multiplication at the last step and get the output with a*(~mask). Or build the inverted mask directly, i.e. inv_mask = (a!=0).sum(axis=axis, keepdims=True)<=1, and then do a*inv_mask.
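The three variants mentioned above can be compared side by side. This is a small sketch on a made-up integer array (`a` and `axis = 1` are chosen here for illustration), not a benchmark.

```python
import numpy as np

# Hypothetical input: a single 3x3 matrix inside a 3-D array.
a = np.array([[[0, 2, 3],
               [0, 0, 3],
               [1, 0, 0]]])
axis = 1

mask = (a != 0).sum(axis=axis, keepdims=True) > 1
inv_mask = (a != 0).sum(axis=axis, keepdims=True) <= 1

out_where = np.where(mask, 0, a)  # replace masked entries with 0
out_mul = a * (~mask)             # multiply by the negated boolean mask
out_inv = a * inv_mask            # multiply by the inverted mask directly

print(np.array_equal(out_where, out_mul))  # True
print(np.array_equal(out_where, out_inv))  # True
print(out_where[0])
# [[0 2 0]
#  [0 0 0]
#  [1 0 0]]
```

For finite numeric data the three agree. One difference to keep in mind: with floating-point input that contains inf or NaN, multiplying by zero yields NaN, whereas np.where writes an exact 0.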

huangapple
  • Posted on January 3, 2020 at 17:38:05
  • When reposting, please keep the link to this article: https://go.coder-hub.com/59576146.html