numpy array.all() solution for multidimensional array where array.all(axis=1).all(axis=1) gives desired result


Question

I have a multidimensional NumPy-like array (I'm using Dask, but this also applies to NumPy, since Dask mimics its API) that derives from an array of 1592 images:

`a`:
```python
array([[[ True,  True,  True, ...,  True,  True,  True],
        [ True,  True,  True, ...,  True,  True,  True],
        [ True,  True,  True, ...,  True,  True,  True],
        ...,
        [ True,  True,  True, ...,  True,  True,  True],
        [ True,  True,  True, ...,  True,  True,  True],
        [ True,  True,  True, ...,  True,  True,  True]],

       ...,

       [[ True,  True,  True, ...,  True,  True,  True],
        [ True,  True,  True, ...,  True,  True,  True],
        [ True,  True,  True, ...,  True,  True,  True],
        ...,
        [False, False, False, ..., False, False, False],
        [False, False, False, ..., False, False, False],
        [False, False, False, ..., False, False, False]]])
```

I want to keep the images whose masks contain `False` entries and discard the images that are _all_ `True`. I can do this with `array.all()`:

```python
mask = a.all(axis=1).all(axis=1)
retain = np.where(mask == False, filenames, None)
# write `retain` to a file to be read by another script
```

where `filenames` is my list of file paths.

However, I don't find `a.all(axis=1).all(axis=1)` very satisfactory. It looks to me as though I am running over the array twice when once should be enough. But am I?

Note: `a.all(axis=1)` gives:

```python
array([[ True,  True,  True,  True,  True,  True,  True,  True,  True,
         True,  True,  True,  True,  True,  True,  True,  True,  True,
         True,  True],

       ...,

       [False, False, False, False, False, False, False, False, False,
        False, False, False, False, False, False, False, False, False,
        False, False]])
```

and `a.all(axis=1).all(axis=1)` gives:

```python
array([ True, False, False, False,  True,  True, False, False, False,

        ...,

       False, False, False,  True, False, False, False])
```

Can I go from 3-dimensional data to 1-dimensional data more efficiently for this example?




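# Answer 1
**Score**: 3

Does

```python
mask = a.all(axis=(1, 2))
```

do the job? It is faster than what you are currently doing. Note that while you do go over an array twice, the second pass is over a smaller array. You are effectively doing

```python
b = a.all(axis=1)
mask = b.all(axis=1)
```

so the second time you go over `b`, which has one dimension fewer than `a`.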
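As a quick sanity check, the sketch below builds a small stand-in for the mask stack and confirms that reducing over both axes at once matches the chained call. The shape `(6, 4, 5)`, the random seed, and the threshold are made up for illustration and are not the asker's data. Flattening each image with `reshape` before reducing is a third spelling that should give the same result.

```python
import numpy as np

# Hypothetical stand-in for the real data: 6 "images", each a 4x5 boolean mask.
rng = np.random.default_rng(0)
a = rng.random((6, 4, 5)) > 0.05
a[0] = True            # force one image to be entirely True
a[-1, 2:, :] = False   # force one image to contain rows of False

chained = a.all(axis=1).all(axis=1)             # 3-D -> 2-D -> 1-D
single = a.all(axis=(1, 2))                     # 3-D -> 1-D in one call
flattened = a.reshape(len(a), -1).all(axis=1)   # flatten each image, then reduce

print(a.all(axis=1).shape)   # the intermediate array has shape (6, 5)
print(single.shape)          # the final mask has shape (6,)
print(np.array_equal(chained, single), np.array_equal(single, flattened))
```

The intermediate array in the chained version has one entry per image column instead of one per pixel, which is why the second pass is cheap compared with the first.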


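PS: you can simplify your code as follows (this assumes `filenames` is a NumPy array rather than a plain Python list):

```python
mask = a.all(axis=(1, 2))
retain = filenames[~mask]
```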
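To make the indexing step concrete, here is a minimal sketch with made-up file names and a made-up mask. A plain Python list raises a `TypeError` when indexed with a boolean array, so the list is converted with `np.asarray` first. The sketch also contrasts the result with the `np.where` version from the question, which keeps a `None` placeholder for every discarded image. With Dask, `mask` would presumably have to be materialised (for example with `mask.compute()`) before it is used to index a NumPy array of names.

```python
import numpy as np

# Hypothetical inputs: four file paths and the per-image "all True" mask.
filenames = ["img_0.png", "img_1.png", "img_2.png", "img_3.png"]
mask = np.array([True, False, False, True])

# Convert the list once so that it supports boolean-array indexing.
names = np.asarray(filenames)

retain = names[~mask]                       # only the images that contain a False
with_gaps = np.where(~mask, names, None)    # question's variant: None where discarded

print(retain.tolist())   # ['img_1.png', 'img_2.png']
print(with_gaps)         # same names, with None in positions 0 and 3
```

Which variant fits better depends on the downstream script: boolean indexing yields a shorter array, while `np.where` preserves the original positions.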

  • Published on June 12, 2023 at 22:54:52
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76457873.html