numpy array.all() solution for multidimensional array where array.all(axis=1).all(axis=1) gives desired result
Question
I have a multidimensional NumPy-like array (I'm using Dask, but this applies equally to NumPy, since Dask mimics its API) that derives from an array of 1592 images.
`a`:
```python
array([[[ True,  True,  True, ...,  True,  True,  True],
        [ True,  True,  True, ...,  True,  True,  True],
        [ True,  True,  True, ...,  True,  True,  True],
        ...,
        [ True,  True,  True, ...,  True,  True,  True],
        [ True,  True,  True, ...,  True,  True,  True],
        [ True,  True,  True, ...,  True,  True,  True]],

       ...,

       [[ True,  True,  True, ...,  True,  True,  True],
        [ True,  True,  True, ...,  True,  True,  True],
        [ True,  True,  True, ...,  True,  True,  True],
        ...,
        [False, False, False, ..., False, False, False],
        [False, False, False, ..., False, False, False],
        [False, False, False, ..., False, False, False]]])
```
I want to retain the images whose masks contain `False` entries and get rid of the images that are _all_ `True`. I can do this with `array.all()`:
```python
mask = a.all(axis=1).all(axis=1)
retain = np.where(mask==False,filenames,None)
# write `retain` to a file to be read by another script
```
where `filenames` is my list of file paths.
However, I don't find `a.all(axis=1).all(axis=1)` very satisfactory. It looks to me as though I am running over the array twice when once should be enough. But am I?
Note: `a.all(axis=1)` gives:
```python
array([[ True,  True,  True,  True,  True,  True,  True,  True,  True,
         True,  True,  True,  True,  True,  True,  True,  True,  True,
         True,  True],
       ...,
       [False, False, False, False, False, False, False, False, False,
        False, False, False, False, False, False, False, False, False,
        False, False]])
```
and `a.all(axis=1).all(axis=1)` gives:
```python
array([ True, False, False, False,  True,  True, False, False, False,
       ...,
       False, False, False,  True, False, False, False])
```
Can I go from 3-dimensional data to 1-dimensional data more efficiently for this example?
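To make the question concrete, here is a minimal sketch that reproduces the setup with plain NumPy on a tiny synthetic stack. The array shape, the positions of the `False` entries, and the file names are all illustrative assumptions, not the original data.

```python
import numpy as np

# Hypothetical stand-in for the real data: 4 "images", each a 3x5 boolean mask.
a = np.ones((4, 3, 5), dtype=bool)
a[1, 2, :] = False   # image 1 contains a full row of False entries
a[3, 0, 4] = False   # image 3 contains a single False entry

# Hypothetical file names, one per image.
filenames = ["img0.png", "img1.png", "img2.png", "img3.png"]

# Chained reduction from the question: shape (4, 3, 5) -> (4, 5) -> (4,)
mask = a.all(axis=1).all(axis=1)

# Keep the path where the image has at least one False entry, else None.
retain = np.where(mask == False, filenames, None)

print(mask)
print(retain)
```

In this sketch only images 1 and 3 contain `False` entries, so only their paths survive in `retain`; the other positions hold `None`.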
# Answer 1
**Score**: 3
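Does
```python
mask = a.all(axis=(1,2))
```
do the job? It is faster than what you are currently doing. Note that although you do make two passes, the second one is over a much shorter array. You are effectively doing
```python
b = a.all(axis=1)
mask = b.all(axis=1)
```
so the second pass only touches the already-reduced 2-D result `b`, not the full 3-D array.
PS: you can simplify your code as follows. Unlike the `np.where` version, this returns only the retained paths, with no `None` placeholders.
```python
mask = a.all(axis=(1,2))
# Boolean indexing needs an array, so convert `filenames` if it is a plain list.
retain = np.asarray(filenames)[~mask]
```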
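As a quick check of the points above, the following sketch uses a small synthetic array (its shape, contents, and file names are assumptions for illustration) to confirm that the tuple-axis reduction matches the chained one and to show how much smaller the intermediate result is. It also includes flattening each image with `reshape` as an alternative spelling that should yield the same mask.

```python
import numpy as np

# Hypothetical data: 6 "images", each a 4x5 boolean mask.
a = np.ones((6, 4, 5), dtype=bool)
a[1, 0, 0] = False    # image 1 has a single False entry
a[4, 2:, :] = False   # image 4 has two rows of False entries

# Three ways of reducing every image to one boolean.
chained = a.all(axis=1).all(axis=1)                 # two reductions
combined = a.all(axis=(1, 2))                       # one call, tuple of axes
flattened = a.reshape(a.shape[0], -1).all(axis=1)   # flatten each image first

assert np.array_equal(chained, combined)
assert np.array_equal(chained, flattened)

# The second pass of the chained version sees far fewer elements.
b = a.all(axis=1)
print(a.shape, a.size)   # (6, 4, 5) 120
print(b.shape, b.size)   # (6, 5) 30

# Hypothetical file names; boolean indexing keeps only images containing False.
filenames = np.array([f"img{i}.png" for i in range(len(a))])
retain = filenames[~combined]
print(retain)
```

On a Dask array the same expressions build a lazy graph, so the mask would need to be evaluated (for example with `.compute()`) before it is used to index a NumPy array of file names.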