June 26, 2023 22:59:46

Find indices of array of values in a master array, when values are arrays

Question

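I have an (N, K)-shaped numpy.ndarray "master" in which I want to find the occurrences of several K-dimensional arrays. Say I have M of them, stored in an (M, K)-shaped array "search".
For example, with N=9, K=2, M=3:
```python
import numpy as np

master = np.array([[0, 0], [1, 0], [2, 0], [0, 1], [1, 1], [2, 1], [0, 2], [1, 2], [2, 2]])
search = np.array([[1, 2], [2, 0], [4, -2]])
```

What I would like is something like `array([7, 2])` or `array([2, 7])`, because the rows at indices 2 and 7 of the master array appear in the search array.

I first tried `np.isin`:
```python
np.argwhere(np.all(np.isin(master, search), axis=1)).ravel()
```
But this returns the indices of rows whose values all occur somewhere in `search`, not necessarily together in the same row of `search`.

This other approach seems to work, but it uses nested Python list comprehensions, so I think it is very inefficient:

```python
np.argwhere(np.any(np.array([[np.array_equal(master[i], search[j])
                               for i in range(N)]
                              for j in range(M)]),
                   axis=0)).ravel()
```

Is there a way to do this using only standard NumPy functions? My inputs are quite large, so list comprehensions are too slow.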


Answer 1

Score: 1

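Let's break this up into multiple steps.

Check equality between every row of `search` and every row of `master` using `==`. For the comparison to broadcast pairwise, add a new axis with `search[:, None]`: this gives `search` the shape (M, 1, K), so the result has shape (M, N, K).
Use `all` on axis 2 to perform a logical AND, which checks that all K components of a `search` row equal the corresponding components of a `master` row.
Use `any` on axis 0 to perform a logical OR, which collapses the result to one True or False value per row of `master`.
Finally, use `np.where` to find the indices.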

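With named steps:

```python
import numpy as np
master = np.array([[0, 0], [1, 0], [2, 0], [0, 1], [1, 1], [2, 1], [0, 2], [1, 2], [2, 2]])
search = np.array([[1, 2], [2, 0], [4, -2]])
step1 = master == search[:, None]  # pairwise comparison, shape (M, N, K)
step2 = step1.all(2)               # True where a whole row matches, shape (M, N)
step3 = step2.any(0)               # True where a master row matches any search row, shape (N,)
step4 = np.where(step3)[0]  # [2, 7]
```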
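To make the broadcasting concrete, here is a small sketch that records the intermediate shapes for the example data and also recovers which `search` row matched which `master` row. The names `expanded`, `shapes` and `pairs` are illustrative and not part of the original answer.

```python
import numpy as np

master = np.array([[0, 0], [1, 0], [2, 0], [0, 1], [1, 1], [2, 1], [0, 2], [1, 2], [2, 2]])
search = np.array([[1, 2], [2, 0], [4, -2]])

expanded = search[:, None]   # new axis in the middle: (M, 1, K)
step1 = master == expanded   # broadcasts to (M, N, K)
step2 = step1.all(2)         # (M, N): row j of search equals row i of master
step3 = step2.any(0)         # (N,): master row i matches at least one search row

shapes = [expanded.shape, step1.shape, step2.shape, step3.shape]
print(shapes)                # [(3, 1, 2), (3, 9, 2), (3, 9), (9,)]

# Each row of `pairs` is (index in search, index in master) for one match.
pairs = np.argwhere(step2)
print(pairs.tolist())        # [[0, 7], [1, 2]]
```

The `pairs` array is useful when you need to know not only that a `master` row was found, but also which `search` row it corresponds to.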

All together:

```python
import numpy as np
master = np.array([[0, 0], [1, 0], [2, 0], [0, 1], [1, 1], [2, 1], [0, 2], [1, 2], [2, 2]])
search = np.array([[1, 2], [2, 0], [4, -2]])
res = np.where((master == search[:, None]).all(2).any(0))[0]   # [2, 7]
```
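One caveat for large inputs: the comparison builds a temporary boolean array of shape (M, N, K), which can exceed available memory. A possible workaround, sketched below under that assumption, is to apply the same comparison to `search` in chunks and OR the partial results together. The function name `find_rows` and the `chunk` parameter are hypothetical and do not come from the original answer.

```python
import numpy as np

def find_rows(master, search, chunk=1024):
    """Return indices of rows of `master` that equal at least one row of `search`.

    Processes `search` in blocks of `chunk` rows so the temporary boolean
    array is at most (chunk, N, K) instead of (M, N, K).
    """
    hit = np.zeros(len(master), dtype=bool)
    for start in range(0, len(search), chunk):
        block = search[start:start + chunk]              # (c, K)
        hit |= (master == block[:, None]).all(2).any(0)  # (N,)
    return np.flatnonzero(hit)

master = np.array([[0, 0], [1, 0], [2, 0], [0, 1], [1, 1], [2, 1], [0, 2], [1, 2], [2, 2]])
search = np.array([[1, 2], [2, 0], [4, -2]])
print(find_rows(master, search))  # [2 7]
```

A smaller `chunk` lowers peak memory at the cost of more Python-level loop iterations, so the best value depends on N, K and the memory available.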
