2020年1月7日 01:24:14go评论110阅读模式

英文:

Optimise this function -- numpy broadcasting issue

问题

以下是已经翻译好的部分：

import numpy as np
def contains(u, min, max, dim, strict = True):
    u = np.array(u).reshape(-1 ,dim)
    if strict:
        return np.all((u > min) & (u < max), axis=1)
    else:
        return np.all((u >= min) & (u <= max), axis=1)
# Usage examples : 
d = 4
min = np.random.uniform(size=d)*1/2
max = np.random.uniform(size=d)*1/2+1/2
u1 = np.random.uniform(size=d)
u2 = np.random.uniform(size=(100,d))
u3 = u2[np.repeat(False,100)]
contains(u1,min,max,d) # should return a boolean array of shape (1,)
contains(u2,min,max,d) # shape (100,)
contains(u3,min,max,d) # shape (0,)

英文:

I have a function contains that check for a given 2D array u if the box [min,max] contains each row of u. I need it to reshape u if needed, but the number of values of u will always be a multiple of d (can be zero);

I'm using the following snippet of code. This function run thousands of time. Can faster code be produced ? If you think so, any tips on how to ?

import numpy as np
def contains(u, min, max, dim, strict = True):
    u = np.array(u).reshape(-1 ,dim)
    if strict:
        return np.all((u &gt; min) &amp; (u &lt; max), axis=1)
    else:
        return np.all((u &gt;= min) &amp; (u &lt;= max), axis=1)
# Usage examples : 
d = 4
min = np.random.uniform(size=d)*1/2
max = np.random.uniform(size=d)*1/2+1/2
u1 = np.random.uniform(size=d)
u2 = np.random.uniform(size=(100,d))
u3 = u2[np.repeat(False,100)]
   
contains(u1,min,max,d) # should return a boolean array of shape (1,)
contains(u2,min,max,d) # shape (100,)
contains(u3,min,max,d) # shape (0,)

答案1

得分: 3

以下是您提供的代码的中文翻译：

瓶颈最终出现在np.all()调用内部。
可以使用Numba来加速，如下所示：

import numpy as np
import numba as nb
@nb.jit(nopython=True)
def contains_nb(arr, a_arr, b_arr):
    m = a_arr.size
    arr = arr.reshape(-1, m)
    n = arr.shape[0]
    result = np.ones(n, dtype=np.bool8)
    for i in range(n):       
        for j in range(m):
            if not a_arr[j] < arr[i, j] < b_arr[j]:
                result[i] = False
                break
    return result

这与NumPy解决方案相比：

import numpy as np
def contains_np(arr, a_arr, b_arr):
    m = a_arr.size
    arr = arr.reshape(-1, m)
    return np.all((arr >= a_arr) & (arr <= b_arr), axis=1)

我对您的方法进行了一些简化（我省略了dim和strict参数，因为dim是多余的，可以从a_arr或b_arr的大小中推断出，而strict参数对解决方案没有太多影响，但可以轻松重新引入）。
我还假设输入已经始终是NumPy数组。

另外，NumPy解决方案可以修改为使用numexpr，这将导致第三种方法。这将带来一些调用开销，但可以加速计算，例如：

import numpy as np
import numexpr as ne
def contains_ne(arr, a_arr, b_arr):
    m = a_arr.size
    arr = arr.reshape(-1, m)
    result = ne.evaluate('(arr >= a_arr) & (arr <= b_arr)')
    return np.all(result, axis=1)

可以获得以下性能基准数据：

这表明Numba解决方案始终是最快的。
相反，使用numexpr似乎对探索的参数范围没有好处。

（完整的基准数据可在此处获得）

英文:

(EDITED: to fix the timing measurement issue raised by @max9111 in the comments, and to include a numexpr-modified solution).

The bottleneck would eventually be within the np.all() call.
This could be sped up with Numba like the following:

import numpy as np
import numba as nb
@nb.jit(nopython=True)
def contains_nb(arr, a_arr, b_arr):
    m = a_arr.size
    arr = arr.reshape(-1, m)
    n = arr.shape[0]
    result = np.ones(n, dtype=np.bool8)
    for i in range(n):       
        for j in range(m):
            if not a_arr[j] &lt; arr[i, j] &lt; b_arr[j]:
                result[i] = False
                break
    return result

This is compared to the NumPy solution:

import numpy as np
def contains_np(arr, a_arr, b_arr):
    m = a_arr.size
    arr = arr.reshape(-1, m)
    return np.all((arr &gt;= a_arr) &amp; (arr &lt;= b_arr), axis=1)

which I simplified a bit over your approach (I have omitted dim and strict parameters, since dim is redundant, as it can be inferred from a_arr or b_arr sizes, while the strict parameter does not add much to the solution, but it can be easily reintroduced).
I also assume that the input is already always a NumPy array.

Also, the NumPy solution could be modified to use numexpr, which leads to a third approach. This will have some calling overhead, but could speed up the computations, e.g.:

import numpy as np
import numexpr as ne
def contains_ne(arr, a_arr, b_arr):
    m = a_arr.size
    arr = arr.reshape(-1, m)
    result = ne.evaluate(&#39;(arr &gt;= a_arr) &amp; (arr &lt;= b_arr)&#39;)
    return np.all(result, axis=1)

The following benchmarks can be obtained:

This show that the Numba solution is consistently the fastest.
On the contrary, the use of numexpr seems to be non-beneficial for the range of parameters explored.

(full benchmark available here)

答案2

得分: 2

Here is the translated content:

尝试这个以提高速度，在这里阅读更多

from numba import jit
@jit(nopython=True)
def contains(u, min, max, dim, strict=True):
    u = np.array(u).reshape(-1, dim)
    if strict:
        return np.all((u > min) & (u < max), axis=1)
    else:
        return np.all((u >= min) & (u <= max), axis=1)

英文:

Try this to speed, read more here

from numba import jit
@jit(nopython=True)
def contains(u, min, max, dim, strict = True):
    u = np.array(u).reshape(-1 ,dim)
    if strict:
        return np.all((u &gt; min) &amp; (u &lt; max), axis=1)
    else:
        return np.all((u &gt;= min) &amp; (u &lt;= max), axis=1)

通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库，让每个人都能够通过互相帮助和分享经验来进步。

Optimise this function — numpy broadcasting issue.

问题

答案1

答案2

'pytest –version' works, but 'pytest tests' command not recognized with Bash in Github Actions

如何在Jinja模板中的for循环中累加数组中的浮点数值？

如何找出值从它们的周期开始时发生了怎样的变化？

如何在包含列表的Pandas数据框中执行成对相关性？

如何在Playwright视觉比较中屏蔽多个定位器？

在C++中，可以使用可变模板参数来检索类型的内部类型。

selenium.common.exceptions.StaleElementReferenceException: Message: stale element reference: stale element not found

Creating and opening a URL to log in to Website via Basic Auth with Robot Framework/Selenium (Python)

AG Grid 在上下文菜单中以大文本形式打开

What's the correct way to type hint an empty list as a literal in python?

如何在Highcharts Gantt中更改本地化的星期名称

如何在同一个流中使用多个过滤器和映射函数？

如何使用Map/Set来将代码优化到O(n)？

.NET MAUI Android在GitHub Actions上构建失败，错误代码为1。