numpy polyfit alternative for speed

Question

I am calling numpy's polyfit numerous times to compute the slope between two datasets. However, it does not perform these calculations fast enough for my needs.

Two things to note about the calculations:

  1. The x values in the call numpy.polyfit(x,y,n) will always be the same, and

  2. The value of n is 1, so each fit is a linear regression.

I know there are many alternatives, including numpy.polynomial.polynomial.polyfit(x,y,n), but they seem to be just as slow. I have had little luck getting np.linalg to work properly. Therefore, I am wondering what alternative might speed up these calculations?
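
For context, here is a minimal sketch of the pattern described above (the helper name slopes_with_polyfit and the ys layout are illustrative, not from the original post):

import numpy as np

def slopes_with_polyfit(x, ys):
    # ys is an iterable of y arrays, each the same length as the fixed x.
    # np.polyfit(x, y, 1) returns [slope, intercept]; keep only the slope.
    return [np.polyfit(x, y, 1)[0] for y in ys]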

Answer 1

Score: 3

As others have commented, this can be done using linear least squares.
Using numpy.linalg.lstsq, this could look like:

import numpy as np

def lstsq(x, y):
    # Design matrix with columns [x, 1], so the solution is [slope, intercept].
    a = np.stack((x, np.ones_like(x)), axis=1)
    # rcond=None uses machine precision as the cutoff and avoids the FutureWarning.
    return np.linalg.lstsq(a, y, rcond=None)[0]

This offers a slight speed improvement over polyfit. To obtain a significant speed increase (at the expense of numerical stability; for a summary of methods see Numerical methods for linear least squares) you can instead solve the normal equations, (a.T@a) @ coef = a.T@y:

def normal(x, y):
    a = np.stack((x, np.ones_like(x)), axis=1)
    aT = a.T
    # Solve the 2x2 normal equations (aT@a) @ coef = aT@y directly.
    return np.linalg.solve(aT@a, aT@y)

As you say that x is constant, you can precompute a.T and a.T@a, providing a further speed increase:

def normal2(aT, aTa, y):
    # aT and aTa are precomputed once outside the loop, since x never changes.
    return np.linalg.solve(aTa, aT@y)

Make up some test data and time:

rng = np.random.default_rng()
N = 1000

x = rng.random(N)
y = rng.random(N)

a = np.stack((x, np.ones_like(x)), axis=1)
aT = a.T
aTa = aT@a

assert np.allclose(lstsq(x, y), np.polyfit(x, y, 1))
assert np.allclose(normal(x, y), np.polyfit(x, y, 1))
assert np.allclose(normal2(aT, aTa, y), np.polyfit(x, y, 1))
%timeit np.polyfit(x, y, 1) 
%timeit lstsq(x, y)
%timeit normal(x, y)
%timeit normal2(aT, aTa, y)

Output:

256 µs ± 270 ns per loop (mean ± std. dev. of 7 runs, 1,000 loops each)
220 µs ± 1.87 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)
20.2 µs ± 32.3 ns per loop (mean ± std. dev. of 7 runs, 10,000 loops each)
6.54 µs ± 13.5 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)
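
A further note beyond the original answer: because x (and hence aT and aTa) never changes, you can also solve many regressions in a single call, since np.linalg.solve accepts a matrix of right-hand sides. A minimal sketch, assuming all the y datasets have the same length as x and are stacked as columns (the name normal2_batch is illustrative):

def normal2_batch(aT, aTa, Y):
    # Y has shape (len(x), m), one column per dataset,
    # e.g. Y = np.column_stack(list_of_ys).
    # Returns an array of shape (2, m): row 0 holds the slopes, row 1 the intercepts.
    return np.linalg.solve(aTa, aT@Y)

This replaces m separate solve calls with one, amortizing the per-call overhead across all datasets.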
