英文:
Finding a fast optimization algorithm to solve a non-linear equation with unique positive solution
问题
目标:
在Python中找到一个快速的算法来解决下面的函数f(x)的正解。
def f(x):
return (l / ((np.tile(r, (n, 1)).transpose() / D / (np.tile(x, (n, 1)) / D).sum(axis = 0)).sum(axis = 1))) - x
其中,l、r和x是n维向量,D是一个n x n的矩阵。
已知该函数有一个正解,其在缩放因子上是唯一的。我想要针对不同的数据和不同的向量长度n解决这个函数。最大的n约为4000。
迄今为止我尝试过的方法:
我尝试了各种scipy.optimize
函数。首先,我尝试了fsolve
,但这似乎不合适,因为它有时会给出具有负值的解向量x。根据一个相关问题的答案,我尝试了minimize
,并将解限制为正数,以避免解中出现负值。最小化只有在提供正确的解作为起始值时才能找到解函数的全局最小值。当起始值不同(略有不同)时,得到的向量不解决方程(而我需要精确解)。我认为该算法找到了局部最小值而不是全局最小值。为了找到全局最小值,我尝试了differential evolution
。问题在于对于任何有用的n来说,它的速度都非常慢。我所有的测试都是用n = 5进行的,它可以找到正确的解。
问题:
哪些算法适合解决这个方程?(如何)可以利用我对方程的了解来加速计算?(即存在正解,它在缩放上是唯一的)
最小的可工作示例:
import numpy as np
from scipy.optimize import minimize, fsolve, differential_evolution
np.random.seed(1)
# 向量维度
n = 250 # 差分进化较慢,最好使用n = 5来测试
# 数据r和D
r = np.random.rand(n)
D = 1 + np.random.rand(n, n)
# 真解x
x_true = np.random.rand(n)
# 将x归一化为具有几何平均值1
x_true = x_true / np.prod(x_true) ** (1/n)
# 求解由真实x隐含的l
l = ((np.tile(r, (n, 1)).transpose() / D) / (np.tile(x_true, (n, 1)) / D).sum(axis = 0)).sum(axis = 1) * x_true
### Fsolve
initial_guess_deviation_factor = 2
x_0 = x_true * np.random.uniform(low = 0.9, high = 1.1, size = n) ** initial_guess_deviation_factor
def f(x):
return (l / ((np.tile(r, (n, 1)).transpose() / D / (np.tile(x, (n, 1)) / D).sum(axis = 0)).sum(axis = 1))) - x
# 解是负的
x = fsolve(f, x_0)
### Minimize
def opt(x):
return (((l / ((np.tile(r, (n, 1)).transpose() / D / (np.tile(x, (n, 1)) / D).sum(axis = 0)).sum(axis = 1))) - x) ** 2).sum()
def pos_constraint(x):
return x
result = minimize(opt, x0=x_0, constraints={'type': 'ineq', 'fun':pos_constraint}, tol = 1e-18)
# 解与真解不同
print(abs(result.x - x_true).mean())
print(result.fun)
### Differential evolution
def opt(x):
return (((l / ((np.tile(r, (n, 1)).transpose() / D / (np.tile(x, (n, 1)) / D).sum(axis = 0)).sum(axis = 1))) - x) ** 2).sum()
# 由于解在重标定上是唯一的,我使用0到1之间的边界,在找到解后进行重标定
bounds = [(0, 1)] * n
result = differential_evolution(opt, bounds, seed=1)
result.x, result.fun
# 归一化解
x_de = result.x / np.prod(result.x) ** (1/n)
print(abs(x_de - x_true).mean())
print(result.fun)
英文:
Goal:
Find a fast algorithm in Python that solves the function f(x) below for its positive solution.
def f(x):
return (l / ((np.tile(r, (n, 1)).transpose() / D / (np.tile(x, (n, 1)) / D).sum(axis = 0)).sum(axis = 1))) - x
l, r, and x are vectors of dimension n and D is a matrix of dimension n x n.
The function is known to have a positive solution that is unique up to a scaling factor. I would like to solve the function for different data and different vector length n. The largest n is approximately 4000.
What I have tried so far:
I tried various scipy.optimize
functions. First, I tried fsolve
which does not seem appropriate because it sometimes gives a solution vector x with negative entries. Following the answers to a related question, I tried minimize
and constraining the solution to positive numbers to avoid negative entries in the solution. Minimize finds the global minimum that solves the function only when provided with the correct solution as a starting value. When the starting value differs (slighlty), the resulting vector does not solve the equation (and I need the exact solution). I assume that the algorithm finds local minima but not the global one. To find the global minimum I tried differential evolution
. Here, the problem is that it is very slow for any useful n. I did all testing with n = 5 for which it finds the correct solution.
Question:
Which algorithms are good candidates to solve this equation? (How) Can I use what I know about the equation to speed up the calculation? (i.e. a positive solution exists, it is unique up to scaling)
Minimal working example:
import numpy as np
from scipy.optimize import minimize, fsolve, differential_evolution
np.random.seed(1)
# Vector dimension
n = 250 # differential evolution is slow, better use n = 5 to test
# Data r and D
r = np.random.rand(n)
D = 1 + np.random.rand(n, n)
# True solution x
x_true = np.random.rand(n)
# Normalize x to have geometric mean 1
x_true = x_true / np.prod(x_true) ** (1/n)
# Solve for l implied by true x
l = ((np.tile(r, (n, 1)).transpose() / D) / (np.tile(x_true, (n, 1)) / D).sum(axis = 0)).sum(axis = 1) * x_true
### Fsolve
initial_guess_deviation_factor = 2
x_0 = x_true * np.random.uniform(low = 0.9, high = 1.1, size = n) ** initial_guess_deviation_factor
def f(x):
return (l / ((np.tile(r, (n, 1)).transpose() / D / (np.tile(x, (n, 1)) / D).sum(axis = 0)).sum(axis = 1))) - x
# The solution is negative
x = fsolve(f, x_0)
### Minimize
def opt(x):
return (((l / ((np.tile(r, (n, 1)).transpose() / D / (np.tile(x, (n, 1)) / D).sum(axis = 0)).sum(axis = 1))) - x) ** 2).sum()
def pos_constraint(x):
return x
result = minimize(opt, x0=x_0, constraints={'type': 'ineq', 'fun':pos_constraint}, tol = 1e-18)
# The solution is different from the true solution
print(abs(result.x - x_true).mean())
print(result.fun)
### Differential evolution
def opt(x):
return (((l / ((np.tile(r, (n, 1)).transpose() / D / (np.tile(x, (n, 1)) / D).sum(axis = 0)).sum(axis = 1))) - x) ** 2).sum()
# Since the solution is unique up to renormalization, I use bounds between 0 and 1 and renormalize after finding the solution
bounds = [(0, 1)] * n
result = differential_evolution(opt, bounds, seed=1)
result.x, result.fun
# Normalize solution
x_de = result.x / np.prod(result.x) ** (1/n)
print(abs(x_de - x_true).mean())
print(result.fun)
答案1
得分: 1
以下是代码部分的翻译:
import numpy as np
from numpy.random._generator import default_rng
from scipy.optimize import fsolve
rand = default_rng(seed=0)
n = 250 # 差分进化较慢,最好使用 n = 5 进行测试
r = rand.random(n)
D = rand.uniform(low=1, high=2, size=(n, n))
rDD = r[:, np.newaxis] / (1/D).sum(axis=0) / D
# 真实解 x,其几何平均值为 1
x_true = rand.random(n)
x_true = x_true / x_true.prod() ** (1 / n)
# 求解真实 x 所暗示的 l
l = rDD @ (1 / x_true) * x_true
def f(x: np.ndarray) -> np.ndarray:
return l / (rDD @ (1 / x)) - x
def regression_test() -> None:
assert l.shape == (250,)
assert np.isclose(l[0], 1.8437187927094683)
assert np.isclose(l.min(), 0.008011379562766192)
assert np.isclose(l.max(), 4.870546152196413)
assert np.isclose(l.sum(), 328.4768373368165)
f_0 = f(x_0)
assert f_0.shape == (250,)
assert np.isclose(f_0[0], -0.11599601776615498)
assert np.isclose(f_0.min(), -0.5897953289580671)
assert np.isclose(f_0.max(), 0.3885530509026145)
assert np.isclose(f_0.sum(), -9.253079363636402)
initial_guess_deviation_factor = 2
x_0 = x_true * rand.uniform(low=0.9, high=1.1, size=n) ** initial_guess_deviation_factor
regression_test()
你的第一次尝试非常接近于既快速又准确的结果;你只需要除以一个项,其余部分将会是非负的:
x = fsolve(f, x_0)
x /= x[0]
print(x)
print(f(x))
打印出以下结果的 x 和 f(x):
[1. 0.21849311 0.62362266 1.37331647 1.0126681 1.20579018 ...
以及以下结果的 f(x):
[ 1.18594023e-12 -2.77555756e-15 -1.30429001e-12 -7.40074668e-13 ...
英文:
First, do some linear-algebraic analysis to derive the following simplified, equivalent form of your problem (and include regression tests):
import numpy as np
from numpy.random._generator import default_rng
from scipy.optimize import fsolve
rand = default_rng(seed=0)
n = 250 # differential evolution is slow, better use n = 5 to test
r = rand.random(n)
D = rand.uniform(low=1, high=2, size=(n, n))
rDD = r[:, np.newaxis] / (1/D).sum(axis=0) / D
# True solution x with geometric mean 1
x_true = rand.random(n)
x_true = x_true / x_true.prod() ** (1 / n)
# Solve for l implied by true x
l = rDD @ (1 / x_true) * x_true
def f(x: np.ndarray) -> np.ndarray:
return l / (rDD @ (1 / x)) - x
def regression_test() -> None:
assert l.shape == (250,)
assert np.isclose(l[0], 1.8437187927094683)
assert np.isclose(l.min(), 0.008011379562766192)
assert np.isclose(l.max(), 4.870546152196413)
assert np.isclose(l.sum(), 328.4768373368165)
f_0 = f(x_0)
assert f_0.shape == (250,)
assert np.isclose(f_0[0], -0.11599601776615498)
assert np.isclose(f_0.min(), -0.5897953289580671)
assert np.isclose(f_0.max(), 0.3885530509026145)
assert np.isclose(f_0.sum(), -9.253079363636402)
initial_guess_deviation_factor = 2
x_0 = x_true * rand.uniform(low=0.9, high=1.1, size=n) ** initial_guess_deviation_factor
regression_test()
Maybe I'm missing something, but your first attempt is very close at being both fast and accurate; you just need to divide out one term and the rest will be non-negative:
x = fsolve(f, x_0)
x /= x[0]
print(x)
print(f(x))
prints the following for x:
[1. 0.21849311 0.62362266 1.37331647 1.0126681 1.20579018 ...
and the following for f(x):
[ 1.18594023e-12 -2.77555756e-15 -1.30429001e-12 -7.40074668e-13 ...
通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库,让每个人都能够通过互相帮助和分享经验来进步。
评论