Scipy optimize: limiting number of non-zero variables
Question
My optimization problem has ~200 variables, but I would like to find a solution that only uses 5 of them. That is, 195 of them should be zero and the other 5 can be non-zero.
I have tried the following constraint, but the optimization algorithm seems to ignore it completely: it continues to use all 200 variables whether I include the constraint or not. Is there something I am missing, or can SLSQP just not handle this?
import pandas as pd
import numpy as np
from scipy.optimize import minimize, Bounds

tmp = pd.DataFrame()
tmp.insert(loc=0, column='pred', value=np.random.random(200))
tmp['x0'] = 0

def obj(x, df=tmp):
    return np.sum(-x * df['pred'].values)

def c2(x):
    return -(len(x[x != 0]) - 5)

sol = minimize(fun=obj, x0=tmp['x0'], method='SLSQP', bounds=Bounds(-10, 10), jac=False,
               constraints=({'type': 'ineq', 'fun': c2}),
               options={'maxiter': 1000})
When I run this it just sets everything to 10 and ignores c2.
Answer 1
Score: 3
slsqp (and indeed minimize) are not practical for such a problem, but you can still use scipy. Use a linear program instead, where all variables are binary assignments. Your objective is linear since it's just a dot product.
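A brief illustration of why the original approach fails (an added sketch, not from the original answer): c2 counts non-zero entries, so it is piecewise constant, and the finite-difference gradient that SLSQP builds from it is zero almost everywhere, leaving the solver with no usable signal. The linear-program reformulation below avoids this entirely.

import numpy as np

def c2(x):
    # cardinality constraint from the question: >= 0 only when at most 5 entries are non-zero
    return -(len(x[x != 0]) - 5)

x = np.full(200, 10.0)               # the solution SLSQP returns
step = np.sqrt(np.finfo(float).eps)  # a finite-difference step of the kind SLSQP uses
bumped = x.copy()
bumped[0] += step
print(c2(x), c2(bumped))             # both -195: the constraint looks locally flat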
import numpy as np
from numpy.random import default_rng
from scipy.optimize import linprog
rand = default_rng(seed=0)
pred = rand.random(200)
# We disregard the factor of 10. The maximum solution implies
# that all selected variables will be multiplied by 10.
result = linprog(
    c=-pred,        # maximize dot product of pred with x
    bounds=(0, 1),  # all selection variables binary
    integrality=np.ones_like(pred, dtype=bool),
    # Exactly 5 of the variables need to be used
    A_eq=np.ones((1, len(pred))), b_eq=[5],
)
print(result.message)
assert result.success
idx, = result.x.nonzero()
print('These indices were used:', idx)
print('These values were used:', pred[idx])
Optimization terminated successfully. (HiGHS Status 7: Optimal)
These indices were used: [ 26 77 94 171 194]
These values were used: [0.99720994 0.99509651 0.98119504 0.99491735 0.98194269]
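To connect this back to the question's formulation (a quick sanity check added here, not part of the original answer, reusing pred and idx from the snippet above): since pred is non-negative, every selected variable sits at its upper bound of 10.

x_full = np.zeros_like(pred)  # illustrative name, not from the original answer
x_full[idx] = 10              # selected variables at the upper bound of 10
print('Objective value in the original formulation:', np.sum(-x_full * pred))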
But really, since pred is non-negative, this is much simpler:
import numpy as np
from numpy.random import default_rng
rand = default_rng(seed=0)
pred = rand.random(200)
print('Use these values:', np.sort(pred)[-5:])
Use these values: [0.98119504 0.98194269 0.99491735 0.99509651 0.99720994]
The results are the same.
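If the indices are wanted as well (to compare with the linprog output above), a small variation of the same shortcut, sketched here as an addition rather than taken from the original answer, uses np.argsort:

import numpy as np
from numpy.random import default_rng

rand = default_rng(seed=0)
pred = rand.random(200)

# indices of the five largest entries, sorted to ease comparison with the
# non-zero indices reported by linprog
idx = np.sort(np.argsort(pred)[-5:])
print('Use these indices:', idx)
print('Use these values:', pred[idx])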