在CVXPY中进行的两次矩阵乘法操作会导致未知的曲率,并使问题不符合DCP。

huangapple go评论68阅读模式
英文:

Two matrix multiplication operations in CVXPY gives UNKNOWN curvature and makes the problem not DCP

问题

I want to solve the following optimization problem using cvxpy:

最大化(y @ A @ x)

其中A是一个大小为(n, m)的正值元素矩阵;
x是长度为m的布尔向量;
y是长度为n的布尔向量。

约束条件为:
sum(x)==1sum(y)==1

例如,如果 A = [[ 87, 96, 127, 46], [155, 166, 92, 11], [111, 163, 126, 112]]
则解为:
x = [0,1,0,0]
y = [0,1,0]
导致:y @ A @ x = 166,这是在约束条件下的最大值。

然而,cvxpy报错并显示:

DCPError: 问题不符合DCP规则。具体而言:
目标函数不是DCP。以下子表达式也不是:
var27 @ [[ 87. 96. 127. 46.]
[155. 166. 92. 11.]
[111. 163. 126. 112.]] @ var26

表达式y @ A @ x 被评估为Expression(UNKNOWN, NONNEGATIVE, ()),表明曲率未知,尽管A @ x和y @ A都是AFFINE和NONNEGATIVE。

对于这个看似简单的问题,使用cvxpy要如何解决呢?

我的代码是:

import cvxpy as cp
import numpy as np

rows = 3
cols = 4

A = np.random.randint(0,255,size=(rows,cols))
x =  cp.Variable(cols, boolean=True)
y =  cp.Variable(rows, boolean=True)

objective = cp.Maximize(y @ A @ x)
constraints = [cp.sum(x)==1,cp.sum(y)==1]
prob = cp.Problem(objective,constraints)

prob.solve()

我尝试对randint生成的矩阵添加属性,如下所示:

a = np.random.randint(0,255,size=(rows,cols))
A = cp.Variable((rows,cols),nonneg=True)
A.value = A.project(a)

我尝试将它变成正定矩阵,尝试设置pos=True。
我还尝试设置prob.solve(qcp=True)
但所有这些仍然导致上述关于DCP的错误消息。

英文:

I want to solve the following optimization problem using cvxpy:

Maximize(y @ A @ x)

where A is a (n,m)-size matrix of positive valued elements;
x is boolean vector of is length m;
y is boolean vector is length n.

The constraints are:
sum(x)==1 and sum(y)==1

For example, if A = [[ 87, 96, 127, 46], [155, 166, 92, 11], [111, 163, 126, 112]]
then the solution is:
x = [0,1,0,0]
y = [0,1,0]
resulting in: y @ A @ x = 166, which is the largest value given the constraints.

However, cvxpy throws an error and says:
> DCPError: Problem does not follow DCP rules. Specifically:
> The objective is not DCP. Its following subexpressions are not:
> var27 @ [[ 87. 96. 127. 46.]
> [155. 166. 92. 11.]
> [111. 163. 126. 112.]] @ var26

The expression y @ A @ x is evaluated to Expression(UNKNOWN, NONNEGATIVE, ()), suggesting the curvature is unknown even though A @ x and y @ A are both AFFINE and NONNEGATIVE.

What is the workaround for this seemingly simple problem to make it work using cvxpy?

My code is:

import cvxpy as cp
import numpy as np

rows = 3
cols = 4

A = np.random.randint(0,255,size=(rows,cols))
x =  cp.Variable(cols, boolean=True)
y =  cp.Variable(rows, boolean=True)

objective = cp.Maximize(y @ A @ x)
constraints = [cp.sum(x)==1,cp.sum(y)==1]
prob = cp.Problem(objective,constraints)

prob.solve()

I tried doing the following to give an attribute to the randint matrix:

a = np.random.randint(0,255,size=(rows,cols))
A = cp.Variable((rows,cols),nonneg=True)
A.value = A.project(a)

I tried making it a square matrix with PSD=True. Tried, pos=True.
I also tried setting prob.solve(qcp=True).
All of these still resulted in the above error message about DCP.

答案1

得分: 1

Nonlinear Optimisation

import numpy as np
from scipy.optimize import minimize, LinearConstraint, Bounds

A = np.array([
    [ 87,  96, 127,  46],
    [155, 166,  92,  11],
    [111, 163, 126, 112],
])
n, m = A.shape


def objective(xy: np.ndarray) -> float:
    x = xy[:m]
    y = xy[-n:]
    return -y @ A @ x


Ac = np.zeros((2, m+n))
Ac[0, :m] = 1
Ac[1, -n:] = 1

x0 = np.empty(m + n)
x0[:m] = 1/m
x0[-n:] = 1/n

result = minimize(
    fun=objective, x0=x0,
    bounds=Bounds(lb=np.zeros_like(x0), ub=np.ones_like(x0)),
    constraints=LinearConstraint(A=Ac, lb=(1,1), ub=(1,1)),
)
print(result)
print('x ~', result.x[:m].round(3))
print('y ~', result.x[-n:].round(3))
print('Constraint residual:', Ac.dot(result.x) - 1)

Linear Program

import numpy as np
from scipy.optimize import milp, Bounds, LinearConstraint

A = np.array([
    ( 87,  96, 127,  46),
    (155, 166,  92,  11),
    (111, 163, 126, 112),
])
n, m = A.shape

# Objective coefficients: maximize the n*m implied products in A
c = np.full(n + m + n*m, -1)
c[:n+m] = 0

# y, x are integral
integrality = np.zeros(n + m + n*m)
integrality[:n+m] = 1

# y, x are binary; Axy vary between 0 and each A
lb = np.zeros(n + m + n*m)
ub = np.ones(n + m + n*m)
ub[n+m:] = A.ravel()

# constraints: for each Axy,  Axy <= A*y    0 <= A*y - Axy
#                             Axy <= A*x    0 <= A*x - Axy
Acy = np.repeat(np.eye(n), m, axis=0); Acy[Acy.nonzero()] = A.ravel()
Acx = np.tile(np eye(m), (n, 1));      Acx[Acx.nonzero()] = A.ravel()
Auy = np.zeros_like(lb);               Auy[:n] = 1
Aux = np.zeros_like(lb);               Aux[n: n+m] = 1

Ac = np.vstack((
    np.hstack((Acy, np.zeros((m*n, m)), -np.eye(m*n))),
    np.hstack((np.zeros((m*n, n)), Acx, -np.eye(m*n))),
    Auy, Aux,
))
lbc = np.ones(2*m*n + 2); lbc[:-2] = 0
ubc = np.ones_like(lbc);  ubc[:-2] = np.inf

result = milp(
    c=c, integrality=integrality,
    bounds=Bounds(lb=lb, ub=ub),
    constraints=LinearConstraint(A=Ac, lb=lbc, ub=ubc),
    options={'disp': True},
)
print(result)
print('y =', result.x[:n])
print('x =', result.x[n: n+m])
print('Axy =')
print(result.x[-n*m:].reshape((n, m)))

Direct Solution

import numpy as np

A = np.array([
    ( 87,  96, 127,  46),
    (155, 166,  92,  11),
    (111, 163, 126, 112),
])
n, m = A.shape
i = A.argmax()
y = np.zeros(n, dtype=int)
x = np.zeros(m, dtype=int)
y[i // m] = 1
x[i % m] = 1

print(y)
print(x)
英文:

Nonlinear Optimisation

If you decide not to linearise and you're OK carrying the risk of inexact solutions for some classes of A, then

import numpy as np
from scipy.optimize import minimize, LinearConstraint, Bounds

A = np.array([
    [ 87,  96, 127,  46],
    [155, 166,  92,  11],
    [111, 163, 126, 112],
])
n, m = A.shape


def objective(xy: np.ndarray) -&gt; float:
    x = xy[:m]
    y = xy[-n:]
    return -y @ A @ x


Ac = np.zeros((2, m+n))
Ac[0, :m] = 1
Ac[1, -n:] = 1

x0 = np.empty(m + n)
x0[:m] = 1/m
x0[-n:] = 1/n

result = minimize(
    fun=objective, x0=x0,
    bounds=Bounds(lb=np.zeros_like(x0), ub=np.ones_like(x0)),
    constraints=LinearConstraint(A=Ac, lb=(1,1), ub=(1,1)),
)
print(result)
print(&#39;x ~&#39;, result.x[:m].round(3))
print(&#39;y ~&#39;, result.x[-n:].round(3))
print(&#39;Constraint residual:&#39;, Ac.dot(result.x) - 1)
 message: Optimization terminated successfully
success: True
status: 0
fun: -166.0
x: [ 0.000e+00  1.000e+00  0.000e+00  0.000e+00  0.000e+00
1.000e+00  0.000e+00]
nit: 7
jac: [-1.550e+02 -1.660e+02 -9.200e+01 -1.100e+01 -9.600e+01
-1.660e+02 -1.630e+02]
nfev: 48
njev: 6
x ~ [0. 1. 0. 0.]
y ~ [0. 1. 0.]
Constraint residual: [0. 0.]

which matches your demonstrated output.

Linear Program

Framed as a linear program,

import numpy as np
from scipy.optimize import milp, Bounds, LinearConstraint

A = np.array([
    ( 87,  96, 127,  46),
    (155, 166,  92,  11),
    (111, 163, 126, 112),
])
n, m = A.shape

# Objective coefficients: maximize the n*m implied products in A
c = np.full(n + m + n*m, -1)
c[:n+m] = 0

# y, x are integral
integrality = np.zeros(n + m + n*m)
integrality[:n+m] = 1

# y, x are binary; Axy vary between 0 and each A
lb = np.zeros(n + m + n*m)
ub = np.ones(n + m + n*m)
ub[n+m:] = A.ravel()

# constraints: for each Axy,  Axy &lt;= A*y    0 &lt;= A*y - Axy
#                             Axy &lt;= A*x    0 &lt;= A*x - Axy
Acy = np.repeat(np.eye(n), m, axis=0); Acy[Acy.nonzero()] = A.ravel()
Acx = np.tile(np.eye(m), (n, 1));      Acx[Acx.nonzero()] = A.ravel()
Auy = np.zeros_like(lb);               Auy[:n] = 1
Aux = np.zeros_like(lb);               Aux[n: n+m] = 1

Ac = np.vstack((
    np.hstack((Acy, np.zeros((m*n, m)), -np.eye(m*n))),
    np.hstack((np.zeros((m*n, n)), Acx, -np.eye(m*n))),
    Auy, Aux,
))
lbc = np.ones(2*m*n + 2); lbc[:-2] = 0
ubc = np.ones_like(lbc);  ubc[:-2] = np.inf

result = milp(
    c=c, integrality=integrality,
    bounds=Bounds(lb=lb, ub=ub),
    constraints=LinearConstraint(A=Ac, lb=lbc, ub=ubc),
    options={&#39;disp&#39;: True},
)
print(result)
print(&#39;y =&#39;, result.x[:n])
print(&#39;x =&#39;, result.x[n: n+m])
print(&#39;Axy =&#39;)
print(result.x[-n*m:].reshape((n, m)))
Running HiGHS 1.2.0 [date: 2021-07-09, git hash: n/a]
Copyright (c) 2022 ERGO-Code under MIT licence terms
Presolving model
26 rows, 19 cols, 55 nonzeros
24 rows, 17 cols, 58 nonzeros
Objective function is integral with scale 1
Solving MIP model with:
24 rows
17 cols (5 binary, 0 integer, 12 implied int., 0 continuous)
58 nonzeros
Nodes      |    B&amp;B Tree     |            Objective Bounds              |  Dynamic Constraints |       Work      
Proc. InQueue |  Leaves   Expl. | BestBound       BestSol              Gap |   Cuts   InLp Confl. | LpIters     Time
0       0         0   0.00%   -1292           inf                  inf        0      0      0         0     0.0s
R       0       0         0   0.00%   -374.3333333    -96              289.93%        0      0      0        17     0.0s
L       0       0         0   0.00%   -206.6666667    -163              26.79%       19      4      7        39     0.0s
H       0       0         0   0.00%   -186.2666667    -166              12.21%       21      6     17        46     0.0s
40.0% inactive integer columns, restarting
Model after restart has 11 rows, 8 cols (3 bin., 0 int., 5 impl., 0 cont.), and 24 nonzeros
0       0         0   0.00%   -186.2666667    -166              12.21%        2      0      0        56     0.0s
Solving report
Status            Optimal
Primal bound      -166
Dual bound        -166
Gap               0% (tolerance: 0.01%)
Solution status   feasible
-166 (objective)
0 (bound viol.)
0 (int. viol.)
0 (row viol.)
Timing            0.03 (total)
0.00 (presolve)
0.00 (postsolve)
Nodes             1
LP iterations     62 (total)
0 (strong br.)
29 (separation)
10 (heuristics)
message: Optimization terminated successfully. (HiGHS Status 7: Optimal)
success: True
status: 0
fun: -165.99999999999997
x: [ 0.000e+00  1.000e+00 ...  0.000e+00 -0.000e+00]
mip_node_count: 1
mip_dual_bound: -165.99999999999997
mip_gap: 0.0
y = [ 0.  1. -0.]
x = [ 0.  1. -0.  0.]
Axy =
[[ 0.00000000e+00 -4.73695157e-15  0.00000000e+00 -0.00000000e+00]
[-0.00000000e+00  1.66000000e+02  0.00000000e+00 -0.00000000e+00]
[-0.00000000e+00 -0.00000000e+00  0.00000000e+00 -0.00000000e+00]]

Direct Solution

I don't think any of the above is necessary. Since there will only be one nonzero in y and x, then

import numpy as np

A = np.array([
    ( 87,  96, 127,  46),
    (155, 166,  92,  11),
    (111, 163, 126, 112),
])
n, m = A.shape
i = A.argmax()
y = np.zeros(n, dtype=int)
x = np.zeros(m, dtype=int)
y[i // m] = 1
x[i % m] = 1

print(y)
print(x)
[0 1 0]
[0 1 0 0]

答案2

得分: 0

I can linearize the problem by solving for a single variable w that encodes the values of x and y. Since the constraint for both x and y is that there is only 1 non-zero element and y @ A @ x only chooses 1 element from matrix A - then we can solve for A_flat @ w, where A_flat is the flattened matrix and the constraint for w is that there is only one non-zero element. In this way the problem can be solved using cvxpy. Solving for the example in the question:

A = np.asarray([[87,  96, 127,  46], [155, 166,  92,  11], [111, 163, 126, 112]])
Af = cp.vec(A) #Flattened into vector array
w = cp.Variable(A.size, boolean=True) #encodes x and y boolean array values
objective = cp.Maximize(Af @ w) #Equivalent to y @ A @ x
constraints = [cp.sum(w)==1] #constraint boolean array to only have only 1 non-zero value
prob = cp.Problem(objective,constraints)
prob.solve() #solve using cvxpy
xy = np.reshape(w.value,(A.shape[0],A.shape[1]),order="F")
index = np.argwhere(xy==1)
x = xy[index[0][0],:] #solution for the x boolean array
y = xy[:,index[0][1]] #solution for the y boolean array

Then the result is:

A @ w = 166.0
y @ A @ x = 166.0

The new formulation is equal to the old one so the problem is solved.

Edit: Shorter method

Similar to above, but we can reduce two lines of code by having variable w the same shape as matrix A and changing the form of the objective function to cp.sum(cp.multiply(A,w)):

w = cp.Variable(A.shape, boolean=True)
objective = cp.Maximize(cp.sum(cp.multiply(A,w)))
constraints = [cp.sum(w)==1] #constraint boolean array to only have only 1 non-zero value
prob = cp.Problem(objective,constraints)
prob.solve() #solve using cvxpy
index = np.argwhere(w.value==1)
x = w.value[index[0][0],:] #solution for the x boolean array
y = w.value[:,index[0][1]] #solution for the y boolean array
英文:

I can linearize the problem by solving for a single variable w that encodes the values of x and y. Since the constraint for both x and y is that there is only 1 non-zero element and y @ A @ x only chooses 1 element from matrix A - then we can solve for A_flat @ w, where A_flat is the flattened matrix and the constraint for w is that there is only one non-zero element. In this way the problem can be solved using cvxpy. Solving for the example in the question:

A = np.asarray([[87,  96, 127,  46], [155, 166,  92,  11], [111, 163, 126, 112]])
Af = cp.vec(A) #Flattened into vector array
w = cp.Variable(A.size, boolean=True) #encodes x and y boolean array values
objective = cp.Maximize(Af @ w) #Equivalent to y @ A @ x
constraints = [cp.sum(w)==1] #constraint boolean array to only have only 1 non-zero value
prob = cp.Problem(objective,constraints)
prob.solve() #solve using cvxpy
xy = np.reshape(w.value,(A.shape[0],A.shape[1]),order=&quot;F&quot;)
index = np.argwhere(xy==1)
x = xy[index[0][0],:] #solution for the x boolean array
y = xy[:,index[0][1]] #solution for the y boolean array

Then the result is:

A @ w = 166.0
y @ A @ x = 166.0

The new formulation is equal to the old one so the problem is solved.

Edit: Shorter method

Similar to above, but we can reduce two lines of code by having variable w the same shape as matrix A and changing the form of the objective function to cp.sum(cp.multiply(A,w)):

w = cp.Variable(A.shape, boolean=True)
objective = cp.Maximize(cp.sum(cp.multiply(A,w)))
constraints = [cp.sum(w)==1] #constraint boolean array to only have only 1 non-zero value
prob = cp.Problem(objective,constraints)
prob.solve() #solve using cvxpy
index = np.argwhere(w.value==1)
x = w.value[index[0][0],:] #solution for the x boolean array
y = w.value[:,index[0][1]] #solution for the y boolean array

huangapple
  • 本文由 发表于 2023年4月6日 23:02:52
  • 转载请务必保留本文链接:https://go.coder-hub.com/75951009.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定