英文:
Two matrix multiplication operations in CVXPY gives UNKNOWN curvature and makes the problem not DCP
问题
I want to solve the following optimization problem using cvxpy:
最大化(y @ A @ x)
其中A是一个大小为(n, m)的正值元素矩阵;
x是长度为m的布尔向量;
y是长度为n的布尔向量。
约束条件为:
sum(x)==1
和 sum(y)==1
例如,如果 A = [[ 87, 96, 127, 46], [155, 166, 92, 11], [111, 163, 126, 112]]
则解为:
x = [0,1,0,0]
y = [0,1,0]
导致:y @ A @ x = 166
,这是在约束条件下的最大值。
然而,cvxpy报错并显示:
DCPError: 问题不符合DCP规则。具体而言:
目标函数不是DCP。以下子表达式也不是:
var27 @ [[ 87. 96. 127. 46.]
[155. 166. 92. 11.]
[111. 163. 126. 112.]] @ var26
表达式y @ A @ x 被评估为Expression(UNKNOWN, NONNEGATIVE, ()),表明曲率未知,尽管A @ x和y @ A都是AFFINE和NONNEGATIVE。
对于这个看似简单的问题,使用cvxpy要如何解决呢?
我的代码是:
import cvxpy as cp
import numpy as np
rows = 3
cols = 4
A = np.random.randint(0,255,size=(rows,cols))
x = cp.Variable(cols, boolean=True)
y = cp.Variable(rows, boolean=True)
objective = cp.Maximize(y @ A @ x)
constraints = [cp.sum(x)==1,cp.sum(y)==1]
prob = cp.Problem(objective,constraints)
prob.solve()
我尝试对randint生成的矩阵添加属性,如下所示:
a = np.random.randint(0,255,size=(rows,cols))
A = cp.Variable((rows,cols),nonneg=True)
A.value = A.project(a)
我尝试将它变成正定矩阵,尝试设置pos=True。
我还尝试设置prob.solve(qcp=True)
。
但所有这些仍然导致上述关于DCP的错误消息。
英文:
I want to solve the following optimization problem using cvxpy:
Maximize(y @ A @ x)
where A is a (n,m)-size matrix of positive valued elements;
x is boolean vector of is length m;
y is boolean vector is length n.
The constraints are:
sum(x)==1
and sum(y)==1
For example, if A = [[ 87, 96, 127, 46], [155, 166, 92, 11], [111, 163, 126, 112]]
then the solution is:
x = [0,1,0,0]
y = [0,1,0]
resulting in: y @ A @ x = 166
, which is the largest value given the constraints.
However, cvxpy throws an error and says:
> DCPError: Problem does not follow DCP rules. Specifically:
> The objective is not DCP. Its following subexpressions are not:
> var27 @ [[ 87. 96. 127. 46.]
> [155. 166. 92. 11.]
> [111. 163. 126. 112.]] @ var26
The expression y @ A @ x is evaluated to Expression(UNKNOWN, NONNEGATIVE, ()), suggesting the curvature is unknown even though A @ x and y @ A are both AFFINE and NONNEGATIVE.
What is the workaround for this seemingly simple problem to make it work using cvxpy?
My code is:
import cvxpy as cp
import numpy as np
rows = 3
cols = 4
A = np.random.randint(0,255,size=(rows,cols))
x = cp.Variable(cols, boolean=True)
y = cp.Variable(rows, boolean=True)
objective = cp.Maximize(y @ A @ x)
constraints = [cp.sum(x)==1,cp.sum(y)==1]
prob = cp.Problem(objective,constraints)
prob.solve()
I tried doing the following to give an attribute to the randint matrix:
a = np.random.randint(0,255,size=(rows,cols))
A = cp.Variable((rows,cols),nonneg=True)
A.value = A.project(a)
I tried making it a square matrix with PSD=True. Tried, pos=True.
I also tried setting prob.solve(qcp=True).
All of these still resulted in the above error message about DCP.
答案1
得分: 1
Nonlinear Optimisation
import numpy as np
from scipy.optimize import minimize, LinearConstraint, Bounds
A = np.array([
[ 87, 96, 127, 46],
[155, 166, 92, 11],
[111, 163, 126, 112],
])
n, m = A.shape
def objective(xy: np.ndarray) -> float:
x = xy[:m]
y = xy[-n:]
return -y @ A @ x
Ac = np.zeros((2, m+n))
Ac[0, :m] = 1
Ac[1, -n:] = 1
x0 = np.empty(m + n)
x0[:m] = 1/m
x0[-n:] = 1/n
result = minimize(
fun=objective, x0=x0,
bounds=Bounds(lb=np.zeros_like(x0), ub=np.ones_like(x0)),
constraints=LinearConstraint(A=Ac, lb=(1,1), ub=(1,1)),
)
print(result)
print('x ~', result.x[:m].round(3))
print('y ~', result.x[-n:].round(3))
print('Constraint residual:', Ac.dot(result.x) - 1)
Linear Program
import numpy as np
from scipy.optimize import milp, Bounds, LinearConstraint
A = np.array([
( 87, 96, 127, 46),
(155, 166, 92, 11),
(111, 163, 126, 112),
])
n, m = A.shape
# Objective coefficients: maximize the n*m implied products in A
c = np.full(n + m + n*m, -1)
c[:n+m] = 0
# y, x are integral
integrality = np.zeros(n + m + n*m)
integrality[:n+m] = 1
# y, x are binary; Axy vary between 0 and each A
lb = np.zeros(n + m + n*m)
ub = np.ones(n + m + n*m)
ub[n+m:] = A.ravel()
# constraints: for each Axy, Axy <= A*y 0 <= A*y - Axy
# Axy <= A*x 0 <= A*x - Axy
Acy = np.repeat(np.eye(n), m, axis=0); Acy[Acy.nonzero()] = A.ravel()
Acx = np.tile(np eye(m), (n, 1)); Acx[Acx.nonzero()] = A.ravel()
Auy = np.zeros_like(lb); Auy[:n] = 1
Aux = np.zeros_like(lb); Aux[n: n+m] = 1
Ac = np.vstack((
np.hstack((Acy, np.zeros((m*n, m)), -np.eye(m*n))),
np.hstack((np.zeros((m*n, n)), Acx, -np.eye(m*n))),
Auy, Aux,
))
lbc = np.ones(2*m*n + 2); lbc[:-2] = 0
ubc = np.ones_like(lbc); ubc[:-2] = np.inf
result = milp(
c=c, integrality=integrality,
bounds=Bounds(lb=lb, ub=ub),
constraints=LinearConstraint(A=Ac, lb=lbc, ub=ubc),
options={'disp': True},
)
print(result)
print('y =', result.x[:n])
print('x =', result.x[n: n+m])
print('Axy =')
print(result.x[-n*m:].reshape((n, m)))
Direct Solution
import numpy as np
A = np.array([
( 87, 96, 127, 46),
(155, 166, 92, 11),
(111, 163, 126, 112),
])
n, m = A.shape
i = A.argmax()
y = np.zeros(n, dtype=int)
x = np.zeros(m, dtype=int)
y[i // m] = 1
x[i % m] = 1
print(y)
print(x)
英文:
Nonlinear Optimisation
If you decide not to linearise and you're OK carrying the risk of inexact solutions for some classes of A
, then
import numpy as np
from scipy.optimize import minimize, LinearConstraint, Bounds
A = np.array([
[ 87, 96, 127, 46],
[155, 166, 92, 11],
[111, 163, 126, 112],
])
n, m = A.shape
def objective(xy: np.ndarray) -> float:
x = xy[:m]
y = xy[-n:]
return -y @ A @ x
Ac = np.zeros((2, m+n))
Ac[0, :m] = 1
Ac[1, -n:] = 1
x0 = np.empty(m + n)
x0[:m] = 1/m
x0[-n:] = 1/n
result = minimize(
fun=objective, x0=x0,
bounds=Bounds(lb=np.zeros_like(x0), ub=np.ones_like(x0)),
constraints=LinearConstraint(A=Ac, lb=(1,1), ub=(1,1)),
)
print(result)
print('x ~', result.x[:m].round(3))
print('y ~', result.x[-n:].round(3))
print('Constraint residual:', Ac.dot(result.x) - 1)
message: Optimization terminated successfully
success: True
status: 0
fun: -166.0
x: [ 0.000e+00 1.000e+00 0.000e+00 0.000e+00 0.000e+00
1.000e+00 0.000e+00]
nit: 7
jac: [-1.550e+02 -1.660e+02 -9.200e+01 -1.100e+01 -9.600e+01
-1.660e+02 -1.630e+02]
nfev: 48
njev: 6
x ~ [0. 1. 0. 0.]
y ~ [0. 1. 0.]
Constraint residual: [0. 0.]
which matches your demonstrated output.
Linear Program
Framed as a linear program,
import numpy as np
from scipy.optimize import milp, Bounds, LinearConstraint
A = np.array([
( 87, 96, 127, 46),
(155, 166, 92, 11),
(111, 163, 126, 112),
])
n, m = A.shape
# Objective coefficients: maximize the n*m implied products in A
c = np.full(n + m + n*m, -1)
c[:n+m] = 0
# y, x are integral
integrality = np.zeros(n + m + n*m)
integrality[:n+m] = 1
# y, x are binary; Axy vary between 0 and each A
lb = np.zeros(n + m + n*m)
ub = np.ones(n + m + n*m)
ub[n+m:] = A.ravel()
# constraints: for each Axy, Axy <= A*y 0 <= A*y - Axy
# Axy <= A*x 0 <= A*x - Axy
Acy = np.repeat(np.eye(n), m, axis=0); Acy[Acy.nonzero()] = A.ravel()
Acx = np.tile(np.eye(m), (n, 1)); Acx[Acx.nonzero()] = A.ravel()
Auy = np.zeros_like(lb); Auy[:n] = 1
Aux = np.zeros_like(lb); Aux[n: n+m] = 1
Ac = np.vstack((
np.hstack((Acy, np.zeros((m*n, m)), -np.eye(m*n))),
np.hstack((np.zeros((m*n, n)), Acx, -np.eye(m*n))),
Auy, Aux,
))
lbc = np.ones(2*m*n + 2); lbc[:-2] = 0
ubc = np.ones_like(lbc); ubc[:-2] = np.inf
result = milp(
c=c, integrality=integrality,
bounds=Bounds(lb=lb, ub=ub),
constraints=LinearConstraint(A=Ac, lb=lbc, ub=ubc),
options={'disp': True},
)
print(result)
print('y =', result.x[:n])
print('x =', result.x[n: n+m])
print('Axy =')
print(result.x[-n*m:].reshape((n, m)))
Running HiGHS 1.2.0 [date: 2021-07-09, git hash: n/a]
Copyright (c) 2022 ERGO-Code under MIT licence terms
Presolving model
26 rows, 19 cols, 55 nonzeros
24 rows, 17 cols, 58 nonzeros
Objective function is integral with scale 1
Solving MIP model with:
24 rows
17 cols (5 binary, 0 integer, 12 implied int., 0 continuous)
58 nonzeros
Nodes | B&B Tree | Objective Bounds | Dynamic Constraints | Work
Proc. InQueue | Leaves Expl. | BestBound BestSol Gap | Cuts InLp Confl. | LpIters Time
0 0 0 0.00% -1292 inf inf 0 0 0 0 0.0s
R 0 0 0 0.00% -374.3333333 -96 289.93% 0 0 0 17 0.0s
L 0 0 0 0.00% -206.6666667 -163 26.79% 19 4 7 39 0.0s
H 0 0 0 0.00% -186.2666667 -166 12.21% 21 6 17 46 0.0s
40.0% inactive integer columns, restarting
Model after restart has 11 rows, 8 cols (3 bin., 0 int., 5 impl., 0 cont.), and 24 nonzeros
0 0 0 0.00% -186.2666667 -166 12.21% 2 0 0 56 0.0s
Solving report
Status Optimal
Primal bound -166
Dual bound -166
Gap 0% (tolerance: 0.01%)
Solution status feasible
-166 (objective)
0 (bound viol.)
0 (int. viol.)
0 (row viol.)
Timing 0.03 (total)
0.00 (presolve)
0.00 (postsolve)
Nodes 1
LP iterations 62 (total)
0 (strong br.)
29 (separation)
10 (heuristics)
message: Optimization terminated successfully. (HiGHS Status 7: Optimal)
success: True
status: 0
fun: -165.99999999999997
x: [ 0.000e+00 1.000e+00 ... 0.000e+00 -0.000e+00]
mip_node_count: 1
mip_dual_bound: -165.99999999999997
mip_gap: 0.0
y = [ 0. 1. -0.]
x = [ 0. 1. -0. 0.]
Axy =
[[ 0.00000000e+00 -4.73695157e-15 0.00000000e+00 -0.00000000e+00]
[-0.00000000e+00 1.66000000e+02 0.00000000e+00 -0.00000000e+00]
[-0.00000000e+00 -0.00000000e+00 0.00000000e+00 -0.00000000e+00]]
Direct Solution
I don't think any of the above is necessary. Since there will only be one nonzero in y and x, then
import numpy as np
A = np.array([
( 87, 96, 127, 46),
(155, 166, 92, 11),
(111, 163, 126, 112),
])
n, m = A.shape
i = A.argmax()
y = np.zeros(n, dtype=int)
x = np.zeros(m, dtype=int)
y[i // m] = 1
x[i % m] = 1
print(y)
print(x)
[0 1 0]
[0 1 0 0]
答案2
得分: 0
I can linearize the problem by solving for a single variable w that encodes the values of x and y. Since the constraint for both x and y is that there is only 1 non-zero element and y @ A @ x
only chooses 1 element from matrix A - then we can solve for A_flat @ w
, where A_flat
is the flattened matrix and the constraint for w is that there is only one non-zero element. In this way the problem can be solved using cvxpy. Solving for the example in the question:
A = np.asarray([[87, 96, 127, 46], [155, 166, 92, 11], [111, 163, 126, 112]])
Af = cp.vec(A) #Flattened into vector array
w = cp.Variable(A.size, boolean=True) #encodes x and y boolean array values
objective = cp.Maximize(Af @ w) #Equivalent to y @ A @ x
constraints = [cp.sum(w)==1] #constraint boolean array to only have only 1 non-zero value
prob = cp.Problem(objective,constraints)
prob.solve() #solve using cvxpy
xy = np.reshape(w.value,(A.shape[0],A.shape[1]),order="F")
index = np.argwhere(xy==1)
x = xy[index[0][0],:] #solution for the x boolean array
y = xy[:,index[0][1]] #solution for the y boolean array
Then the result is:
A @ w = 166.0
y @ A @ x = 166.0
The new formulation is equal to the old one so the problem is solved.
Edit: Shorter method
Similar to above, but we can reduce two lines of code by having variable w the same shape as matrix A and changing the form of the objective function to cp.sum(cp.multiply(A,w))
:
w = cp.Variable(A.shape, boolean=True)
objective = cp.Maximize(cp.sum(cp.multiply(A,w)))
constraints = [cp.sum(w)==1] #constraint boolean array to only have only 1 non-zero value
prob = cp.Problem(objective,constraints)
prob.solve() #solve using cvxpy
index = np.argwhere(w.value==1)
x = w.value[index[0][0],:] #solution for the x boolean array
y = w.value[:,index[0][1]] #solution for the y boolean array
英文:
I can linearize the problem by solving for a single variable w that encodes the values of x and y. Since the constraint for both x and y is that there is only 1 non-zero element and y @ A @ x
only chooses 1 element from matrix A - then we can solve for A_flat @ w
, where A_flat
is the flattened matrix and the constraint for w is that there is only one non-zero element. In this way the problem can be solved using cvxpy. Solving for the example in the question:
A = np.asarray([[87, 96, 127, 46], [155, 166, 92, 11], [111, 163, 126, 112]])
Af = cp.vec(A) #Flattened into vector array
w = cp.Variable(A.size, boolean=True) #encodes x and y boolean array values
objective = cp.Maximize(Af @ w) #Equivalent to y @ A @ x
constraints = [cp.sum(w)==1] #constraint boolean array to only have only 1 non-zero value
prob = cp.Problem(objective,constraints)
prob.solve() #solve using cvxpy
xy = np.reshape(w.value,(A.shape[0],A.shape[1]),order="F")
index = np.argwhere(xy==1)
x = xy[index[0][0],:] #solution for the x boolean array
y = xy[:,index[0][1]] #solution for the y boolean array
Then the result is:
A @ w = 166.0
y @ A @ x = 166.0
The new formulation is equal to the old one so the problem is solved.
Edit: Shorter method
Similar to above, but we can reduce two lines of code by having variable w the same shape as matrix A and changing the form of the objective function to cp.sum(cp.multiply(A,w))
:
w = cp.Variable(A.shape, boolean=True)
objective = cp.Maximize(cp.sum(cp.multiply(A,w)))
constraints = [cp.sum(w)==1] #constraint boolean array to only have only 1 non-zero value
prob = cp.Problem(objective,constraints)
prob.solve() #solve using cvxpy
index = np.argwhere(w.value==1)
x = w.value[index[0][0],:] #solution for the x boolean array
y = w.value[:,index[0][1]] #solution for the y boolean array
通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库,让每个人都能够通过互相帮助和分享经验来进步。
评论