英文:
CVXPY claims matrix is not symmetric and PSD despite my matrix being symmetric and PSD
问题
您的问题似乎与CVXPY和线性规划有关。以下是您提到的问题的主要问题:
-
矩阵不对称或半正定问题:在您的第一个代码块中,您使用CVXPY来解决二次规划问题。这需要输入的矩阵P是对称半正定的(PSD)。如果P不满足这些条件,CVXPY可能无法正确求解问题。
-
解决方案尝试:在第一个代码块中,您将问题定义为二次规划问题,并将其传递给CVXPY求解器。但是,您提到即使在圆锥部分为零的情况下也会得到“inf”的优化目标值。这可能是因为问题的某些设置或输入不正确。
-
线性规划问题的成功:您还提到,使用线性规划解决相似的问题可以获得最佳解。这表明问题可能与CVXPY的设置有关,或者可能与问题的形式有关。
要解决这些问题,您可以尝试以下步骤:
-
确保输入矩阵P是对称半正定的。您可以使用NumPy的
np.linalg.eigvalsh
函数来检查矩阵是否具有正特征值。如果不是半正定的,您可能需要更正输入数据或问题设置。 -
仔细检查CVXPY问题设置,确保约束和目标函数正确定义。您可以逐步构建问题,并在每个步骤中检查问题是否仍然有效。
-
在尝试解决二次规划问题之前,确保您的输入数据和参数都被正确传递到函数中。确保矩阵尺寸和数据类型都正确。
-
考虑调整优化问题的求解器和参数。有时,更改求解器或调整一些参数可以改善问题的收敛性。
-
如果可能,与CVXPY的文档和示例一起使用,以确保您的问题设置符合CVXPY的期望。
请根据上述建议逐步检查和调整您的代码,以解决问题。如果问题仍然存在,您可能需要进一步检查和调试CVXPY设置以找出问题的根本原因。
英文:
I have a quadratic program (see below) that causes an error due to my quad_form matrix being not symmetric or PSD. Why is this happening? (Note that the issue is with my matrix P below so feel free to ignore the first block of code in the function.)
def run_QP(n, p, L, Y, X, sigma, B_prop):
# Configuring our inputs to suit LP
Y_LP = Y.flatten('F')
X_LP = np.zeros((n*L,p*L))
for l in range(L):
X_LP[n*l:n*(l+1),p*l:p*(l+1)] = X
C_LP = np.eye(p)
for l in range(L-1):
C_LP = np.concatenate((C_LP, np.eye(p)), axis=1)
one_p = np.ones(p)
# Setting the objective matrix and vector
constant = 1/(p*cp.power(sigma,2))
P = constant * (X_LP.T @ X_LP)
I_pL = np.eye(p*L)
temp_term = -1*cp.log(B_prop[0]) * (one_p.T @ I_pL[0:p,:])
for l in range(1,L):
I_pL_trun = I_pL[p*l:p*(l+1),:]
temp_term -= cp.log(B_prop[l]) * (one_p.T @ I_pL_trun)
q = -1*constant*(Y_LP.T@X_LP) + temp_term
# Setting the equality constraints matrix
A = np.concatenate((X_LP, C_LP), axis=0)
# Setting the equality constraints vector
b = np.concatenate((Y_LP, one_p), axis=0)
# Define and solve the CVXPY problem.
x = cp.Variable(p*L)
prob = cp.Problem(cp.Minimize((1/2)*cp.quad_form(x, P) + q.T @ x),
[A @ x == b,])
prob.solve()
# Reconfiguring our outputs to suit pooled data
B_QP_est = prob.value
B_est = np.zeros((p,L))
for l in range(L):
B_col = B_QP_est[p*l:p*(l+1)]
B_est[:,l] = B_col
return B_est
B_bar_prob = [0.5,0.5]
L = 2
alpha = 0.5
p = 100
n = 20
sigma = 0.1
indices = np.random.choice(np.array([0, 1]), size=p, p=B_bar_prob)
B = np.eye(np.max(indices)+1)[indices]
X = binomial(1, alpha, (n, p))
Psi = normal(0, sigma, (n, L))
Y = np.dot(X, B) + Psi
B_prop = np.sum(B, axis=0) / n
B_hat = run_QP(n, p, L, Y, X, sigma, B_prop)
When I check for PSD using the following code (after swapping my cvxpy atomic functions out for numpy functions, I get true):
def isPSD(A, tol=1e-8):
E = np.linalg.eigvalsh(A)
return np.all(E > -tol)
Hence, I am very confused as to what is going on...
Edit: Thanks to the answer below from Michal Adamaszek, I have modified my program to work.
def run_QP(n, p, L, Y, X, sigma, B_prop):
# Configuring our inputs to suit LP
Y_LP = Y.flatten('F')
X_LP = np.zeros((n*L,p*L))
for l in range(L):
X_LP[n*l:n*(l+1),p*l:p*(l+1)] = X
C_LP = np.eye(p)
for l in range(L-1):
C_LP = np.concatenate((C_LP, np.eye(p)), axis=1)
one_p = np.ones(p)
# Setting the objective matrix and vector
constant = 1/(p*(sigma**2))
temp_term = np.zeros(p*L)
for l in range(L):
I_pL = np.eye(p*L)
I_pL_trun = I_pL[p*l:p*(l+1),:]
temp_term -= np.log(B_prop[l]) * np.dot(one_p.T,I_pL_trun)
q = -1*constant*(Y_LP.T @ X_LP) + temp_term
# Setting the equality constraints matrix
A = np.concatenate((X_LP, C_LP), axis=0)
# Setting the equality constraints vector
b = np.concatenate((Y_LP, one_p), axis=0)
# Define and solve the CVXPY problem.
x = cp.Variable(p*L)
objective = cp.Minimize((1/2)*constant*cp.sum_squares(X_LP @ x) + (q.T @ x))
constraints = []
constraints.append(A @ x == b)
problem = cp.Problem(objective, constraints)
problem.solve()
print('optimal obj value:', problem.value)
# Reconfiguring our outputs to suit pooled data
B_QP_est = x.value
B_est = np.zeros((p,L))
for l in range(L):
B_col = B_QP_est[p*l:p*(l+1)]
B_est[:,l] = B_col
return B_est
However, this always gives inf
for the objective function, even when the conic part is zero. See the following output:
===============================================================================
CVXPY
v1.3.1
===============================================================================
(CVXPY) Jun 05 07:13:28 AM: Your problem has 200 variables, 1 constraints, and 0 parameters.
(CVXPY) Jun 05 07:13:28 AM: It is compliant with the following grammars: DCP, DQCP
(CVXPY) Jun 05 07:13:28 AM: (If you need to solve this problem multiple times, but with different data, consider using parameters.)
(CVXPY) Jun 05 07:13:28 AM: CVXPY will first compile your problem; then, it will invoke a numerical solver to obtain a solution.
-------------------------------------------------------------------------------
Compilation
-------------------------------------------------------------------------------
(CVXPY) Jun 05 07:13:28 AM: Compiling problem (target solver=OSQP).
(CVXPY) Jun 05 07:13:28 AM: Reduction chain: CvxAttr2Constr -> Qp2SymbolicQp -> QpMatrixStuffing -> OSQP
(CVXPY) Jun 05 07:13:28 AM: Applying reduction CvxAttr2Constr
(CVXPY) Jun 05 07:13:28 AM: Applying reduction Qp2SymbolicQp
(CVXPY) Jun 05 07:13:28 AM: Applying reduction QpMatrixStuffing
(CVXPY) Jun 05 07:13:28 AM: Applying reduction OSQP
(CVXPY) Jun 05 07:13:28 AM: Finished problem compilation (took 8.957e-02 seconds).
-------------------------------------------------------------------------------
Numerical solver
-------------------------------------------------------------------------------
(CVXPY) Jun 05 07:13:28 AM: Invoking solver OSQP to obtain a solution.
-----------------------------------------------------------------
OSQP v0.6.2 - Operator Splitting QP Solver
(c) Bartolomeo Stellato, Goran Banjac
University of Oxford - Stanford University 2021
-----------------------------------------------------------------
problem: variables n = 240, constraints m = 180
nnz(P) + nnz(A) = 4156
settings: linear system solver = qdldl,
eps_abs = 1.0e-05, eps_rel = 1.0e-05,
eps_prim_inf = 1.0e-04, eps_dual_inf = 1.0e-04,
rho = 1.00e-01 (adaptive),
sigma = 1.00e-06, alpha = 1.60, max_iter = 10000
check_termination: on (interval 25),
scaling: on, scaled_termination: off
warm start: on, polish: on, time_limit: off
iter objective pri res dua res rho time
1 -9.5106e-01 3.09e+01 1.59e+07 1.00e-01 6.26e-03s
50 1.0000e+30 2.49e-02 2.14e-04 1.00e-01 8.22e-03s
status: primal infeasible
number of iterations: 50
run time: 8.35e-03s
optimal rho estimate: 3.87e+00
So when the conic part is zero, a simple linear program below gets me the optimal solution:
from scipy.optimize import linprog
def run_LP(n, p, L, Y, X, B_prop):
# Configuring our inputs to suit LP
Y_LP = Y.flatten('F')
X_LP = np.zeros((n*L,p*L))
for l in range(L):
X_LP[n*l:n*(l+1),p*l:p*(l+1)] = X
C_LP = np.eye(p)
for l in range(L-1):
C_LP = np.concatenate((C_LP, np.eye(p)), axis=1)
one_p = np.ones(p)
# Setting the objective vector
c = np.zeros(p*L)
for l in range(L):
I_pL = np.eye(p*L)
I_pL_trun = I_pL[p*l:p*(l+1),:]
c -= np.log(B_prop[l]) * np.dot(one_p.T,I_pL_trun)
# Setting the equality constraints matrix
A = np.concatenate((X_LP, C_LP), axis=0)
# Setting the equality constraints vector
b = np.concatenate((Y_LP, one_p), axis=0)
# Solving linear programming problem
res = linprog(c, A_eq=A, b_eq=b)
print("Linear program:", res.message)
# Reconfiguring our outputs to suit pooled data
B_LP_est = res.x
B_est = np.zeros((p,L))
for l in range(L):
B_col = B_LP_est[p*l:p*(l+1)]
B_est[:,l] = B_col
return B_est
Why is this happening?
答案1
得分: 1
CVXPY内部执行了与矩阵分解有关的其他操作,因此如果使用“eigvalsh”进行检查后发现您的情况处于边缘状态,CVXPY中使用的内部方法可能会倾向于另一种方式。
更重要的是,如果您已经将问题制定为锥形问题,那么没有必要将其再制定为二次规划问题。这是标准的SOCP重新制定。具体来说,
(1/2)*cp.quad_form(x, P)
在您的情况下等同于
(1/2) * 常数 * cp.sum_squares(X_LP @ x)
因为
x.T@P@x = 常数 * norm(X_LP@x)^2
这样可以避免先构建“P=X_LP.T@X_LP”,然后CVXPY必须再次进行因子分解的迂回方式。这还可以通过构造方式确保问题是凸问题,因此不会出现凸性检查方面的数值问题。更多信息请参阅https://docs.mosek.com/modeling-cookbook/cqo.html#conic-quadratic-modeling
英文:
Internally CVXPY does something else to factorize the matrix, so if a check with eigvalsh
reveals you are just borderline good, then the internal method used in CVXPY can tip the balance the other way.
More importantly, there is no need to formulate your problem as a QP if you already have it on conic form to start with. This is a standard SOCP reformulation. Namely,
(1/2)*cp.quad_form(x, P)
is in your case equivalent to
(1/2) * constant * cp.sum_squares(X_LP @ x)
because
x.T@P@x = constant * norm(X_LP@x)^2
This way you avoid the roundabout way where you first construct P=X_LP.T@X_LP
and then CVXPY has to work to factorize it back again. You also get a problem convex by construction, so no numerical issues will come in the way of the convexity check. More about this in https://docs.mosek.com/modeling-cookbook/cqo.html#conic-quadratic-modeling
通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库,让每个人都能够通过互相帮助和分享经验来进步。
评论