CVXPY声称矩阵不对称且不是半正定,尽管我的矩阵是对称且半正定。

huangapple go评论70阅读模式
英文:

CVXPY claims matrix is not symmetric and PSD despite my matrix being symmetric and PSD

问题

您的问题似乎与CVXPY和线性规划有关。以下是您提到的问题的主要问题:

  1. 矩阵不对称或半正定问题:在您的第一个代码块中,您使用CVXPY来解决二次规划问题。这需要输入的矩阵P是对称半正定的(PSD)。如果P不满足这些条件,CVXPY可能无法正确求解问题。

  2. 解决方案尝试:在第一个代码块中,您将问题定义为二次规划问题,并将其传递给CVXPY求解器。但是,您提到即使在圆锥部分为零的情况下也会得到“inf”的优化目标值。这可能是因为问题的某些设置或输入不正确。

  3. 线性规划问题的成功:您还提到,使用线性规划解决相似的问题可以获得最佳解。这表明问题可能与CVXPY的设置有关,或者可能与问题的形式有关。

要解决这些问题,您可以尝试以下步骤:

  • 确保输入矩阵P是对称半正定的。您可以使用NumPy的np.linalg.eigvalsh函数来检查矩阵是否具有正特征值。如果不是半正定的,您可能需要更正输入数据或问题设置。

  • 仔细检查CVXPY问题设置,确保约束和目标函数正确定义。您可以逐步构建问题,并在每个步骤中检查问题是否仍然有效。

  • 在尝试解决二次规划问题之前,确保您的输入数据和参数都被正确传递到函数中。确保矩阵尺寸和数据类型都正确。

  • 考虑调整优化问题的求解器和参数。有时,更改求解器或调整一些参数可以改善问题的收敛性。

  • 如果可能,与CVXPY的文档和示例一起使用,以确保您的问题设置符合CVXPY的期望。

请根据上述建议逐步检查和调整您的代码,以解决问题。如果问题仍然存在,您可能需要进一步检查和调试CVXPY设置以找出问题的根本原因。

英文:

I have a quadratic program (see below) that causes an error due to my quad_form matrix being not symmetric or PSD. Why is this happening? (Note that the issue is with my matrix P below so feel free to ignore the first block of code in the function.)

def run_QP(n, p, L, Y, X, sigma, B_prop):
  
  # Configuring our inputs to suit LP
  Y_LP = Y.flatten('F')
  X_LP = np.zeros((n*L,p*L))
  for l in range(L):
    X_LP[n*l:n*(l+1),p*l:p*(l+1)] = X
  C_LP = np.eye(p)
  for l in range(L-1):
    C_LP = np.concatenate((C_LP, np.eye(p)), axis=1)
  one_p = np.ones(p)

  # Setting the objective matrix and vector
  constant = 1/(p*cp.power(sigma,2))
  P = constant * (X_LP.T @ X_LP)
  I_pL = np.eye(p*L)
  temp_term = -1*cp.log(B_prop[0]) * (one_p.T @ I_pL[0:p,:])
  for l in range(1,L):
    I_pL_trun = I_pL[p*l:p*(l+1),:]
    temp_term -= cp.log(B_prop[l]) * (one_p.T @ I_pL_trun)
  q = -1*constant*(Y_LP.T@X_LP) + temp_term
  
  # Setting the equality constraints matrix
  A = np.concatenate((X_LP, C_LP), axis=0)
  # Setting the equality constraints vector
  b = np.concatenate((Y_LP, one_p), axis=0)

  # Define and solve the CVXPY problem.
  x = cp.Variable(p*L)
  prob = cp.Problem(cp.Minimize((1/2)*cp.quad_form(x, P) + q.T @ x),
                    [A @ x == b,])
  prob.solve()

  # Reconfiguring our outputs to suit pooled data
  B_QP_est = prob.value
  B_est = np.zeros((p,L))
  for l in range(L):
    B_col = B_QP_est[p*l:p*(l+1)]
    B_est[:,l] = B_col
  
  return B_est

B_bar_prob = [0.5,0.5]
L = 2
alpha = 0.5
p = 100
n = 20
sigma = 0.1

indices = np.random.choice(np.array([0, 1]), size=p, p=B_bar_prob)
B = np.eye(np.max(indices)+1)[indices]
X = binomial(1, alpha, (n, p))
Psi = normal(0, sigma, (n, L))
Y = np.dot(X, B) + Psi
B_prop = np.sum(B, axis=0) / n

B_hat = run_QP(n, p, L, Y, X, sigma, B_prop)

When I check for PSD using the following code (after swapping my cvxpy atomic functions out for numpy functions, I get true):

def isPSD(A, tol=1e-8):
  E = np.linalg.eigvalsh(A)
  return np.all(E > -tol)

Hence, I am very confused as to what is going on...

Edit: Thanks to the answer below from Michal Adamaszek, I have modified my program to work.

def run_QP(n, p, L, Y, X, sigma, B_prop):
  
  # Configuring our inputs to suit LP
  Y_LP = Y.flatten('F')
  X_LP = np.zeros((n*L,p*L))
  for l in range(L):
    X_LP[n*l:n*(l+1),p*l:p*(l+1)] = X
  C_LP = np.eye(p)
  for l in range(L-1):
    C_LP = np.concatenate((C_LP, np.eye(p)), axis=1)
  one_p = np.ones(p)

  # Setting the objective matrix and vector
  constant = 1/(p*(sigma**2))
  temp_term = np.zeros(p*L)
  for l in range(L):
    I_pL = np.eye(p*L)
    I_pL_trun = I_pL[p*l:p*(l+1),:]
    temp_term -= np.log(B_prop[l]) * np.dot(one_p.T,I_pL_trun)
  q = -1*constant*(Y_LP.T @ X_LP) + temp_term
  
  # Setting the equality constraints matrix
  A = np.concatenate((X_LP, C_LP), axis=0)
  # Setting the equality constraints vector
  b = np.concatenate((Y_LP, one_p), axis=0)

  # Define and solve the CVXPY problem.
  x = cp.Variable(p*L)
  objective = cp.Minimize((1/2)*constant*cp.sum_squares(X_LP @ x) + (q.T @ x))
  constraints = []
  constraints.append(A @ x == b)
  problem = cp.Problem(objective, constraints)
  problem.solve()
  print('optimal obj value:', problem.value)

  # Reconfiguring our outputs to suit pooled data
  B_QP_est = x.value
  B_est = np.zeros((p,L))
  for l in range(L):
    B_col = B_QP_est[p*l:p*(l+1)]
    B_est[:,l] = B_col
  
  return B_est

However, this always gives inf for the objective function, even when the conic part is zero. See the following output:

===============================================================================
                                     CVXPY                                     
                                     v1.3.1                                    
===============================================================================
(CVXPY) Jun 05 07:13:28 AM: Your problem has 200 variables, 1 constraints, and 0 parameters.
(CVXPY) Jun 05 07:13:28 AM: It is compliant with the following grammars: DCP, DQCP
(CVXPY) Jun 05 07:13:28 AM: (If you need to solve this problem multiple times, but with different data, consider using parameters.)
(CVXPY) Jun 05 07:13:28 AM: CVXPY will first compile your problem; then, it will invoke a numerical solver to obtain a solution.
-------------------------------------------------------------------------------
                                  Compilation                                  
-------------------------------------------------------------------------------
(CVXPY) Jun 05 07:13:28 AM: Compiling problem (target solver=OSQP).
(CVXPY) Jun 05 07:13:28 AM: Reduction chain: CvxAttr2Constr -> Qp2SymbolicQp -> QpMatrixStuffing -> OSQP
(CVXPY) Jun 05 07:13:28 AM: Applying reduction CvxAttr2Constr
(CVXPY) Jun 05 07:13:28 AM: Applying reduction Qp2SymbolicQp
(CVXPY) Jun 05 07:13:28 AM: Applying reduction QpMatrixStuffing
(CVXPY) Jun 05 07:13:28 AM: Applying reduction OSQP
(CVXPY) Jun 05 07:13:28 AM: Finished problem compilation (took 8.957e-02 seconds).
-------------------------------------------------------------------------------
                                Numerical solver                               
-------------------------------------------------------------------------------
(CVXPY) Jun 05 07:13:28 AM: Invoking solver OSQP  to obtain a solution.
-----------------------------------------------------------------
           OSQP v0.6.2  -  Operator Splitting QP Solver
              (c) Bartolomeo Stellato,  Goran Banjac
        University of Oxford  -  Stanford University 2021
-----------------------------------------------------------------
problem:  variables n = 240, constraints m = 180
          nnz(P) + nnz(A) = 4156
settings: linear system solver = qdldl,
          eps_abs = 1.0e-05, eps_rel = 1.0e-05,
          eps_prim_inf = 1.0e-04, eps_dual_inf = 1.0e-04,
          rho = 1.00e-01 (adaptive),
          sigma = 1.00e-06, alpha = 1.60, max_iter = 10000
          check_termination: on (interval 25),
          scaling: on, scaled_termination: off
          warm start: on, polish: on, time_limit: off

iter   objective    pri res    dua res    rho        time
   1  -9.5106e-01   3.09e+01   1.59e+07   1.00e-01   6.26e-03s
  50   1.0000e+30   2.49e-02   2.14e-04   1.00e-01   8.22e-03s

status:               primal infeasible
number of iterations: 50
run time:             8.35e-03s
optimal rho estimate: 3.87e+00

So when the conic part is zero, a simple linear program below gets me the optimal solution:

from scipy.optimize import linprog

def run_LP(n, p, L, Y, X, B_prop):
  
  # Configuring our inputs to suit LP
  Y_LP = Y.flatten('F')
  X_LP = np.zeros((n*L,p*L))
  for l in range(L):
    X_LP[n*l:n*(l+1),p*l:p*(l+1)] = X
  C_LP = np.eye(p)
  for l in range(L-1):
    C_LP = np.concatenate((C_LP, np.eye(p)), axis=1)
  one_p = np.ones(p)

  # Setting the objective vector
  c = np.zeros(p*L)
  for l in range(L):
    I_pL = np.eye(p*L)
    I_pL_trun = I_pL[p*l:p*(l+1),:]
    c -= np.log(B_prop[l]) * np.dot(one_p.T,I_pL_trun)
  
  # Setting the equality constraints matrix
  A = np.concatenate((X_LP, C_LP), axis=0)
  # Setting the equality constraints vector
  b = np.concatenate((Y_LP, one_p), axis=0)

  # Solving linear programming problem
  res = linprog(c, A_eq=A, b_eq=b)
  print("Linear program:", res.message)

  # Reconfiguring our outputs to suit pooled data
  B_LP_est = res.x
  B_est = np.zeros((p,L))
  for l in range(L):
    B_col = B_LP_est[p*l:p*(l+1)]
    B_est[:,l] = B_col
  
  return B_est

Why is this happening?

答案1

得分: 1

CVXPY内部执行了与矩阵分解有关的其他操作,因此如果使用“eigvalsh”进行检查后发现您的情况处于边缘状态,CVXPY中使用的内部方法可能会倾向于另一种方式。

更重要的是,如果您已经将问题制定为锥形问题,那么没有必要将其再制定为二次规划问题。这是标准的SOCP重新制定。具体来说,

(1/2)*cp.quad_form(x, P)

在您的情况下等同于

(1/2) * 常数 * cp.sum_squares(X_LP @ x)

因为

x.T@P@x = 常数 * norm(X_LP@x)^2

这样可以避免先构建“P=X_LP.T@X_LP”,然后CVXPY必须再次进行因子分解的迂回方式。这还可以通过构造方式确保问题是凸问题,因此不会出现凸性检查方面的数值问题。更多信息请参阅https://docs.mosek.com/modeling-cookbook/cqo.html#conic-quadratic-modeling

英文:

Internally CVXPY does something else to factorize the matrix, so if a check with eigvalsh reveals you are just borderline good, then the internal method used in CVXPY can tip the balance the other way.

More importantly, there is no need to formulate your problem as a QP if you already have it on conic form to start with. This is a standard SOCP reformulation. Namely,

(1/2)*cp.quad_form(x, P)

is in your case equivalent to

(1/2) * constant * cp.sum_squares(X_LP @ x)

because

x.T@P@x = constant * norm(X_LP@x)^2

This way you avoid the roundabout way where you first construct P=X_LP.T@X_LP and then CVXPY has to work to factorize it back again. You also get a problem convex by construction, so no numerical issues will come in the way of the convexity check. More about this in https://docs.mosek.com/modeling-cookbook/cqo.html#conic-quadratic-modeling

huangapple
  • 本文由 发表于 2023年6月5日 13:43:32
  • 转载请务必保留本文链接:https://go.coder-hub.com/76403747.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定