“Maximum Variance Unfolding with CVXPY” 可以翻译为 “使用CVXPY进行最大方差展开”。

huangapple go评论106阅读模式
英文:

Maximum Variance Unfolding with CVXPY

问题

我试图复现这篇论文(Weinberger和Saul,2004,doi/10.5555/1597348.1597471)中的结果,其中作者使用最大方差展开(MVU)来学习几个分布的低维表示。MVU的一般问题在丢弃秩约束后如下所示:

这里,'G'是数据的内积矩阵。每个'k'对应一个数据点,每个'l'对应其邻居之一。然后,'e'向量是“指示向量”,除了下标中的一个条目以外,所有条目都为零。第二个约束将数据置于零中心;我在我的公式中忽略了这个约束,以使其尽可能简单。最后,'G'必须是半正定的。

我试图在一组包含相同逐渐旋转的茶壶的400张图像(76x101像素)上复制上述论文中的实验。我首先导入数据和相关的库:

然后,我构建每个数据点的k个最近邻的邻接矩阵表示:

最后,我尝试使用CVXPY执行MVU,以发现这些数据的低维嵌入:

然而,CVXPY告诉我问题是“不可行的”。如果我移除约束,求解器会说问题是“无界的”,这是有道理的,因为我正在最大化决策变量的凸函数。那么,我的问题是,在MVU的制定中,我犯了什么错误?谢谢您!我在上面的代码中省略了一个验证,即邻接图不是断开的。为此,我只需对邻接图执行DFS,并断言访问的节点数等于数据中的行数。对于'n=4'个邻居,这个断言成立。

英文:

I am trying to reproduce the results from this paper (Weinberger and Saul, 2004, doi/10.5555/1597348.1597471), in which the authors use Maximum Variance Unfolding (MVU) to learn a low dimensional representation of a few distributions. The general problem of MVU, after dropping the rank constraint, is as follows:
“Maximum Variance Unfolding with CVXPY” 可以翻译为 “使用CVXPY进行最大方差展开”。
Here, G is the inner product matrix of the data. Each k corresponds to a data point, and each l corresponds to one of its neighbors. Then, the e vectors are "indicator vectors" that have zeros in all entries except for the one in the index in the subscript. The second constraint zero-centers the data; I ignored this constraint in my formulation so as to keep it as simple as possible. Finally, G must be positive semidefinite.

I am trying to replicate the experiments in the paper above on a dataset which consists of 400 images (76x101 pixels) of the same progressively rotated teapot. I start by importing the data and relevant libraries:

  1. import scipy.io
  2. import cvxpy as cvx
  3. import numpy as np
  4. from sklearn.neighbors import NearestNeighbors
  5. data = scipy.io.loadmat('teapots.mat')
  6. data = data["Input"][0][0][0]
  7. # make it (400 x 23028)
  8. data = data.T.astype(np.float64)

Then, I build an adjacency matrix representation of the k-nearest neighbors of each data point:

  1. n_points = data.shape[0]
  2. # create sparse matrix (400 x 400) with the connections
  3. nn = NearestNeighbors(n_neighbors=4).fit(data)
  4. nn = nn.kneighbors_graph(data).todense()
  5. nn = np.array(nn)

Finally, I try to perform MVU to discover low dimensional embeddings for these data using CVXPY:

  1. # inner product matrix of the original data
  2. X = cvx.Constant(data.dot(data.T))
  3. # inner product matrix of the projected points; PSD constrains G to be both PSD and symmetric
  4. G = cvx.Variable((n_points, n_points), PSD=True)
  5. G.value = np.zeros((n_points, n_points))
  6. # spread out points in target manifold
  7. objective = cvx.Maximize(cvx.trace(G))
  8. constraints = []
  9. # add distance-preserving constraints
  10. for i in range(n_points):
  11. for j in range(n_points):
  12. if nn[i, j] == 1:
  13. constraints.append(
  14. (X[i, i] - 2 * X[i, j] + X[j, j]) -
  15. (G[i, i] - 2 * G[i, j] + G[j, j]) == 0
  16. )
  17. problem = cvx.Problem(objective, constraints)
  18. problem.solve(verbose=True, max_iters=10000000)
  19. print(problem.status)

However, CVXPY tells me the problem is infeasible. If I remove the constraints, the solver says the problem is unbounded, which makes sense because I'm maximizing a convex function of the decision variable. My question, then, is what mistake(s) I am making in my formulation of MVU. Thank you in advance!


I omitted, in the code above, a verification that the adjacency graph is not disconnected. To do that, I simply perform a DFS on the adjacency graph and assert that the number of visited nodes is equal to the number of rows in the data. For n=4 neighbors, this holds.

答案1

得分: 1

以下是您提供的英文内容的中文翻译:

"这实际上是一个非常困难的SDP问题,因为你有许多接近线性依赖关系。即使是最好的实现也会遇到一些数值问题,需要降低公差或使用其他技巧。它真的有点棘手。

关于你的代码:

  • 重新调整你的数据,使得X不在数量级为1e+8。这会使求解器立即在数值上出现问题。例如,除以1e+6。实际上,这个建议适用于任何问题。
  • 添加归一化约束sum(G)=0。如果没有它,它可能会无界,求解器稍后也会出现问题。你会看到在SCS日志输出中出现巨大的值。

通过这些改变,我能够让SCS开始收敛。过了一会儿,它看起来像是:

  1. 8500| 1.01e+00 1.11e-04 6.31e+00 -1.64e+06 5.49e-04 6.15e+02
  2. 8750| 1.01e+00 1.11e-04 6.39e+00 -1.64e+06 5.49e-04 6.33e+02

不过我不会等它完成。如果你希望它终止,可能需要稍微放宽公差。

我建议你使用cvx.MOSEK(声明一下,我在那里工作)作为求解器,但似乎CVXPY对MOSEK的编译过程会遇到一些问题,开始占用大量内存并花费很长时间。所以这里是一个在MOSEK Fusion API中的直接实现(包括重新缩放和归一化约束):

  1. from mosek.fusion import *
  2. import scipy.io
  3. import numpy as np
  4. from sklearn.neighbors import NearestNeighbors
  5. import sys
  6. data = scipy.io.loadmat('teapots.mat')
  7. data = data["Input"][0][0][0]
  8. # 使其变为 (400 x 23028)
  9. data = data.T.astype(np.float64)
  10. n_points = data.shape[0]
  11. # 用连接创建稀疏矩阵 (400 x 400)
  12. nn = NearestNeighbors(n_neighbors=4).fit(data)
  13. nn = nn.kneighbors_graph(data).todense()
  14. nn = np.array(nn)
  15. # 原始数据的内积矩阵
  16. X = data.dot(data.T)/10**6
  17. print(X)
  18. # 投影点的内积矩阵;PSD约束使G既是PSD又是对称的
  19. M = Model()
  20. G = M.variable(Domain.inPSDCone(n_points))
  21. M.constraint(Expr.sum(G), Domain.equalsTo(0))
  22. # 在目标流形中扩展点
  23. M.objective(ObjectiveSense.Maximize, Expr.sum(G.diag()))
  24. # 添加保持距离的约束
  25. for i in range(n_points):
  26. for j in range(n_points):
  27. if nn[i, j] == 1:
  28. M.constraint(Expr.add([G.index([i,i]),G.index([j,j]),Expr.mul(-2,G.index([i,j]))]), Domain.equalsTo(X[i, i] - 2 * X[i, j] + X[j, j]))
  29. # 打印日志
  30. M.setLogHandler(sys.stdout)
  31. # 降低精度
  32. M.setSolverParam("intpntCoTolRelGap", 1e-5)
  33. M.setSolverParam("intpntCoTolPfeas",1e-5)
  34. M.setSolverParam("intpntCoTolDfeas",1e-5)
  35. # 求解
  36. M.solve()
  37. Gresult = G.level().reshape((n_points,n_points))
  38. print(Gresult)

解决方案看起来不错:

  1. Interior-point solution summary
  2. Problem status : PRIMAL_AND_DUAL_FEASIBLE
  3. Solution status : OPTIMAL
  4. Primal. obj: 1.5209452618e+06 nrm: 4e+03 Viol. con: 2e-02 var: 0e+00 barvar: 0e+00
  5. Dual. obj: 1.5209449262e+06 nrm: 8e+03 Viol. con: 0e+00 var: 0e+00 barvar: 1e-04
英文:

This is a very hard type of SDP problem actually because you have a lot of near linear dependencies. Even the best implementations will run into some sort of numerical issues and will require reduced tolerances or other tricks. It really is a bit nasty.

Regarding your code:

  • Rescale your data so that X is not of order 1e+8. It makes the solver immediately go numerically bananas. For example divide by 1e+6. That advice would apply to any problem, actually.
  • Add the normalizing constraint sum(G)=0. Without it it can be unbounded and the solver will also go bananas, just a bit later. You can see huge values appearing in the SCS log output.

With these changes I was able to get SCS to start converging. After a while it looked like

  1. 8500| 1.01e+00 1.11e-04 6.31e+00 -1.64e+06 5.49e-04 6.15e+02
  2. 8750| 1.01e+00 1.11e-04 6.39e+00 -1.64e+06 5.49e-04 6.33e+02

I will not wait for it to finish though. You may need to fiddle with tolerances to relax them if you want it to terminate.

I would advise you to use cvx.MOSEK (where, disclamer, I work) as a solver, but it seems the CVXPY compilation procedure to MOSEK runs into some issues and starts eating a lot of RAM taking forever. So instead here is a direct implementation in MOSEK Fusion API (with the rescaling and normalizing constraint):

  1. from mosek.fusion import *
  2. import scipy.io
  3. import numpy as np
  4. from sklearn.neighbors import NearestNeighbors
  5. import sys
  6. data = scipy.io.loadmat('teapots.mat')
  7. data = data["Input"][0][0][0]
  8. # make it (400 x 23028)
  9. data = data.T.astype(np.float64)
  10. n_points = data.shape[0]
  11. # create sparse matrix (400 x 400) with the connections
  12. nn = NearestNeighbors(n_neighbors=4).fit(data)
  13. nn = nn.kneighbors_graph(data).todense()
  14. nn = np.array(nn)
  15. # inner product matrix of the original data
  16. X = data.dot(data.T)/10**6
  17. print(X)
  18. # inner product matrix of the projected points; PSD constrains G to be both PSD and symmetric
  19. M = Model()
  20. G = M.variable(Domain.inPSDCone(n_points))
  21. M.constraint(Expr.sum(G), Domain.equalsTo(0))
  22. # spread out points in target manifold
  23. M.objective(ObjectiveSense.Maximize, Expr.sum(G.diag()))
  24. # add distance-preserving constraints
  25. for i in range(n_points):
  26. for j in range(n_points):
  27. if nn[i, j] == 1:
  28. M.constraint(Expr.add([G.index([i,i]),G.index([j,j]),Expr.mul(-2,G.index([i,j]))]), Domain.equalsTo(X[i, i] - 2 * X[i, j] + X[j, j]))
  29. # Print log
  30. M.setLogHandler(sys.stdout)
  31. # Reduce accuracy
  32. M.setSolverParam("intpntCoTolRelGap", 1e-5)
  33. M.setSolverParam("intpntCoTolPfeas",1e-5)
  34. M.setSolverParam("intpntCoTolDfeas",1e-5)
  35. #Solve
  36. M.solve()
  37. Gresult = G.level().reshape((n_points,n_points))
  38. print(Gresult)

The solution looks OK:

  1. Interior-point solution summary
  2. Problem status : PRIMAL_AND_DUAL_FEASIBLE
  3. Solution status : OPTIMAL
  4. Primal. obj: 1.5209452618e+06 nrm: 4e+03 Viol. con: 2e-02 var: 0e+00 barvar: 0e+00
  5. Dual. obj: 1.5209449262e+06 nrm: 8e+03 Viol. con: 0e+00 var: 0e+00 barvar: 1e-04

huangapple
  • 本文由 发表于 2023年5月10日 21:09:22
  • 转载请务必保留本文链接:https://go.coder-hub.com/76218830.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定