2023-04-17 12:44:34

reverse or invertible ndimage.map_coordinates mapping based on planar homography

Question


Suppose I have an image that I want to warp as part of a forward model of a system. In the reverse model, I need to be able to undo the warp. Consider the following:

```python
import numpy as np
from scipy.ndimage import map_coordinates
from matplotlib import pyplot as plt

def make_rotation_matrix(abg, radians=False):
    ABG = np.zeros(3)
    ABG[:len(abg)] = abg
    abg = ABG
    if not radians:
        abg = np.radians(abg)
    alpha, beta, gamma = abg
    cos1 = np.cos(alpha)
    cos2 = np.cos(beta)
    cos3 = np.cos(gamma)
    sin1 = np.sin(alpha)
    sin2 = np.sin(beta)
    sin3 = np.sin(gamma)
    Rx = np.asarray([
        [1,    0,  0   ],  # NOQA
        [0, cos1, -sin1],
        [0, sin1,  cos1]
    ])
    Ry = np.asarray([
        [cos2,  0, sin2],
        [    0, 1,    0],  # NOQA
        [-sin2, 0, cos2],
    ])
    Rz = np.asarray([
        [cos3, -sin3, 0],
        [sin3,  cos3, 0],
        [0,        0, 1],
    ])
    m = Rz@Ry@Rx
    return m

# draw a square in an image
sfe = np.zeros((128,128), dtype=float)
c=64
w=32
sfe[c-w:c+w,c-w:c+w] = 1

# compute the coordinates, translate to the origin, rotate, translate back
xin = np.arange(128)
yin = np.arange(128)
xin, yin = np.meshgrid(xin,yin)
rot = make_rotation_matrix((0,45,0))
ox, oy = 127/2, 127/2
tin = np.eye(4)
tin[0,-1] = -ox
tin[1,-1] = -oy
tout = np.eye(4)
tout[0,-1] = ox
tout[1,-1] = oy
rot2 = np.zeros((4,4), dtype=float)
rot2[:3,:3] = rot
rot2[-1,-1] = 1
M = tout@(rot2@tin)
Mi = np.linalg.inv(M)
points = np.zeros((xin.size, 4), dtype=float)
points[:,0] = xin.ravel()
points[:,1] = yin.ravel()
points[:,2] = 0  # z=0
points[:,3] = 1 # lambda coordinate for homography
out = np.dot(Mi, points.T)
xout = out[0].reshape(xin.shape)
yout = out[1].reshape(yin.shape)
zout = out[2].reshape(xin.shape)
hout = out[3].reshape(xin.shape)

# do I need to do something differently here?
points2 = points.copy()
out2 = np.dot(M, points2.T)
xout2 = out2[0].reshape(xin.shape)
yout2 = out2[1].reshape(yin.shape)
zout2 = out2[2].reshape(xin.shape)
hout2 = out2[3].reshape(xin.shape)

mapped = map_coordinates(sfe, (yout,xout))
unmapped = map_coordinates(mapped, (yout2,xout2))
neighbors = np.hstack((sfe, mapped, unmapped))
plt.imshow(neighbors)
```

If I perform a clocking (in-plane) rotation instead of an out-of-plane rotation, I get the behavior I expect.

I understand that by construction I am assuming the image is a bird's-eye view or a planar homography, which is OK. What am I missing? Googling for image warping turns up cryptic MATLAB answers, but I do not understand what the "spatial referencing" is doing.

Edit: An example homography whose inverse does not actually undo the transformation with map_coordinates:

```python
H = np.array([[ 0.063, -0.011,  0.761],
              [ 0.011,  0.063, -0.639],
              [-0.   , -0.   ,  0.063]])
```

If I simply plot a square as a scatter plot, the inverse homography does exactly undo the transformation.

Answer 1

Score: 2


I've refactored your code to see what's going on.

For one, map_coordinates() does a "pull", i.e. it pulls source pixels into a result grid, using indices you supply. Those indices need to be generated using a regular grid (for the result) and the inverse of the transformation (to the source frame). That is why your square appears to expand rather than contract.
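The "pull" behavior can be seen on a toy image. This is only a minimal sketch, not part of the original code: the 8x8 array, the factor of 2, and the names `pulled` and `wrong` are assumptions chosen for illustration. To enlarge the square by 2, each output pixel has to look up the source at the inverse mapping (output index divided by 2).

```python
import numpy as np
from scipy.ndimage import map_coordinates

# 8x8 test image with a 2x2 bright square (4 bright pixels)
src = np.zeros((8, 8))
src[2:4, 2:4] = 1.0

# regular grid of *output* pixel indices, in (row, col) order
rows, cols = np.meshgrid(np.arange(8), np.arange(8), indexing="ij")

# Desired forward transform: scale by 2.
# map_coordinates pulls, so the lookup uses the inverse: source = output / 2.
pulled = map_coordinates(src, [rows / 2, cols / 2], order=0)

# Feeding the forward mapping (output * 2) instead shrinks the square.
wrong = map_coordinates(src, [rows * 2, cols * 2], order=0)

print(src.sum(), pulled.sum(), wrong.sum())
```

The square grows when the inverse mapping is supplied and shrinks when the forward mapping is supplied, which is the opposite of what a "push" mental model would predict.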

Second, dropping Z does matter, and so does where you drop it (on the inputs/outputs of the 4x4 transformation), especially when inverting a transformation.

Given an out-of-plane rotation, say around Y, you get something like this:

```
[[ 0.70711  0.       0.70711  0.     ]
 [ 0.       1.       0.       0.     ]
 [-0.70711  0.       0.70711  0.     ]
 [ 0.       0.       0.       1.     ]]
```

The inverse of that is:

```
[[ 0.70711  0.      -0.70711  0.     ]
 [ 0.       1.       0.       0.     ]
 [ 0.70711  0.       0.70711  0.     ]
 [ 0.       0.       0.       1.     ]]
```

If you drop Z in both (and apply that to 2D data, which you have), you now get a pair of transforms that both contract the image:

```
[[0.70711 0.      0.     ]
 [0.      1.      0.     ]
 [0.      0.      1.     ]]

[[0.70711 0.      0.     ]
 [0.      1.      0.     ]
 [0.      0.      1.     ]]
```

(In your case, that causes expansion each time, because map_coordinates() performs a "pull".)

The contraction is what an out-of-plane rotation of in-plane points looks like in 2D, but it is not itself a rotation. Dropping Z from both matrices does not maintain inv(M) @ M == I.

The rotated points, having been rotated out of their plane, have non-zero Z, which is important when you want to rotate them further (e.g. rotate them back). Dropping Z means you no longer have that information. You have to assume their positions in space, and the transformation in 2D has to contract or stretch instead, depending on what plane you assume they come from and where they need to go.

You have to drop Z in M (4x4) first, then invert the resulting 3x3 matrix. Now you have the correct inverse, which expands the image, so applying both in sequence results in an identity transform.

```
[[0.70711 0.      0.     ]
 [0.      1.      0.     ]
 [0.      0.      1.     ]]

[[1.41421 0.      0.     ]
 [0.      1.      0.     ]
 [0.      0.      1.     ]]
```
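The order of operations can be checked numerically. The following is a small sketch for the 45° rotation about Y shown above; `drop_z`, `wrong_inverse` and `right_inverse` are illustrative names introduced here, not taken from the question.

```python
import numpy as np

def drop_z(T4):
    # keep only the X, Y and W rows/columns of a 4x4 homogeneous transform
    return T4[np.ix_([0, 1, 3], [0, 1, 3])]

theta = np.radians(45)
c, s = np.cos(theta), np.sin(theta)

# rotation about Y, same layout as the matrices printed above
M = np.array([
    [ c, 0, s, 0],
    [ 0, 1, 0, 0],
    [-s, 0, c, 0],
    [ 0, 0, 0, 1],
])

M3 = drop_z(M)
wrong_inverse = drop_z(np.linalg.inv(M))  # invert in 4x4, then drop Z
right_inverse = np.linalg.inv(M3)         # drop Z, then invert in 3x3

print(np.allclose(wrong_inverse @ M3, np.eye(3)))  # False
print(np.allclose(right_inverse @ M3, np.eye(3)))  # True
```

With the wrong order the product is diag(0.5, 1, 1), i.e. two contractions in a row rather than an identity.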

Now here's some code:

```python
import cv2 as cv
import numpy as np
from numpy.linalg import inv
from scipy.ndimage import map_coordinates
from matplotlib import pyplot as plt

# make_rotation_matrix() is the function from the question

def translate4(tx=0, ty=0, tz=0):
    T = np.eye(4)
    T[0:3, 3] = (tx, ty, tz)
    return T

def rotate4(rx=0, ry=0, rz=0):
    R = np.eye(4)
    R[0:3, 0:3] = make_rotation_matrix((rx, ry, rz))
    return R

def dropZ(T4):
    "assumes that XYZW inputs have Z=0 and that the result's Z will be ignored"
    tmp = T4[[0,1,3], :]
    tmp = tmp[:, [0,1,3]]
    return tmp

# input data. don't mind the use of OpenCV. I wasn't in the mood to come up
# with random() calls to give the square some texture.
im_source = cv.imread(cv.samples.findFile("lena.jpg"), cv.IMREAD_GRAYSCALE)
height, width = im_source.shape[:2]

# transformation: rotate around center
cx, cy = (width-1)/2, (height-1)/2
Tin = translate4(-cx, -cy)
Tout = translate4(+cx, +cy)
R = rotate4(ry=45, rz=30) # with a little in-plane rotation
M = Tout @ R @ Tin

M3 = dropZ(M)     # drop Z first...
Mi3 = inv(M3)     # ...then invert the 3x3
#print(M3)
#print(Mi3)

# coordinates grid
xin = np.arange(width)
yin = np.arange(height)
xin, yin = np.meshgrid(xin, yin)
zin = np.zeros_like(xin)
win = np.ones_like(xin)
points4 = np.vstack((xin.flatten(), yin.flatten(), zin.flatten(), win.flatten()))
print(points4)
points3 = np.vstack((xin.flatten(), yin.flatten(), win.flatten()))
print(points3)

# always: transform inverted because map_coords is backwards/pull
# can't invert right at the map_coords() call because we've already warped the grid by then
points_warped = Mi3 @ points3 # sampling grid inv(M3) @ grid: pulling with it warps the input image by M3
print("warped:")
print(points_warped)
points_identity = M3 @ points_warped # M3 maps the warped grid back, giving the identity grid
# it's equal to M3 @ inv(M3) @ points3
# which is I (identity) @ points3
print("unwarped: (identity grid)")
print(points_identity)
points_unwarping = M3 @ points3 # sampling grid M3 @ grid: pulling with it applies inv(M3), which unwarps the *warped* image

# map_coordinates() wants indices, so Y,X or I,J
coords_warped = points_warped.reshape((3, height, width))[[1,0]]
coords_identity = points_identity.reshape((3, height, width))[[1,0]]
coords_unwarping = points_unwarping.reshape((3, height, width))[[1,0]]

im_warped = map_coordinates(im_source, coords_warped)
im_identity = map_coordinates(im_source, coords_identity)
im_unwarped = map_coordinates(im_warped, coords_unwarping)

neighbors = np.hstack((im_source, im_warped, im_identity, im_unwarped))
#neighbors = np.hstack((im1, im2, im3))
plt.figure(figsize=(20,20))
plt.imshow(neighbors, cmap='gray')
plt.show()
```

Fortunately, this is all linear, and inv(M) @ M == I == M @ inv(M).
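One caveat, offered as an assumption rather than something tested against the question's full pipeline: the transform in this answer is affine, so the homogeneous coordinate W stays 1. For a general 3x3 homography such as the H in the question's edit, whose bottom-right entry is 0.063 rather than 1, W would have to be divided out before the X and Y values are used as indices. A minimal sketch, where `apply_homography` and the test square are hypothetical names and values:

```python
import numpy as np

def apply_homography(H, xy):
    """Map (N, 2) points through a 3x3 homography, dividing out W."""
    pts = np.column_stack([xy, np.ones(len(xy))])  # append W = 1
    out = pts @ H.T
    return out[:, :2] / out[:, 2:3]                # back to Cartesian X, Y

# the homography from the question's edit
H = np.array([[0.063, -0.011,  0.761],
              [0.011,  0.063, -0.639],
              [0.0,    0.0,    0.063]])

# corners of a square, as (x, y) pairs
square = np.array([[32.0, 32.0], [96.0, 32.0], [96.0, 96.0], [32.0, 96.0]])

forward = apply_homography(H, square)
back = apply_homography(np.linalg.inv(H), forward)
print(np.allclose(back, square))  # True
```

If the division is skipped, the X and Y rows of H @ p and inv(H) @ p no longer compose to the identity unless the bottom-right entry of H is 1.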
