英文:
reverse or invertible ndimage.map_coordinates mapping based on planar homography
问题
假设我有一幅图像,我想将其作为系统正向模型的一部分进行扭曲。在逆向模型中,我需要能够撤销这个扭曲。考虑以下内容:
```python
import numpy as np
from scipy.ndimage import map_coordinates
from matplotlib import pyplot as plt
# 定义函数make_rotation_matrix和其他变量(未翻译部分)
# 在图像中绘制一个正方形
sfe = np.zeros((128,128), dtype=float)
c=64
w=32
sfe[c-w:c+w,c-w:c+w] = 1
# 计算坐标,将其平移到原点,旋转,再平移回去
# 定义变换矩阵和操作(未翻译部分)
# 执行变换操作
# 定义points和M(未翻译部分)
# 使用map_coordinates函数进行图像映射
mapped = map_coordinates(sfe, (yout,xout))
unmapped = map_coordinates(mapped, (yout2,xout2))
neighbors = np.hstack((sfe, mapped, unmapped))
plt.imshow(neighbors)
如果我执行一个时钟旋转而不是一个平面外旋转,我得到了我期望的行为。
我了解,按照构造,我假设图像是一个鸟瞰图或平面单应性,这是可以的。我缺少什么?一些与图像扭曲相关的谷歌搜索结果会得到晦涩难懂的MATLAB答案,但我不明白“空间参考”是做什么的。
编辑:以下是一个例子,其逆单应性并不真正撤销使用map_coordinates的变换:
H = np.array([[ 0.063, -0.011, 0.761],
[ 0.011, 0.063, -0.639],
[-0. , -0. , 0.063]])
英文:
Suppose I have an image that I want to warp as part of a forward model of a system. In the reverse model, I need to be able to undo the warp. Consider the following:
import numpy as np
from scipy.ndimage import map_coordinates
from matplotlib import pyplot as plt
def make_rotation_matrix(abg, radians=False):
ABG = np.zeros(3)
ABG[:len(abg)] = abg
abg = ABG
if not radians:
abg = np.radians(abg)
alpha, beta, gamma = abg
cos1 = np.cos(alpha)
cos2 = np.cos(beta)
cos3 = np.cos(gamma)
sin1 = np.sin(alpha)
sin2 = np.sin(beta)
sin3 = np.sin(gamma)
Rx = np.asarray([
[1, 0, 0 ], # NOQA
[0, cos1, -sin1],
[0, sin1, cos1]
])
Ry = np.asarray([
[cos2, 0, sin2],
[ 0, 1, 0], # NOQA
[-sin2, 0, cos2],
])
Rz = np.asarray([
[cos3, -sin3, 0],
[sin3, cos3, 0],
[0, 0, 1],
])
m = Rz@Ry@Rx
return m
# draw a square in an image
sfe = np.zeros((128,128), dtype=float)
c=64
w=32
sfe[c-w:c+w,c-w:c+w] = 1
# compute the coordinates, translate to the origin, rotate, translate back
xin = np.arange(128)
yin = np.arange(128)
xin, yin = np.meshgrid(xin,yin)
rot = make_rotation_matrix((0,45,0))
ox, oy = 127/2, 127/2
tin = np.eye(4)
tin[0,-1] = -ox
tin[1,-1] = -oy
tout = np.eye(4)
tout[0,-1] = ox
tout[1,-1] = oy
rot2 = np.zeros((4,4), dtype=float)
rot2[:3,:3] = rot
rot2[-1,-1] = 1
M = tout@(rot2@tin)
Mi = np.linalg.inv(M)
points = np.zeros((xin.size, 4), dtype=float)
points[:,0] = xin.ravel()
points[:,1] = yin.ravel()
points[:,2] = 0 # z=0
points[:,3] = 1 # lambda coordinate for homography
out = np.dot(Mi, points.T)
xout = out[0].reshape(xin.shape)
yout = out[1].reshape(yin.shape)
zout = out[2].reshape(xin.shape)
hout = out[3].reshape(xin.shape)
# do I need to do something differently here?
points2 = points.copy()
out2 = np.dot(M, points2.T)
xout2 = out2[0].reshape(xin.shape)
yout2 = out2[1].reshape(yin.shape)
zout2 = out2[2].reshape(xin.shape)
hout2 = out2[3].reshape(xin.shape)
mapped = map_coordinates(sfe, (yout,xout))
unmapped = map_coordinates(mapped, (yout2,xout2))
neighbors = np.hstack((sfe, mapped, unmapped))
plt.imshow(neighbors)
If I perform a clocking rotation instead of an out of plane rotation, I get the behavior I expect:
I understand that by construction I am assuming the image is a birds-eye view or a planar homography, which is OK. What am I missing? Some google related to image warping finds cryptic matlab answers, but I do not understand what the "spatial referencing" is doing.
Edit: An example homography whose inverse does not actually undo the transformation with map_coordinates:
H = np.array([[ 0.063, -0.011, 0.761],
[ 0.011, 0.063, -0.639],
[-0. , -0. , 0.063]])
Simply plotting a square with plot.scatter, it does exactly invert.
答案1
得分: 2
I've refactored your code to see what's going on.
For one, map_coordinates()
does a "pull," i.e., it pulls source pixels into a result grid using indices you supply. Those indices need to be generated using a regular grid (for the result) and the inverse of the transformation (to the source frame). That's why your square appears to expand rather than contract.
Then... dropping Z does matter, and where you drop "Z" (inputs/outputs to the 4x4 transformation), especially when inverting a transformation.
Given an out-of-plane rotation, say around Y, you get something like this:
The inverse of that is:
If you drop Z in both (and apply that to 2D data, which you have), you now get a pair of transforms that both contract the image:
(In your case, that causes expansion each time because of map_coordinates()
and its "pull" operation)
Contraction is the appearance of rotation of in-plane points, but it's not rotation. Dropping Z does not maintain inv(M) @ M == I
.
The rotated points, having been rotated out of their plane, have non-zero Z, which is important when you want to rotate them further (e.g., rotate them back). Dropping Z means you no longer have that information. You have to assume their positions in space, and the transformation in 2D has to contract or stretch instead, depending on what plane you assume they come from and where they need to go.
You have to drop Z in M (4x4) first, then invert the resulting 3x3 matrix. Now you have the correct inverse, which expands the image, resulting in an identity transform.
Now here's some code:
def translate4(tx=0, ty=0, tz=0):
T = np.eye(4)
T[0:3, 3] = (tx, ty, tz)
return T
def rotate4(rx=0, ry=0, rz=0):
R = np.eye(4)
R[0:3, 0:3] = make_rotation_matrix((rx, ry, rz))
return R
def dropZ(T4):
"assumes that XYZW inputs have Z=0 and that the result's Z will be ignored"
tmp = T4[[0,1,3], :]
tmp = tmp[:, [0,1,3]]
return tmp
input data. don't mind the use of OpenCV. I wasn't in the mood to come up with random() calls to give the square some texture.
im_source = cv.imread(cv.samples.findFile("lena.jpg"), cv.IMREAD_GRAYSCALE)
height, width = im_source.shape[:2]
transformation: rotate around center
cx, cy = (width-1)/2, (height-1)/2
Tin = translate4(-cx, -cy)
Tout = translate4(+cx, +cy)
R = rotate4(ry=45, rz=30) # with a little in-plane rotation
M = Tout @ R @ Tin
M3 = dropZ(M)
Mi3 = inv(M3)
#print(M3)
#print(Mi3)
coordinates grid
xin = np.arange(width)
yin = np.arange(height)
xin, yin = np.meshgrid(xin, yin)
zin = np.zeros_like(xin)
win = np.ones_like(xin)
points4 = np.vstack((xin.flatten(), yin.flatten(), zin.flatten(), win.flatten()))
print(points4)
points3 = np.vstack((xin.flatten(), yin.flatten(), win.flatten()))
print(points3)
always: transform inverted because map_coords is backwards/pull
can't invert right at the map_coords() call because we've already warped the grid by then
points_warped = inv(M3) @ points3 # apply M3 to identity grid, for input image
print("warped:")
print(points_warped)
points_identity = M3 @ points_warped # apply inv(M3) to warped grid, giving identity grid
it's equal to M3 @ inv(M3) @ points3
which is I (identity) @ points3
print("unwarped: (identity grid)")
print(points_identity)
points_unwarping = M3 @ points3 # apply inv(M3) to identity grid, suitable for unwarping warped image
map_coordinates() wants indices, so Y,X or I,J
coords_warped = points_warped.reshape((3, height, width))[[1,0]]
coords_identity = points_identity.reshape((3, height, width))[[1,0]]
coords_unwarping = points_unwarping.reshape((3, height, width))[[1,0]]
im_warped = map_coordinates(im_source, coords_warped)
im_identity = map_coordinates(im_source, coords_identity)
im_unwarped = map_coordinates(im_warped, coords_unwarping)
neighbors = np.hstack((im_source, im_warped, im_identity, im_unwarped))
#neighbors = np.hstack((im1, im2, im3))
plt.figure(figsize=(20,20))
plt.imshow(neighbors, cmap='gray')
plt.show()
Fortunately, this is all linear (not non-linear), and inv(M) @ M == I == M @ inv(M)
.
英文:
I've refactored your code to see what's going on.
For one, map_coordinates()
does a "pull", i.e. it pulls source pixels into a result grid, using indices you supply. Those indices need to be generated using a regular grid (for the result) and the inverse of the transformation (to the source frame). That is why your square appears to expand rather than contract.
Then... dropping Z does matter, and where you drop "Z" (inputs/outputs to the 4x4 transformation), especially when inverting a transformation.
Given an out-of-plane rotation, say around Y, you get something like this:
[[ 0.70711 0. 0.70711 0. ]
[ 0. 1. 0. 0. ]
[-0.70711 0. 0.70711 0. ]
[ 0. 0. 0. 1. ]]
The inverse of that is:
[[ 0.70711 0. -0.70711 0. ]
[ 0. 1. 0. 0. ]
[ 0.70711 0. 0.70711 0. ]
[ 0. 0. 0. 1. ]]
If you drop Z in both (and apply that to 2D data, which you have), you now get a pair of transforms that both contract the image:
[[0.70711 0. 0. ]
[0. 1. 0. ]
[0. 0. 1. ]]
[[0.70711 0. 0. ]
[0. 1. 0. ]
[0. 0. 1. ]]
(In your case, that causes expansion each time because of map_coordinates()
and its "pull" operation)
Contraction is the appearance of rotation of in-plane points, but it's not rotation. Dropping Z does not maintain inv(M) @ M == I
.
The rotated points, having been rotated out of their plane, have non-zero Z, which is important when you want to rotate them further (e.g. rotate them back). Dropping Z means you no longer have that information. You have to assume their positions in space, and the transformation in 2D has to contract or stretch instead, depending on what plane you assume they come from and where they need to go.
You have to drop Z in M (4x4) first, then invert the resulting 3x3 matrix. Now you have the correct inverse, which expands the image, resulting in an identity transform.
[[0.70711 0. 0. ]
[0. 1. 0. ]
[0. 0. 1. ]]
[[1.41421 0. 0. ]
[0. 1. 0. ]
[0. 0. 1. ]]
Now here's some code:
def translate4(tx=0, ty=0, tz=0):
T = np.eye(4)
T[0:3, 3] = (tx, ty, tz)
return T
def rotate4(rx=0, ry=0, rz=0):
R = np.eye(4)
R[0:3, 0:3] = make_rotation_matrix((rx, ry, rz))
return R
def dropZ(T4):
"assumes that XYZW inputs have Z=0 and that the result's Z will be ignored"
tmp = T4[[0,1,3], :]
tmp = tmp[:, [0,1,3]]
return tmp
# input data. don't mind the use of OpenCV. I wasn't in the mood to come up with random() calls to give the square some texture.
im_source = cv.imread(cv.samples.findFile("lena.jpg"), cv.IMREAD_GRAYSCALE)
height, width = im_source.shape[:2]
# transformation: rotate around center
cx, cy = (width-1)/2, (height-1)/2
Tin = translate4(-cx, -cy)
Tout = translate4(+cx, +cy)
R = rotate4(ry=45, rz=30) # with a little in-plane rotation
M = Tout @ R @ Tin
M3 = dropZ(M)
Mi3 = inv(M3)
#print(M3)
#print(Mi3)
# coordinates grid
xin = np.arange(width)
yin = np.arange(height)
xin, yin = np.meshgrid(xin, yin)
zin = np.zeros_like(xin)
win = np.ones_like(xin)
points4 = np.vstack((xin.flatten(), yin.flatten(), zin.flatten(), win.flatten()))
print(points4)
points3 = np.vstack((xin.flatten(), yin.flatten(), win.flatten()))
print(points3)
# always: transform inverted because map_coords is backwards/pull
# can't invert right at the map_coords() call because we've already warped the grid by then
points_warped = inv(M3) @ points3 # apply M3 to identity grid, for input image
print("warped:")
print(points_warped)
points_identity = M3 @ points_warped # apply inv(M3) to warped grid, giving identity grid
# it's equal to M3 @ inv(M3) @ points3
# which is I (identity) @ points3
print("unwarped: (identity grid)")
print(points_identity)
points_unwarping = M3 @ points3 # apply inv(M3) to identity grid, suitable for unwarping *warped* image
# map_coordinates() wants indices, so Y,X or I,J
coords_warped = points_warped.reshape((3, height, width))[[1,0]]
coords_identity = points_identity.reshape((3, height, width))[[1,0]]
coords_unwarping = points_unwarping.reshape((3, height, width))[[1,0]]
im_warped = map_coordinates(im_source, coords_warped)
im_identity = map_coordinates(im_source, coords_identity)
im_unwarped = map_coordinates(im_warped, coords_unwarping)
neighbors = np.hstack((im_source, im_warped, im_identity, im_unwarped))
#neighbors = np.hstack((im1, im2, im3))
plt.figure(figsize=(20,20))
plt.imshow(neighbors, cmap='gray')
plt.show()
Fortunately, this is all linear (not non-linear), and inv(M) @ M == I == M @ inv(M)
.
通过集体智慧和协作来改善编程学习和解决问题的方式。致力于成为全球开发者共同参与的知识库,让每个人都能够通过互相帮助和分享经验来进步。
评论