Reverse or invertible ndimage.map_coordinates mapping based on planar homography

Question


Suppose I have an image that I want to warp as part of a forward model of a system. In the reverse model, I need to be able to undo the warp. Consider the following:

```python
import numpy as np
from scipy.ndimage import map_coordinates
from matplotlib import pyplot as plt


def make_rotation_matrix(abg, radians=False):
    ABG = np.zeros(3)
    ABG[:len(abg)] = abg
    abg = ABG
    if not radians:
        abg = np.radians(abg)

    alpha, beta, gamma = abg
    cos1 = np.cos(alpha)
    cos2 = np.cos(beta)
    cos3 = np.cos(gamma)
    sin1 = np.sin(alpha)
    sin2 = np.sin(beta)
    sin3 = np.sin(gamma)
    Rx = np.asarray([
        [1,    0,  0   ],  # NOQA
        [0, cos1, -sin1],
        [0, sin1,  cos1]
    ])
    Ry = np.asarray([
        [cos2,  0, sin2],
        [    0, 1,    0],  # NOQA
        [-sin2, 0, cos2],
    ])
    Rz = np.asarray([
        [cos3, -sin3, 0],
        [sin3,  cos3, 0],
        [0,        0, 1],
    ])
    m = Rz@Ry@Rx
    return m


# draw a square in an image
sfe = np.zeros((128,128), dtype=float)
c = 64
w = 32
sfe[c-w:c+w, c-w:c+w] = 1

# compute the coordinates, translate to the origin, rotate, translate back
xin = np.arange(128)
yin = np.arange(128)
xin, yin = np.meshgrid(xin, yin)

rot = make_rotation_matrix((0, 45, 0))

ox, oy = 127/2, 127/2
tin = np.eye(4)
tin[0, -1] = -ox
tin[1, -1] = -oy

tout = np.eye(4)
tout[0, -1] = ox
tout[1, -1] = oy

rot2 = np.zeros((4,4), dtype=float)
rot2[:3, :3] = rot
rot2[-1, -1] = 1

M = tout @ (rot2 @ tin)
Mi = np.linalg.inv(M)

points = np.zeros((xin.size, 4), dtype=float)
points[:, 0] = xin.ravel()
points[:, 1] = yin.ravel()
points[:, 2] = 0  # z=0
points[:, 3] = 1  # homogeneous (lambda) coordinate for the homography

out = np.dot(Mi, points.T)

xout = out[0].reshape(xin.shape)
yout = out[1].reshape(yin.shape)
zout = out[2].reshape(xin.shape)
hout = out[3].reshape(xin.shape)

# do I need to do something differently here?
points2 = points.copy()
out2 = np.dot(M, points2.T)

xout2 = out2[0].reshape(xin.shape)
yout2 = out2[1].reshape(yin.shape)
zout2 = out2[2].reshape(xin.shape)
hout2 = out2[3].reshape(xin.shape)

mapped = map_coordinates(sfe, (yout, xout))
unmapped = map_coordinates(mapped, (yout2, xout2))
neighbors = np.hstack((sfe, mapped, unmapped))
plt.imshow(neighbors)
```

[figure: the original square, the warped result, and the attempted un-warp, side by side]

If I perform a clocking (in-plane) rotation instead of an out-of-plane rotation, I get the behavior I expect:

[figure: an in-plane (clocking) rotation round-trips back to the original square]

I understand that, by construction, I am assuming the image is a bird's-eye view, i.e. a planar homography, and that is OK. What am I missing? Some googling related to image warping turns up cryptic MATLAB answers, but I do not understand what their "spatial referencing" is doing.

Edit: An example homography whose inverse does not actually undo the transformation with map_coordinates:

```python
H = np.array([[ 0.063, -0.011,  0.761],
              [ 0.011,  0.063, -0.639],
              [-0.   , -0.   ,  0.063]])
```

Simply plotting a square's points with plt.scatter, the inverse does exactly undo the transformation.
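For concreteness, here is a minimal sketch of that point-level check (the square's corner coordinates are made up for illustration):

```python
import numpy as np

H = np.array([[ 0.063, -0.011,  0.761],
              [ 0.011,  0.063, -0.639],
              [-0.   , -0.   ,  0.063]])
Hi = np.linalg.inv(H)

# homogeneous coordinates of a square's corners (illustrative values)
pts = np.array([[32., 32., 96., 96.],
                [32., 96., 96., 32.],
                [ 1.,  1.,  1.,  1.]])

warped = H @ pts
warped /= warped[2]        # normalize the homogeneous coordinate

restored = Hi @ warped
restored /= restored[2]

print(np.allclose(restored, pts))  # True: on points, inv(H) undoes H exactly
```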

Answer 1

Score: 2


I've refactored your code to see what's going on.

For one, map_coordinates() does a "pull", i.e. it pulls source pixels into a result grid, using indices you supply. Those indices need to be generated using a regular grid (for the result) and the inverse of the transformation (to the source frame). That is why your square appears to expand rather than contract.
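To illustrate the pull semantics (a minimal sketch, not part of the original answer): to move image content by +5 pixels in x, you must sample at x - 5, i.e. you hand map_coordinates() the inverse of the motion you want to see.

```python
import numpy as np
from scipy.ndimage import map_coordinates

img = np.zeros((8, 8))
img[2, 2] = 1.0

yy, xx = np.mgrid[0:8, 0:8]                            # regular grid for the *result*
shifted = map_coordinates(img, (yy, xx - 5), order=0)  # pull from x-5

print(np.argwhere(shifted))  # [[2 7]]: the pixel moved from x=2 to x=7
```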

Then... dropping Z matters, and so does *where* you drop Z (on the inputs/outputs of the 4x4 transformation), especially when inverting a transformation.

Given an out-of-plane rotation, say around Y, you get something like this:

```
[[ 0.70711  0.       0.70711  0.     ]
 [ 0.       1.       0.       0.     ]
 [-0.70711  0.       0.70711  0.     ]
 [ 0.       0.       0.       1.     ]]
```

The inverse of that is:

```
[[ 0.70711  0.      -0.70711  0.     ]
 [ 0.       1.       0.       0.     ]
 [ 0.70711  0.       0.70711  0.     ]
 [ 0.       0.       0.       1.     ]]
```

If you drop Z in both (and apply that to 2D data, which you have), you now get a pair of transforms that both contract the image:

```
[[0.70711 0.      0.     ]
 [0.      1.      0.     ]
 [0.      0.      1.     ]]

[[0.70711 0.      0.     ]
 [0.      1.      0.     ]
 [0.      0.      1.     ]]
```

(In your case, that causes expansion each time because of map_coordinates() and its "pull" operation)

Contraction gives the appearance of rotating the in-plane points, but it is not a rotation. Dropping Z does not maintain inv(M) @ M == I.

The rotated points, having been rotated out of their plane, have non-zero Z, which is important when you want to rotate them further (e.g. rotate them back). Dropping Z means you no longer have that information. You have to assume their positions in space, and the transformation in 2D has to contract or stretch instead, depending on what plane you assume they come from and where they need to go.

You have to drop Z in M (the 4x4) first, then invert the resulting 3x3 matrix. Now you have the correct inverse, which expands the image again, so the round trip is an identity transform:

```
[[0.70711 0.      0.     ]
 [0.      1.      0.     ]
 [0.      0.      1.     ]]

[[1.41421 0.      0.     ]
 [0.      1.      0.     ]
 [0.      0.      1.     ]]
```

[figure: source, warped, identity-mapped, and un-warped images side by side]
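Before the full code, here is a small numeric check of that ordering claim (a sketch; drop_z mirrors the dropZ() helper used below):

```python
import numpy as np
from numpy.linalg import inv

def drop_z(T4):
    # keep the x, y, w rows/columns, discard the z row/column
    return T4[np.ix_([0, 1, 3], [0, 1, 3])]

# 45-degree rotation about Y, embedded in a 4x4 homogeneous matrix
c = s = np.sqrt(0.5)
M = np.array([[ c, 0, s, 0],
              [ 0, 1, 0, 0],
              [-s, 0, c, 0],
              [ 0, 0, 0, 1]])

# dropping Z on M and inv(M) separately yields two contractions:
print(drop_z(M) @ drop_z(inv(M)))   # NOT the identity
# dropping Z first and then inverting yields a true inverse pair:
print(drop_z(M) @ inv(drop_z(M)))   # identity
```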

Now here's some code:

```python
import numpy as np
from numpy.linalg import inv
import cv2 as cv
from scipy.ndimage import map_coordinates
from matplotlib import pyplot as plt

# make_rotation_matrix() is the function defined in the question

def translate4(tx=0, ty=0, tz=0):
    T = np.eye(4)
    T[0:3, 3] = (tx, ty, tz)
    return T

def rotate4(rx=0, ry=0, rz=0):
    R = np.eye(4)
    R[0:3, 0:3] = make_rotation_matrix((rx, ry, rz))
    return R

def dropZ(T4):
    "assumes that XYZW inputs have Z=0 and that the result's Z will be ignored"
    tmp = T4[[0,1,3], :]
    tmp = tmp[:, [0,1,3]]
    return tmp

# input data. don't mind the use of OpenCV; I wasn't in the mood to come up
# with random() calls to give the square some texture.
im_source = cv.imread(cv.samples.findFile("lena.jpg"), cv.IMREAD_GRAYSCALE)
height, width = im_source.shape[:2]

# transformation: rotate around center
cx, cy = (width-1)/2, (height-1)/2
Tin = translate4(-cx, -cy)
Tout = translate4(+cx, +cy)
R = rotate4(ry=45, rz=30) # with a little in-plane rotation
M = Tout @ R @ Tin
M3 = dropZ(M)
Mi3 = inv(M3)
#print(M3)
#print(Mi3)

# coordinates grid
xin = np.arange(width)
yin = np.arange(height)
xin, yin = np.meshgrid(xin, yin)
zin = np.zeros_like(xin)
win = np.ones_like(xin)

points4 = np.vstack((xin.flatten(), yin.flatten(), zin.flatten(), win.flatten()))
print(points4)

points3 = np.vstack((xin.flatten(), yin.flatten(), win.flatten()))
print(points3)

# always: transform inverted because map_coords is backwards/pull
# can't invert right at the map_coords() call because we've already warped the grid by then

points_warped = inv(M3) @ points3 # apply M3 to identity grid, for input image
print("warped:")
print(points_warped)

points_identity = M3 @ points_warped # apply inv(M3) to warped grid, giving identity grid
# it's equal to M3 @ inv(M3) @ points3
# which is I (identity) @ points3
print("unwarped: (identity grid)")
print(points_identity)

points_unwarping = M3 @ points3 # apply inv(M3) to identity grid, suitable for unwarping *warped* image

# map_coordinates() wants indices, so Y,X or I,J
coords_warped = points_warped.reshape((3, height, width))[[1,0]]
coords_identity = points_identity.reshape((3, height, width))[[1,0]]
coords_unwarping = points_unwarping.reshape((3, height, width))[[1,0]]

im_warped = map_coordinates(im_source, coords_warped)
im_identity = map_coordinates(im_source, coords_identity)
im_unwarped = map_coordinates(im_warped, coords_unwarping)

neighbors = np.hstack((im_source, im_warped, im_identity, im_unwarped))
#neighbors = np.hstack((im1, im2, im3))
plt.figure(figsize=(20,20))
plt.imshow(neighbors, cmap='gray')
plt.show()
```

Fortunately, this is all linear (not non-linear), and inv(M) @ M == I == M @ inv(M).
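To make that concrete, a small sanity check that could be appended to the code above (continuing from the same variables):

```python
# continuing from the snippet above: verify the round trip numerically
assert np.allclose(Mi3 @ M3, np.eye(3))
assert np.allclose(M3 @ Mi3, np.eye(3))

# the identity-mapped image should match the source up to interpolation rounding
print(np.abs(im_identity.astype(float) - im_source.astype(float)).max())
```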
