I have calculated a flow field in PyTorch. How do I generate a remapped image using this?

Question

I have used the RAFT model in PyTorch to calculate optical flow between two frames. Here is the code for that:

    noise_img = noise_img.to(device)
    clean_img = clean_img.to(device)
    return_index = noise_img.size(1) // 2  # use the middle frame as the reference
    aligned_frames = torch.zeros((noise_img.size(0), noise_img.size(1), noise_img.size(2), noise_img.size(3), noise_img.size(4)))
    aligned_frames[:, return_index, :, :, :] = noise_img[:, return_index, :, :, :]

    for idx in range(noise_img.size(1)):
        if idx != return_index:
            curr_frame = noise_img[:, idx, :, :, :]
            ref_frame = noise_img[:, return_index, :, :, :]
            curr_transf, ref_transf = transforms(curr_frame, ref_frame)
            curr_flow = mc_model(curr_transf, ref_transf)[-1]  # take the final flow prediction
            aligned_frames[:, idx, :, :, :] = align_frames(curr_transf, curr_flow)  # align_frames wraps warp_flow below

In the above I am passing two frames through mc_model (RAFT) to get an optical flow map. In the final line I am trying to warp the current frame so that it is aligned with the reference frame. Below is the function I use for that (called as align_frames above):

    def warp_flow(img, flow):
        # flow comes in as (N, 2, H, W); grid_sample wants (N, H, W, 2)
        flow_permute = torch.permute(flow, (0, 2, 3, 1))
        # Sample img at the locations given directly by the (permuted) flow values
        remapped = torch.nn.functional.grid_sample(img, flow_permute)
        return remapped

Unfortunately, when remapped is saved as an image it is not coherent: most of the output is zero, and what remains looks like bright ripples. I seem to be missing a step in how curr_flow should be used, but I don't quite understand what it is.

Thank you.

Answer 1

Score: 2

If I remember correctly, RAFT outputs offsets in pixel units, but torch.nn.functional.grid_sample expects normalized image coordinates in [-1, 1]. Basically, you need to use torch.meshgrid to generate pixel coordinates, add the RAFT-generated flow to them, and normalize the result to [-1, 1]. That grid is what should be passed to grid_sample.
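
A minimal sketch of that, assuming curr_flow has shape (N, 2, H, W) with channel 0 holding the horizontal (x) offset and channel 1 the vertical (y) offset in pixels (worth double-checking against your RAFT implementation), could look like this:

    import torch
    import torch.nn.functional as F

    def warp_flow(img, flow):
        # img:  (N, C, H, W) frame to warp
        # flow: (N, 2, H, W) offsets in pixel units; channel 0 = x, channel 1 = y
        _, _, h, w = flow.shape

        # Base pixel coordinates, each of shape (H, W)
        ys, xs = torch.meshgrid(
            torch.arange(h, device=flow.device, dtype=flow.dtype),
            torch.arange(w, device=flow.device, dtype=flow.dtype),
            indexing="ij",
        )
        base_grid = torch.stack((xs, ys), dim=0).unsqueeze(0)  # (1, 2, H, W)

        # Absolute sampling positions = pixel coordinates + flow, still in pixels
        pos = base_grid + flow

        # Normalize to [-1, 1], the coordinate range grid_sample expects
        pos_x = 2.0 * pos[:, 0] / (w - 1) - 1.0
        pos_y = 2.0 * pos[:, 1] / (h - 1) - 1.0
        grid = torch.stack((pos_x, pos_y), dim=-1)  # (N, H, W, 2), last dim = (x, y)

        return F.grid_sample(img, grid, mode="bilinear",
                             padding_mode="zeros", align_corners=True)

With that in place, the loop in the question would call warp_flow(curr_transf, curr_flow) as before; note that align_corners=True has to match the (size - 1) normalization used when building the grid.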
