How can I use PyTorch parameters as raw values without losing grad_fn?

Question

I'm trying to create a PyTorch model that can learn the affine transform that turns one image into another.

import torch
import torch.nn as nn
import torchvision.transforms.functional as VF

class AffineModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.angle = torch.nn.Parameter(torch.rand(1))
        self.translatex = torch.nn.Parameter(torch.rand(1))
        self.translatey = torch.nn.Parameter(torch.rand(1))
        self.scale = torch.nn.Parameter(torch.rand(1))
        self.shear = torch.nn.Parameter(torch.rand(1))

    def forward(self, x):
        return VF.affine(x, self.angle, (self.translatex, self.translatey), self.scale, self.shear)

Calling forward() fails with TypeError: Argument angle should be int or float. If I try to extract the actual values of the parameters using .item(), the model can't be trained due to RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn.

How do I satisfy the type requirements for the function without detaching the params?


Answer 1

Score: 1

The affine function was not designed to take tensors for its transform parameters: it only accepts Python scalars (int or float), so there is no way to backpropagate through these parameters and update them with gradients.

Calling .item() returns a plain Python number that is detached from the computation graph, so no gradient can be computed for that parameter during backpropagation and learning stalls.
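As a minimal illustration of why .item() severs the graph (the variable names p, v, and out are made up for this sketch):

import torch

p = torch.nn.Parameter(torch.rand(1))
v = p.item()                    # plain Python float: the graph is cut here
out = torch.tensor(v) * 2.0     # rebuilding a tensor does not reconnect it to p
print(out.requires_grad)        # False, so backward() can never reach p
print((p * 2.0).requires_grad)  # True: staying in tensor ops preserves grad_fn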

I suggest using PyTorch's affine_grid and grid_sample functions instead. They build the affine transformation in a differentiable way, so the network can learn and adjust the transformation's parameters during training through backpropagation.

import torch
import torch.nn as nn
import torch.nn.functional as F

class AffineModel(nn.Module):
    def __init__(self, in_features):
        super().__init__()
        # A fully connected layer predicts the 6 entries of the 2x3 affine matrix;
        # in_features must equal C*H*W of the flattened input image
        self.fc = nn.Linear(in_features, 6)
        # Initialize so the predicted transform starts as the identity:
        # zero weights plus a bias equal to the flattened identity matrix
        self.fc.weight.data.zero_()
        self.fc.bias.data = torch.tensor([1., 0., 0., 0., 1., 0.])

    def forward(self, x):
        theta = self.fc(x.view(x.size(0), -1))  # (N, 6)
        theta = theta.view(-1, 2, 3)            # (N, 2, 3) affine matrices
        grid = F.affine_grid(theta, x.size(), align_corners=False)
        return F.grid_sample(x, grid, align_corners=False)
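
For completeness, a hypothetical training sketch continuing from the class above (the image shapes, loss, optimizer, and the src/target tensors are illustrative assumptions, not part of the original answer):

import torch
import torch.nn.functional as F

src = torch.rand(1, 1, 28, 28)     # placeholder source image
target = torch.rand(1, 1, 28, 28)  # placeholder target image

model = AffineModel(in_features=1 * 28 * 28)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

for step in range(100):
    optimizer.zero_grad()
    loss = F.mse_loss(model(src), target)  # compare warped source to the target
    loss.backward()  # gradients flow through grid_sample/affine_grid into fc
    optimizer.step()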
