How can I use PyTorch parameters as raw values without losing grad_fn?
Question
I'm trying to create a PyTorch model that can learn the affine transform that turns one image into another:

import torch
import torch.nn as nn
import torchvision.transforms.functional as VF

class AffineModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.angle = torch.nn.Parameter(torch.rand(1))
        self.translatex = torch.nn.Parameter(torch.rand(1))
        self.translatey = torch.nn.Parameter(torch.rand(1))
        self.scale = torch.nn.Parameter(torch.rand(1))
        self.shear = torch.nn.Parameter(torch.rand(1))

    def forward(self, x):
        return VF.affine(x, self.angle, (self.translatex, self.translatey), self.scale, self.shear)
Calling forward() fails with TypeError: Argument angle should be int or float. If I try to extract the actual values of the parameters using .item(), the model can't be trained due to RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn.

How do I satisfy the type requirements for the function without detaching the parameters?
Answer 1
Score: 1
The affine function was not designed to handle tensor inputs for its parameters. It only accepts Python scalars (int or float), so gradients cannot flow back through these arguments and the parameters never receive updates.

Using .item() detaches the value from the computation graph, so no gradient is computed for that parameter during backpropagation, which blocks learning.
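To see the problem in isolation, here is a minimal sketch (my own illustration, not part of the original answer) showing that .item() returns a plain Python float with no grad_fn, while an ordinary tensor operation keeps the parameter in the graph:

import torch

p = torch.nn.Parameter(torch.rand(1))

v = p.item()  # plain Python float: detached from the graph, no grad_fn
t = p * 2.0   # tensor op: stays in the graph, gradients can flow back to p

print(type(v))          # <class 'float'>
print(t.requires_grad)  # True
t.backward()            # allowed: t has a single element
print(p.grad)           # tensor([2.])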
I suggest using PyTorch's affine_grid and grid_sample functions instead. They build the affine transformation in a differentiable manner, which means the network can learn and adjust the parameters of the transformation during training through backpropagation.
import torch
import torch.nn as nn
import torch.nn.functional as F

class AffineModel(nn.Module):
    def __init__(self, in_features):
        super().__init__()
        # A fully connected layer predicts the six affine parameters
        # from the flattened input image (in_features = C * H * W).
        self.fc = nn.Linear(in_features, 6)
        # Initialize to the identity transform: zero weights and a bias
        # equal to the flattened identity matrix [[1, 0, 0], [0, 1, 0]].
        self.fc.weight.data.zero_()
        self.fc.bias.data = torch.tensor([1.0, 0.0, 0.0, 0.0, 1.0, 0.0])

    def forward(self, x):
        theta = self.fc(x.view(x.size(0), -1))  # (N, 6)
        theta = theta.view(-1, 2, 3)            # (N, 2, 3) affine matrices
        grid = F.affine_grid(theta, x.size(), align_corners=False)
        x = F.grid_sample(x, grid, align_corners=False)
        return x
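If you want to keep the question's named scalar parameters instead of predicting theta from the image, one option (a sketch of my own, which covers rotation, scale, and translation but omits shear, and does not reproduce VF.affine's degree/pixel conventions) is to assemble the 2x3 theta matrix from the parameters using differentiable torch operations, so grad_fn is preserved end to end:

import torch
import torch.nn as nn
import torch.nn.functional as F

class GlobalAffineModel(nn.Module):
    """Learns a single global affine transform (rotation, scale, translation)."""
    def __init__(self):
        super().__init__()
        self.angle = nn.Parameter(torch.zeros(1))      # rotation in radians
        self.translate = nn.Parameter(torch.zeros(2))  # (tx, ty) in normalized [-1, 1] units
        self.scale = nn.Parameter(torch.ones(1))

    def forward(self, x):
        cos = torch.cos(self.angle) * self.scale
        sin = torch.sin(self.angle) * self.scale
        # Assemble the 2x3 matrix with cat/stack so every entry keeps its grad_fn.
        row1 = torch.cat([cos, -sin, self.translate[0:1]])
        row2 = torch.cat([sin, cos, self.translate[1:2]])
        theta = torch.stack([row1, row2]).unsqueeze(0).expand(x.size(0), -1, -1)
        grid = F.affine_grid(theta, x.size(), align_corners=False)
        return F.grid_sample(x, grid, align_corners=False)

Because theta is built from tensor operations rather than .item(), loss.backward() reaches all three parameters, and a plain optimizer such as torch.optim.Adam(model.parameters()) can fit the transform by minimizing, e.g., F.mse_loss(model(source), target).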
Comments