How do I change the output size of torch.nn.Linear while preserving existing parameters?


Question

I am programming a PyTorch model which includes the following:

    self.basis_mat = torch.nn.Linear(
        in_features=self.basis_mat_params[0],
        out_features=self.basis_mat_params[1],
        bias=self.basis_mat_params[2]).to(device)

Now what I want to do is to dynamically increment the out_features of self.basis_mat. However, I also want to preserve any previously trained parameters. In other words, as out_features is incremented in the model, any new parameters should be randomly initialized while the rest are left unchanged.

So, what should I do?

I didn't really know what to try... I checked the documentation, but it turns out that the out_features parameter of torch.nn.Linear cannot be changed after construction, and if I instantiate a new torch.nn.Linear (which I guess would be quite slow, so even if something like that works I would only use it as a last resort), its parameters are built in, random, and not adjustable.

Answer 1

Score: 0

You can do the following,

    from torch import nn
    import torch

    dim1 = 3
    dim2 = 4
    dim3 = 5
    # Stand-ins for the old parameters; with a real layer, use
    # older_linear_layer.weight.t().detach() and older_linear_layer.bias.detach().
    older_weight_parameters = torch.zeros(dim1, dim2)
    older_bias_parameters = torch.zeros(dim2)

    linear_layer = nn.Linear(dim1, dim2 + dim3)
    # nn.Linear stores its weight with shape (out_features, in_features), so the
    # first dim2 rows correspond to the old output units. Fill them, and the
    # first dim2 bias entries, with the older parameter values.
    linear_layer.weight.data[:dim2, :] = older_weight_parameters.t()
    linear_layer.bias.data[:dim2] = older_bias_parameters

Then the linear layer will have the following weight and bias parameter values. You can see that the values of the older parameters (the zeros in this example) have replaced the random parameters in the corresponding positions.

    linear_layer.weight, linear_layer.bias
    (Parameter containing:
     tensor([[ 0.0000,  0.0000,  0.0000],
             [ 0.0000,  0.0000,  0.0000],
             [ 0.0000,  0.0000,  0.0000],
             [ 0.0000,  0.0000,  0.0000],
             [ 0.3455, -0.3385,  0.4746],
             [-0.2067, -0.4950, -0.3668],
             [-0.4012, -0.3951, -0.3772],
             [-0.1393,  0.3602, -0.0460],
             [-0.4641,  0.2152,  0.4031]], requires_grad=True),
     Parameter containing:
     tensor([ 0.0000,  0.0000,  0.0000,  0.0000,  0.0502, -0.3423,  0.3871, -0.5218,
             -0.4322], requires_grad=True))

The weight and bias parameters of the Linear layer are tensors like any others; you can apply ordinary tensor manipulation to them.
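If the layer needs to grow repeatedly during training, it can be convenient to wrap this copy in a small helper. Below is a minimal sketch along those lines; the name grow_linear and its signature are my own invention, not part of PyTorch, and it uses torch.no_grad() instead of .data for the in-place copy:

    import torch
    from torch import nn

    def grow_linear(old: nn.Linear, new_out_features: int) -> nn.Linear:
        # Hypothetical helper: return a wider copy of `old`, preserving its
        # trained parameters in the first `old.out_features` output units.
        assert new_out_features >= old.out_features
        new = nn.Linear(old.in_features, new_out_features,
                        bias=old.bias is not None).to(old.weight.device)
        with torch.no_grad():
            # weight has shape (out_features, in_features): the old rows copy
            # straight into the first rows; the extra rows keep nn.Linear's
            # default random initialization.
            new.weight[:old.out_features] = old.weight
            if old.bias is not None:
                new.bias[:old.out_features] = old.bias
        return new

    # Usage: grow a 3 -> 4 layer into a 3 -> 9 layer.
    layer = nn.Linear(3, 4)
    wider = grow_linear(layer, 9)
    x = torch.randn(2, 3)
    # The first 4 outputs of the wider layer match the original layer.
    assert torch.allclose(wider(x)[:, :4], layer(x))

One caveat: because this builds a fresh nn.Linear, an existing optimizer still references the old Parameter objects, so it should be re-created (or have its parameter groups updated) after growing the layer.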
