2023-06-13 12:32:04

How do I change the output size of torch.nn.Linear while preserving existing parameters?

Question

I am programming a PyTorch model which includes the following:

    self.basis_mat = torch.nn.Linear(
        in_features=self.basis_mat_params[0],
        out_features=self.basis_mat_params[1],
        bias=self.basis_mat_params[2],
    ).to(device)

What I want to do now is dynamically increase the out_features of self.basis_mat while preserving any previously trained parameters. In other words, as out_features grows, the new parameters should be randomly initialized and the existing ones left unchanged.

So, what should I do?

I don't really know what to try. I checked the documentation, but it seems that the out_features of a torch.nn.Linear cannot be changed after construction. If I instantiate a new torch.nn.Linear instead (which I guess would be quite slow, so even if something like that works I would only use it as a last resort), its parameters are built in, randomly initialized, and, as far as I can tell, not adjustable.

Answer 1

Score: 0

You can do the following:

    import torch
    from torch import nn

    dim1 = 3  # in_features
    dim2 = 4  # old out_features
    dim3 = 5  # number of output features to add

    # Stand-ins for the trained parameters; in practice use
    # older_linear_layer.weight.t().detach() and older_linear_layer.bias.detach()
    older_weight_parameters = torch.zeros(dim1, dim2)
    older_bias_parameters = torch.zeros(dim2)

    linear_layer = nn.Linear(dim1, dim2 + dim3)
    # Fill the first dim2 output rows with the older parameter values
    linear_layer.weight.data[:dim2, :] = older_weight_parameters.t()
    linear_layer.bias.data[:dim2] = older_bias_parameters

The linear layer then has the following weight and bias values. The older parameter values (zeros in this example) have replaced the random values in the first dim2 rows of the weight and the first dim2 entries of the bias.

    linear_layer.weight, linear_layer.bias
    (Parameter containing:
     tensor([[ 0.0000,  0.0000,  0.0000],
             [ 0.0000,  0.0000,  0.0000],
             [ 0.0000,  0.0000,  0.0000],
             [ 0.0000,  0.0000,  0.0000],
             [ 0.3455, -0.3385,  0.4746],
             [-0.2067, -0.4950, -0.3668],
             [-0.4012, -0.3951, -0.3772],
             [-0.1393,  0.3602, -0.0460],
             [-0.4641,  0.2152,  0.4031]], requires_grad=True),
     Parameter containing:
     tensor([ 0.0000,  0.0000,  0.0000,  0.0000,  0.0502, -0.3423,  0.3871, -0.5218,
             -0.4322], requires_grad=True))

The weight and bias of a Linear layer are tensors like any other (wrapped in nn.Parameter), so you can apply ordinary tensor operations to them.
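
The same idea can be wrapped in a small helper. This is only a sketch, not an official PyTorch API: the function name expand_linear_out_features and its extra_out argument are made up for illustration, and it copies under torch.no_grad() with copy_() instead of assigning through .data, which is a swapped-in but commonly used alternative.

```python
import torch
from torch import nn


def expand_linear_out_features(old: nn.Linear, extra_out: int) -> nn.Linear:
    """Hypothetical helper: return a wider nn.Linear that keeps old's rows."""
    has_bias = old.bias is not None
    new = nn.Linear(
        old.in_features,
        old.out_features + extra_out,
        bias=has_bias,
        device=old.weight.device,
        dtype=old.weight.dtype,
    )
    # Copy the trained values without recording the copy in the autograd graph.
    with torch.no_grad():
        new.weight[: old.out_features].copy_(old.weight)
        if has_bias:
            new.bias[: old.out_features].copy_(old.bias)
    # Rows beyond old.out_features keep nn.Linear's default random init.
    return new


old_layer = nn.Linear(3, 4)
new_layer = expand_linear_out_features(old_layer, extra_out=5)

x = torch.randn(2, 3)
# The first 4 outputs of the wider layer should match the old layer's outputs.
same = torch.allclose(new_layer(x)[:, :4], old_layer(x))
```

In the question's model this would be used as something like self.basis_mat = expand_linear_out_features(self.basis_mat, k). Note that the new layer holds new Parameter objects, so an optimizer created before the swap still points at the old ones and would need to be rebuilt or given the new parameters.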
