Initializing two neural networks from the same class: first initialization influences the second
Question
I'm a beginner with PyTorch, and I'm attempting to implement a student-teacher architecture by initializing two networks with different hidden sizes from the same class. It seems that the first network's initialization influences the second one: specifically, I get different losses on the student network when the teacher network is initialized first, even though I'm training the student network independently of the teacher.
My NN class uses a Linear layer followed by a BatchNorm1d layer, and I'm initializing the BatchNorm weights using nn.init.uniform_. So I'm guessing this is what causes the first initialization to influence the second: either the BatchNorm layer or the Linear layer is keeping some running statistics from the first initialization.
I've tried resetting the running stats on the BatchNorm using reset_running_stats(), but that didn't change anything. Any ideas on how to solve this? Thanks.
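Roughly, the class looks something like this (a minimal sketch; the class name, sizes, and attribute names here are illustrative, not the actual code):

import torch
import torch.nn as nn

class Net(nn.Module):
    # Hypothetical reconstruction: a Linear layer followed by
    # BatchNorm1d, with the BatchNorm weights initialized via
    # nn.init.uniform_ (which draws from the global RNG).
    def __init__(self, in_features, hidden_size):
        super().__init__()
        self.fc = nn.Linear(in_features, hidden_size)
        self.bn = nn.BatchNorm1d(hidden_size)
        nn.init.uniform_(self.bn.weight)

    def forward(self, x):
        return self.bn(self.fc(x))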
Answer 1
Score: 0
Guaranteeing reproducible results when using neural networks is quite hard because of the sheer amount of randomness involved. However, one way to limit the sources of randomness is to set seeds.
This can be done in PyTorch with:
import torch
torch.manual_seed(seed) # seed is any number of your choice
You were probably getting different results depending on the order of initialization because both networks draw their initial weights from the same global random number generator: whichever network is instantiated first advances the RNG state, which changes the numbers the second one receives.
When dealing with multiple networks, try setting the seed right before instantiating each model so that they both receive the same numbers from the RNG. Something like:
torch.manual_seed(seed)
student = StudentNetwork()
torch.manual_seed(seed) # same seed as previous call
teacher = TeacherNetwork()
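A quick, self-contained way to check this order effect (using plain nn.Linear layers for illustration; StudentNetwork and TeacherNetwork would behave the same way, since all built-in layers draw their initial weights from the global RNG):

import torch
import torch.nn as nn

# Without re-seeding, the first model's initialization advances the
# global RNG, so the second model receives different weights.
torch.manual_seed(0)
a = nn.Linear(10, 4)
b = nn.Linear(10, 4)
print(torch.equal(a.weight, b.weight))  # False

# Re-seeding before each instantiation makes the draws identical.
torch.manual_seed(0)
a = nn.Linear(10, 4)
torch.manual_seed(0)
b = nn.Linear(10, 4)
print(torch.equal(a.weight, b.weight))  # True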