How does the training variable affect BatchNormalization in TensorFlow, and what is its default value?

Question

When unfreezing layers with EfficientNet, it is recommended to set the training argument to False (https://www.tensorflow.org/tutorials/images/transfer_learning?hl=es-419#important_note_about_batchnormalization_layers).

So I set it to False:

model = efficientnet.EfficientNetB1(input_shape=input_1_shape, include_top=False)

model.trainable = not freeze

x = model_trained(input_1, training=self.training_batch)

But when I change this variable and inspect the layer while debugging, I see no change in the BatchNormalization layers.

In both cases the layer has 2 trainable_variables, 2 trainable_weights, 2 non_trainable_variables, and 2 non_trainable_weights.

I am using TensorFlow 2.6.2.

I'd like to know where I could see that my change is taking effect, or get an explanation of what the argument actually changes and what its actual default value is.

Answer 1

Score: 0

The training argument of BatchNormalization determines which statistics the layer uses to normalize its inputs:

  1. When training=True, the layer normalizes each batch with that batch's own mean and variance, and updates its moving statistics.
  2. When training=False, the layer normalizes with its stored moving mean and variance, and the moving statistics are left unchanged.
  3. The default value of training is None, in which case Keras infers it from the context: True inside model.fit(), and False inside predict() or evaluate().
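To make the difference concrete, here is a minimal pure-Python sketch (not the actual Keras implementation; names, the momentum value, and epsilon are illustrative) of what the training flag does to a BN layer's moving statistics:

```python
def batch_norm(x, moving_mean, moving_var, training, momentum=0.99, eps=1e-3):
    """Toy batch norm over a list of floats, mirroring the training flag's role."""
    if training:
        # training=True: normalize with the current batch's own statistics ...
        mean = sum(x) / len(x)
        var = sum((v - mean) ** 2 for v in x) / len(x)
        # ... and nudge the moving (non-trainable) statistics toward them.
        moving_mean = momentum * moving_mean + (1 - momentum) * mean
        moving_var = momentum * moving_var + (1 - momentum) * var
    else:
        # training=False: normalize with the stored moving statistics,
        # which stay frozen.
        mean, var = moving_mean, moving_var
    y = [(v - mean) / (var + eps) ** 0.5 for v in x]
    return y, moving_mean, moving_var

batch = [10.0, 12.0, 14.0]

# Inference mode: moving statistics are untouched.
_, mm, mv = batch_norm(batch, moving_mean=0.0, moving_var=1.0, training=False)
print(mm, mv)  # 0.0 1.0

# Training mode: moving statistics drift toward the batch statistics.
_, mm, mv = batch_norm(batch, moving_mean=0.0, moving_var=1.0, training=True)
print(mm, mv)
```

In both modes the layer normalizes its inputs; what the flag switches is which statistics are used and whether the moving averages get updated.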

BatchNormalization is unusual among the built-in layers in that it has both trainable weights (gamma and beta) and non-trainable weights (moving_mean and moving_variance).
As mentioned in the important notes about BN layers, models that contain tf.keras.layers.BatchNormalization are a special case, and precautions should be taken when fine-tuning.

When you set layer.trainable = False, the BatchNormalization layer runs in inference mode and will not update its non-trainable variables (which track the mean and variance statistics), meaning those statistics are no longer updated during training.

The layer uses its non-trainable weights to keep track of the mean and variance of its inputs during training. Setting layer.trainable = False moves all of the layer's weights from trainable to non-trainable.
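That repartitioning can be sketched with a toy class (this is bookkeeping only, not the Keras Layer API): a BN layer holds four weights, and flipping trainable only changes which bucket gamma and beta are reported in.

```python
class ToyBatchNorm:
    """Toy stand-in for a BatchNormalization layer's weight bookkeeping."""

    def __init__(self):
        self.trainable = True

    @property
    def trainable_weights(self):
        # gamma and beta are trainable only while the layer itself is.
        return ["gamma", "beta"] if self.trainable else []

    @property
    def non_trainable_weights(self):
        # moving_mean / moving_variance are always non-trainable; with
        # trainable=False, gamma and beta join them.
        frozen = [] if self.trainable else ["gamma", "beta"]
        return frozen + ["moving_mean", "moving_variance"]

bn = ToyBatchNorm()
print(len(bn.trainable_weights), len(bn.non_trainable_weights))  # 2 2
bn.trainable = False
print(len(bn.trainable_weights), len(bn.non_trainable_weights))  # 0 4
```

This also shows why passing training=False alone leaves the counts at 2 and 2, as observed in the question: the training call argument selects which statistics are used, while only the trainable attribute moves weights between the two buckets.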

You can check the updated weight counts with:

print(len(model.trainable_weights))      # trainable gamma/beta of unfrozen layers
# print(model.trainable_weights)

print(len(model.non_trainable_weights))  # moving statistics plus frozen weights
# print(model.non_trainable_weights[:5])

Output:

2
260

You can refer to this link for more details.

huangapple
  • Posted on 2023-05-26 17:07:22
  • When reposting, please keep this link: https://go.coder-hub.com/76339309.html