How does the training variable affect BatchNormalization in TensorFlow, and what is its default value?

Question

When unfreezing layers with EfficientNet, it is recommended to set the training argument to False (https://www.tensorflow.org/tutorials/images/transfer_learning?hl=es-419#important_note_about_batchnormalization_layers).

So I set it to False:

model = efficientnet.EfficientNetB1(input_shape=input_1_shape, include_top=False)

model.trainable = not freeze

x = model_trained(input_1, training=self.training_batch)

But when I change this variable and inspect the layer while debugging, I see no change in the BatchNormalization layers.

In both cases the layer has 2 trainable_variables, 2 trainable_weights, 2 non_trainable_variables, and 2 non_trainable_weights.

I am using TensorFlow 2.6.2.

I'd like to know where I could see that my change is taking effect, or get an explanation of what the argument actually changes and what its actual default value is.

Answer 1

Score: 0

The training argument of BatchNormalization determines which statistics the layer uses to normalize its inputs:

  1. When training=True, the layer normalizes each batch with that batch's own mean and variance, and updates its moving statistics.
  2. When training=False, the layer normalizes with its stored moving mean and variance, and the moving statistics are left unchanged.
  3. The default value of training is None, in which case Keras infers it from the context: True inside model.fit(), and False inside predict() or evaluate().
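To make the difference concrete, here is a minimal pure-Python sketch (not the actual Keras implementation; names, the momentum value, and epsilon are illustrative) of what the training flag does to a BN layer's moving statistics:

```python
def batch_norm(x, moving_mean, moving_var, training, momentum=0.99, eps=1e-3):
    """Toy batch norm over a list of floats, mirroring the training flag's role."""
    if training:
        # training=True: normalize with the current batch's own statistics ...
        mean = sum(x) / len(x)
        var = sum((v - mean) ** 2 for v in x) / len(x)
        # ... and nudge the moving (non-trainable) statistics toward them.
        moving_mean = momentum * moving_mean + (1 - momentum) * mean
        moving_var = momentum * moving_var + (1 - momentum) * var
    else:
        # training=False: normalize with the stored moving statistics,
        # which stay frozen.
        mean, var = moving_mean, moving_var
    y = [(v - mean) / (var + eps) ** 0.5 for v in x]
    return y, moving_mean, moving_var

batch = [10.0, 12.0, 14.0]

# Inference mode: moving statistics are untouched.
_, mm, mv = batch_norm(batch, moving_mean=0.0, moving_var=1.0, training=False)
print(mm, mv)  # 0.0 1.0

# Training mode: moving statistics drift toward the batch statistics.
_, mm, mv = batch_norm(batch, moving_mean=0.0, moving_var=1.0, training=True)
print(mm, mv)
```

In both modes the layer normalizes its inputs; what the flag switches is which statistics are used and whether the moving averages get updated.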

BatchNormalization is unusual among the built-in layers in that it has both trainable weights (gamma and beta) and non-trainable weights (moving_mean and moving_variance).
As mentioned in the important notes about BN layers, models that contain tf.keras.layers.BatchNormalization are a special case, and precautions should be taken when fine-tuning.

When you set layer.trainable = False, the BatchNormalization layer runs in inference mode and will not update its non-trainable variables (which track the mean and variance statistics), meaning those statistics are no longer updated during training.

The layer uses its non-trainable weights to keep track of the mean and variance of its inputs during training. Setting layer.trainable = False moves all of the layer's weights from trainable to non-trainable.
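That repartitioning can be sketched with a toy class (this is bookkeeping only, not the Keras Layer API): a BN layer holds four weights, and flipping trainable only changes which bucket gamma and beta are reported in.

```python
class ToyBatchNorm:
    """Toy stand-in for a BatchNormalization layer's weight bookkeeping."""

    def __init__(self):
        self.trainable = True

    @property
    def trainable_weights(self):
        # gamma and beta are trainable only while the layer itself is.
        return ["gamma", "beta"] if self.trainable else []

    @property
    def non_trainable_weights(self):
        # moving_mean / moving_variance are always non-trainable; with
        # trainable=False, gamma and beta join them.
        frozen = [] if self.trainable else ["gamma", "beta"]
        return frozen + ["moving_mean", "moving_variance"]

bn = ToyBatchNorm()
print(len(bn.trainable_weights), len(bn.non_trainable_weights))  # 2 2
bn.trainable = False
print(len(bn.trainable_weights), len(bn.non_trainable_weights))  # 0 4
```

This also shows why passing training=False alone leaves the counts at 2 and 2, as observed in the question: the training call argument selects which statistics are used, while only the trainable attribute moves weights between the two buckets.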

You can check the updated weight counts with:

print(len(model.trainable_weights))      # trainable gamma/beta of unfrozen layers
# print(model.trainable_weights)

print(len(model.non_trainable_weights))  # moving statistics plus frozen weights
# print(model.non_trainable_weights[:5])

Output:

2
260

You can refer to this link for more details.

huangapple
  • Posted on 2023-05-26 17:07:22
  • When reposting, please keep this link: https://go.coder-hub.com/76339309.html