TensorFlow text classification sample: why is from_logits=True needed?

Question
I'm running the basic text classification sample from TensorFlow, here.

One thing I don't understand is why we need from_logits=True with the BinaryCrossentropy loss. When I tried removing it and adding activation="sigmoid" to the last Dense layer instead, binary_accuracy did not move at all during training.
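(For context: from_logits=True tells the loss to apply the sigmoid itself, so mathematically the two setups compute the same loss. A plain-Python sketch of that equivalence, without TensorFlow, using the standard numerically stable logits formula:)

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def bce(y, p):
    # binary cross-entropy on a probability p in (0, 1)
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))

def bce_from_logits(y, z):
    # numerically stable form, computed directly on the raw logit z
    return max(z, 0) - y * z + math.log(1 + math.exp(-abs(z)))

z, y = 1.3, 1                  # raw model output (logit) and label
a = bce(y, sigmoid(z))         # sigmoid layer + BinaryCrossentropy()
b = bce_from_logits(y, z)      # linear layer + from_logits=True
assert abs(a - b) < 1e-12      # same loss either way
```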
Changed code:
model = tf.keras.Sequential([
    layers.Embedding(max_features + 1, embedding_dim),
    layers.Dropout(0.2),
    layers.GlobalAveragePooling1D(),
    layers.Dropout(0.2),
    layers.Dense(1, activation="sigmoid")])  # <-- Add activation="sigmoid" here

model.compile(loss=losses.BinaryCrossentropy(),  # <-- Remove from_logits=True here
              optimizer='adam',
              metrics=tf.metrics.BinaryAccuracy(threshold=0.0))

epochs = 10
history = model.fit(
    train_ds,
    validation_data=val_ds,
    epochs=epochs)
Training output:
Epoch 1/10
625/625 [==============================] - 4s 4ms/step - loss: 0.6635 - binary_accuracy: 0.4981 - val_loss: 0.6149 - val_binary_accuracy: 0.5076
Epoch 2/10
625/625 [==============================] - 2s 4ms/step - loss: 0.5492 - binary_accuracy: 0.4981 - val_loss: 0.4990 - val_binary_accuracy: 0.5076
Epoch 3/10
625/625 [==============================] - 2s 4ms/step - loss: 0.4453 - binary_accuracy: 0.4981 - val_loss: 0.4208 - val_binary_accuracy: 0.5076
Epoch 4/10
625/625 [==============================] - 2s 4ms/step - loss: 0.3792 - binary_accuracy: 0.4981 - val_loss: 0.3741 - val_binary_accuracy: 0.5076
Epoch 5/10
625/625 [==============================] - 3s 4ms/step - loss: 0.3360 - binary_accuracy: 0.4981 - val_loss: 0.3454 - val_binary_accuracy: 0.5076
Epoch 6/10
625/625 [==============================] - 3s 4ms/step - loss: 0.3054 - binary_accuracy: 0.4981 - val_loss: 0.3262 - val_binary_accuracy: 0.5076
Epoch 7/10
625/625 [==============================] - 3s 4ms/step - loss: 0.2813 - binary_accuracy: 0.4981 - val_loss: 0.3126 - val_binary_accuracy: 0.5076
Epoch 8/10
625/625 [==============================] - 3s 4ms/step - loss: 0.2616 - binary_accuracy: 0.4981 - val_loss: 0.3033 - val_binary_accuracy: 0.5076
Epoch 9/10
625/625 [==============================] - 3s 4ms/step - loss: 0.2456 - binary_accuracy: 0.4981 - val_loss: 0.2967 - val_binary_accuracy: 0.5076
Epoch 10/10
625/625 [==============================] - 2s 4ms/step - loss: 0.2306 - binary_accuracy: 0.4981 - val_loss: 0.2920 - val_binary_accuracy: 0.5076
Answer 1
Score: 0
It seems the model is training normally, but the metric used to show how well it is training is being computed incorrectly.

I think the threshold in BinaryAccuracy is affecting the metric. Because the input to the loss function is now the post-sigmoid output, the predictions range between 0 and 1, but your BinaryAccuracy threshold is still 0.0 when it should be 0.5. With threshold=0.0, every sigmoid output is counted as class 1, so the reported accuracy stays frozen at the fraction of positive labels no matter how much the loss improves.

Try changing that value to 0.5 if you want to modify the model architecture this way.
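The stuck metric can be reproduced without TensorFlow. Below is a plain-Python sketch of the comparison BinaryAccuracy performs (Keras counts a prediction as class 1 when it is strictly greater than threshold):

```python
def binary_accuracy(y_true, y_pred, threshold):
    # A prediction counts as class 1 when it exceeds `threshold`.
    hits = sum(int(p > threshold) == t for t, p in zip(y_true, y_pred))
    return hits / len(y_true)

# Sigmoid outputs always lie in (0, 1), so with threshold=0.0 every
# prediction is counted as class 1 and accuracy freezes at the
# fraction of positive labels, no matter how good the model gets.
y_true = [0, 1, 0, 1]
y_pred = [0.1, 0.9, 0.4, 0.7]   # post-sigmoid probabilities

print(binary_accuracy(y_true, y_pred, threshold=0.0))  # 0.5 (stuck)
print(binary_accuracy(y_true, y_pred, threshold=0.5))  # 1.0
```

With the sigmoid output layer, compiling with metrics=tf.metrics.BinaryAccuracy(threshold=0.5), or simply metrics=['accuracy'], should make the reported accuracy track the falling loss. (That your logs show binary_accuracy pinned at 0.4981 while the loss keeps dropping is exactly this effect.)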