Why does the TensorFlow text classification sample need from_logits=True?



I’m running the basic text classification sample from TensorFlow here.

One thing I don’t understand is why we need to use from_logits=True with the BinaryCrossentropy loss. When I tried to remove it and add activation="sigmoid" to the last Dense layer instead, binary_accuracy did not move at all during training.

Changed code:

  model = tf.keras.Sequential([
      layers.Embedding(max_features + 1, embedding_dim),
      layers.Dropout(0.2),
      layers.GlobalAveragePooling1D(),
      layers.Dropout(0.2),
      layers.Dense(1, activation="sigmoid")])  # <-- Add activation="sigmoid" here

  model.compile(loss=losses.BinaryCrossentropy(),  # <-- Remove from_logits=True here
                optimizer='adam',
                metrics=tf.metrics.BinaryAccuracy(threshold=0.0))

  epochs = 10
  history = model.fit(
      train_ds,
      validation_data=val_ds,
      epochs=epochs)

Training outputs:

  Epoch 1/10
  625/625 [==============================] - 4s 4ms/step - loss: 0.6635 - binary_accuracy: 0.4981 - val_loss: 0.6149 - val_binary_accuracy: 0.5076
  Epoch 2/10
  625/625 [==============================] - 2s 4ms/step - loss: 0.5492 - binary_accuracy: 0.4981 - val_loss: 0.4990 - val_binary_accuracy: 0.5076
  Epoch 3/10
  625/625 [==============================] - 2s 4ms/step - loss: 0.4453 - binary_accuracy: 0.4981 - val_loss: 0.4208 - val_binary_accuracy: 0.5076
  Epoch 4/10
  625/625 [==============================] - 2s 4ms/step - loss: 0.3792 - binary_accuracy: 0.4981 - val_loss: 0.3741 - val_binary_accuracy: 0.5076
  Epoch 5/10
  625/625 [==============================] - 3s 4ms/step - loss: 0.3360 - binary_accuracy: 0.4981 - val_loss: 0.3454 - val_binary_accuracy: 0.5076
  Epoch 6/10
  625/625 [==============================] - 3s 4ms/step - loss: 0.3054 - binary_accuracy: 0.4981 - val_loss: 0.3262 - val_binary_accuracy: 0.5076
  Epoch 7/10
  625/625 [==============================] - 3s 4ms/step - loss: 0.2813 - binary_accuracy: 0.4981 - val_loss: 0.3126 - val_binary_accuracy: 0.5076
  Epoch 8/10
  625/625 [==============================] - 3s 4ms/step - loss: 0.2616 - binary_accuracy: 0.4981 - val_loss: 0.3033 - val_binary_accuracy: 0.5076
  Epoch 9/10
  625/625 [==============================] - 3s 4ms/step - loss: 0.2456 - binary_accuracy: 0.4981 - val_loss: 0.2967 - val_binary_accuracy: 0.5076
  Epoch 10/10
  625/625 [==============================] - 2s 4ms/step - loss: 0.2306 - binary_accuracy: 0.4981 - val_loss: 0.2920 - val_binary_accuracy: 0.5076

Answer 1

Score: 0


It seems like the model is training normally; it is the metric used to show you how the model is doing that is miscalculated at the moment.

I think the threshold argument of BinaryAccuracy is affecting the metric result. Because you've changed the input to the loss function to the output of a sigmoid, the predictions now range between 0 and 1, but your BinaryAccuracy threshold is still 0.0, when it should be 0.5.

Try changing that value to 0.5 if you want to modify the model architecture this way.
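The pinned metric is easy to reproduce without TensorFlow. Below is a minimal sketch in plain Python, with a hand-rolled sigmoid and accuracy function standing in for tf.metrics.BinaryAccuracy, and made-up labels and logits: with a 0.0 threshold applied to sigmoid outputs, every prediction counts as class 1, so the metric freezes at the positive-class fraction of the data (the constant 0.4981 seen in the training log).

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def binary_accuracy(y_true, y_pred, threshold):
    """Fraction of examples where (prediction > threshold) matches the label."""
    hits = sum(1 for t, p in zip(y_true, y_pred) if (p > threshold) == bool(t))
    return hits / len(y_true)

# Hypothetical labels and raw model outputs (logits).
y_true = [1, 0, 1, 0, 0]
logits = [2.0, -1.5, 0.3, -0.2, -3.0]
probs = [sigmoid(z) for z in logits]  # all strictly between 0 and 1

# threshold=0.0: every probability exceeds it, so all predictions are
# class 1 and accuracy is just the fraction of positive labels.
print(binary_accuracy(y_true, probs, threshold=0.0))  # 0.4

# threshold=0.5: the metric now reflects the model's actual decisions.
print(binary_accuracy(y_true, probs, threshold=0.5))  # 1.0
```

This also shows why the original tutorial uses threshold=0.0: there the model outputs raw logits, whose natural decision boundary is 0, whereas sigmoid outputs need the boundary at 0.5.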

huangapple
  • Posted on 2023-06-12 09:35:40
  • Please keep this link when reposting: https://go.coder-hub.com/76453198.html