Tensorflow text classification sample why need from_logits=True?

Question


I'm running the basic text classification sample from Tensorflow here.

One thing I don't understand is why we need to use from_logits=True with the BinaryCrossentropy loss. When I tried to remove it and add activation="sigmoid" to the last Dense layer instead, binary_accuracy did not move at all during training.
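For context on what from_logits=True means: it tells the loss to apply the sigmoid internally and compute the cross-entropy directly from the raw logit. A small pure-Python sketch (using the standard cross-entropy formulas, not TensorFlow internals) showing the two forms compute the same loss:

```python
import math

def sigmoid(z):
    # map a raw logit to a probability in (0, 1)
    return 1.0 / (1.0 + math.exp(-z))

def bce_from_prob(y, p):
    # binary cross-entropy on a probability (from_logits=False, the default)
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))

def bce_from_logit(y, z):
    # numerically stable equivalent on the raw logit (what from_logits=True does)
    return max(z, 0.0) - z * y + math.log(1.0 + math.exp(-abs(z)))

for z, y in [(1.3, 1.0), (-0.7, 0.0), (2.5, 0.0)]:
    assert abs(bce_from_prob(y, sigmoid(z)) - bce_from_logit(y, z)) < 1e-9
```

So moving the sigmoid into the model and dropping from_logits=True should give the same loss; the problem below is in the metric, not the loss.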

Changed code:

import tensorflow as tf
from tensorflow.keras import layers, losses

model = tf.keras.Sequential([
  layers.Embedding(max_features + 1, embedding_dim),
  layers.Dropout(0.2),
  layers.GlobalAveragePooling1D(),
  layers.Dropout(0.2),
  layers.Dense(1, activation="sigmoid")]) # <-- Add activation = sigmoid here

model.compile(loss=losses.BinaryCrossentropy(), # <-- Remove from_logits=True here
              optimizer='adam',
              metrics=tf.metrics.BinaryAccuracy(threshold=0.0))

epochs = 10
history = model.fit(
    train_ds,
    validation_data=val_ds,
    epochs=epochs)

Training outputs:

    Epoch 1/10
    625/625 [==============================] - 4s 4ms/step - loss: 0.6635 - binary_accuracy: 0.4981 - val_loss: 0.6149 - val_binary_accuracy: 0.5076
    Epoch 2/10
    625/625 [==============================] - 2s 4ms/step - loss: 0.5492 - binary_accuracy: 0.4981 - val_loss: 0.4990 - val_binary_accuracy: 0.5076
    Epoch 3/10
    625/625 [==============================] - 2s 4ms/step - loss: 0.4453 - binary_accuracy: 0.4981 - val_loss: 0.4208 - val_binary_accuracy: 0.5076
    Epoch 4/10
    625/625 [==============================] - 2s 4ms/step - loss: 0.3792 - binary_accuracy: 0.4981 - val_loss: 0.3741 - val_binary_accuracy: 0.5076
    Epoch 5/10
    625/625 [==============================] - 3s 4ms/step - loss: 0.3360 - binary_accuracy: 0.4981 - val_loss: 0.3454 - val_binary_accuracy: 0.5076
    Epoch 6/10
    625/625 [==============================] - 3s 4ms/step - loss: 0.3054 - binary_accuracy: 0.4981 - val_loss: 0.3262 - val_binary_accuracy: 0.5076
    Epoch 7/10
    625/625 [==============================] - 3s 4ms/step - loss: 0.2813 - binary_accuracy: 0.4981 - val_loss: 0.3126 - val_binary_accuracy: 0.5076
    Epoch 8/10
    625/625 [==============================] - 3s 4ms/step - loss: 0.2616 - binary_accuracy: 0.4981 - val_loss: 0.3033 - val_binary_accuracy: 0.5076
    Epoch 9/10
    625/625 [==============================] - 3s 4ms/step - loss: 0.2456 - binary_accuracy: 0.4981 - val_loss: 0.2967 - val_binary_accuracy: 0.5076
    Epoch 10/10
    625/625 [==============================] - 2s 4ms/step - loss: 0.2306 - binary_accuracy: 0.4981 - val_loss: 0.2920 - val_binary_accuracy: 0.5076

Answer 1

Score: 0


It seems like the model is training normally, but the metric used to show how training is going is computed incorrectly at the moment.

I think the threshold in BinaryAccuracy is affecting the result of the metric. Because the input to the loss function is now the value after the sigmoid, predictions range between 0 and 1, but your BinaryAccuracy threshold is still 0.0, when it should be 0.5.

Try changing that value to 0.5 if you want to modify the model architecture this way.
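To see why the metric freezes, here is a minimal pure-Python sketch of the rule BinaryAccuracy applies (per the Keras docs, a prediction counts as 1 when y_pred > threshold); the function below is an illustration, not the library's actual implementation:

```python
def binary_accuracy(y_true, y_pred, threshold):
    # re-implementation of the documented rule: predict 1 iff y_pred > threshold
    preds = [1.0 if p > threshold else 0.0 for p in y_pred]
    return sum(1 for p, t in zip(preds, y_true) if p == t) / len(y_true)

y_true = [0.0, 1.0, 0.0, 1.0]
y_pred = [0.1, 0.9, 0.2, 0.8]   # sigmoid outputs, all strictly positive

# threshold=0.0: every sigmoid output exceeds 0, so every prediction becomes 1
# and the metric is stuck at the fraction of positive labels
print(binary_accuracy(y_true, y_pred, 0.0))  # 0.5

# threshold=0.5 binarizes the probabilities correctly
print(binary_accuracy(y_true, y_pred, 0.5))  # 1.0
```

With the sigmoid activation kept on the last layer, the fix is therefore metrics=tf.metrics.BinaryAccuracy(threshold=0.5) in model.compile.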

huangapple
  • Posted on 2023-06-12 09:35:40
  • When reposting, please keep this link: https://go.coder-hub.com/76453198.html