Why does the TensorFlow text classification sample need from_logits=True?

Question
I'm running the basic text classification sample from TensorFlow here.
One thing I don't understand is why we need from_logits=True with the BinaryCrossentropy loss. When I tried removing it and adding activation="sigmoid" to the last Dense layer instead, binary_accuracy does not move at all during training.
Changed code:
model = tf.keras.Sequential([
    layers.Embedding(max_features + 1, embedding_dim),
    layers.Dropout(0.2),
    layers.GlobalAveragePooling1D(),
    layers.Dropout(0.2),
    layers.Dense(1, activation="sigmoid")])  # <-- Add activation="sigmoid" here

model.compile(loss=losses.BinaryCrossentropy(),  # <-- Remove from_logits=True here
              optimizer='adam',
              metrics=tf.metrics.BinaryAccuracy(threshold=0.0))
epochs = 10
history = model.fit(
    train_ds,
    validation_data=val_ds,
    epochs=epochs)
Training outputs:
Epoch 1/10
625/625 [==============================] - 4s 4ms/step - loss: 0.6635 - binary_accuracy: 0.4981 - val_loss: 0.6149 - val_binary_accuracy: 0.5076
Epoch 2/10
625/625 [==============================] - 2s 4ms/step - loss: 0.5492 - binary_accuracy: 0.4981 - val_loss: 0.4990 - val_binary_accuracy: 0.5076
Epoch 3/10
625/625 [==============================] - 2s 4ms/step - loss: 0.4453 - binary_accuracy: 0.4981 - val_loss: 0.4208 - val_binary_accuracy: 0.5076
Epoch 4/10
625/625 [==============================] - 2s 4ms/step - loss: 0.3792 - binary_accuracy: 0.4981 - val_loss: 0.3741 - val_binary_accuracy: 0.5076
Epoch 5/10
625/625 [==============================] - 3s 4ms/step - loss: 0.3360 - binary_accuracy: 0.4981 - val_loss: 0.3454 - val_binary_accuracy: 0.5076
Epoch 6/10
625/625 [==============================] - 3s 4ms/step - loss: 0.3054 - binary_accuracy: 0.4981 - val_loss: 0.3262 - val_binary_accuracy: 0.5076
Epoch 7/10
625/625 [==============================] - 3s 4ms/step - loss: 0.2813 - binary_accuracy: 0.4981 - val_loss: 0.3126 - val_binary_accuracy: 0.5076
Epoch 8/10
625/625 [==============================] - 3s 4ms/step - loss: 0.2616 - binary_accuracy: 0.4981 - val_loss: 0.3033 - val_binary_accuracy: 0.5076
Epoch 9/10
625/625 [==============================] - 3s 4ms/step - loss: 0.2456 - binary_accuracy: 0.4981 - val_loss: 0.2967 - val_binary_accuracy: 0.5076
Epoch 10/10
625/625 [==============================] - 2s 4ms/step - loss: 0.2306 - binary_accuracy: 0.4981 - val_loss: 0.2920 - val_binary_accuracy: 0.5076
Answer 1

Score: 0
It seems the model is training normally, but the metric used to report how training is going is computed incorrectly at the moment.
The threshold argument of BinaryAccuracy is affecting the result of the metric. Because you've changed the input to the loss function to the output of a sigmoid, the values now range between 0 and 1, but your BinaryAccuracy threshold is still 0.0 when it should be 0.5. With a threshold of 0.0, every sigmoid output counts as class 1, so the reported accuracy never moves even while the loss decreases.
Try changing that value to 0.5 if you want to modify the model architecture this way.