Why does the TensorFlow text classification sample need from_logits=True?



I’m running the basic text classification sample from TensorFlow here.

One thing I don’t understand is why we need to use from_logits=True with the BinaryCrossentropy loss. When I tried to remove it and add activation="sigmoid" to the last Dense layer instead, binary_accuracy did not move at all during training.

Changed code:

  model = tf.keras.Sequential([
      layers.Embedding(max_features + 1, embedding_dim),
      layers.Dropout(0.2),
      layers.GlobalAveragePooling1D(),
      layers.Dropout(0.2),
      layers.Dense(1, activation="sigmoid")])  # <-- Add activation="sigmoid" here

  model.compile(loss=losses.BinaryCrossentropy(),  # <-- Remove from_logits=True here
                optimizer='adam',
                metrics=tf.metrics.BinaryAccuracy(threshold=0.0))

  epochs = 10
  history = model.fit(
      train_ds,
      validation_data=val_ds,
      epochs=epochs)

Training outputs:

  Epoch 1/10
  625/625 [==============================] - 4s 4ms/step - loss: 0.6635 - binary_accuracy: 0.4981 - val_loss: 0.6149 - val_binary_accuracy: 0.5076
  Epoch 2/10
  625/625 [==============================] - 2s 4ms/step - loss: 0.5492 - binary_accuracy: 0.4981 - val_loss: 0.4990 - val_binary_accuracy: 0.5076
  Epoch 3/10
  625/625 [==============================] - 2s 4ms/step - loss: 0.4453 - binary_accuracy: 0.4981 - val_loss: 0.4208 - val_binary_accuracy: 0.5076
  Epoch 4/10
  625/625 [==============================] - 2s 4ms/step - loss: 0.3792 - binary_accuracy: 0.4981 - val_loss: 0.3741 - val_binary_accuracy: 0.5076
  Epoch 5/10
  625/625 [==============================] - 3s 4ms/step - loss: 0.3360 - binary_accuracy: 0.4981 - val_loss: 0.3454 - val_binary_accuracy: 0.5076
  Epoch 6/10
  625/625 [==============================] - 3s 4ms/step - loss: 0.3054 - binary_accuracy: 0.4981 - val_loss: 0.3262 - val_binary_accuracy: 0.5076
  Epoch 7/10
  625/625 [==============================] - 3s 4ms/step - loss: 0.2813 - binary_accuracy: 0.4981 - val_loss: 0.3126 - val_binary_accuracy: 0.5076
  Epoch 8/10
  625/625 [==============================] - 3s 4ms/step - loss: 0.2616 - binary_accuracy: 0.4981 - val_loss: 0.3033 - val_binary_accuracy: 0.5076
  Epoch 9/10
  625/625 [==============================] - 3s 4ms/step - loss: 0.2456 - binary_accuracy: 0.4981 - val_loss: 0.2967 - val_binary_accuracy: 0.5076
  Epoch 10/10
  625/625 [==============================] - 2s 4ms/step - loss: 0.2306 - binary_accuracy: 0.4981 - val_loss: 0.2920 - val_binary_accuracy: 0.5076

Answer 1

Score: 0


It seems like the model is training normally; it is the metric used to show you how the model is doing that is miscalculated at the moment.

I think the threshold argument of BinaryAccuracy is affecting the metric result. Because you've changed the input to the loss function to the output of a sigmoid, the predictions now range between 0 and 1, but your BinaryAccuracy threshold is still 0.0, when it should be 0.5.

Try changing that value to 0.5 if you want to modify the model architecture this way.
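The pinned metric is easy to reproduce without TensorFlow. Below is a minimal sketch in plain Python, with a hand-rolled sigmoid and accuracy function standing in for tf.metrics.BinaryAccuracy, and made-up labels and logits: with a 0.0 threshold applied to sigmoid outputs, every prediction counts as class 1, so the metric freezes at the positive-class fraction of the data (the constant 0.4981 seen in the training log).

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def binary_accuracy(y_true, y_pred, threshold):
    """Fraction of examples where (prediction > threshold) matches the label."""
    hits = sum(1 for t, p in zip(y_true, y_pred) if (p > threshold) == bool(t))
    return hits / len(y_true)

# Hypothetical labels and raw model outputs (logits).
y_true = [1, 0, 1, 0, 0]
logits = [2.0, -1.5, 0.3, -0.2, -3.0]
probs = [sigmoid(z) for z in logits]  # all strictly between 0 and 1

# threshold=0.0: every probability exceeds it, so all predictions are
# class 1 and accuracy is just the fraction of positive labels.
print(binary_accuracy(y_true, probs, threshold=0.0))  # 0.4

# threshold=0.5: the metric now reflects the model's actual decisions.
print(binary_accuracy(y_true, probs, threshold=0.5))  # 1.0
```

This also shows why the original tutorial uses threshold=0.0: there the model outputs raw logits, whose natural decision boundary is 0, whereas sigmoid outputs need the boundary at 0.5.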

huangapple
  • Posted on 2023-06-12 09:35:40
  • Please keep this link when reposting: https://go.coder-hub.com/76453198.html