Custom loss function for a multi-task model


Question


I'm fine-tuning a Keras model that outputs 3 different predictions for 3 subtasks. The model output is a list of three tensors with these shapes:

out = [[batch_size, 5], [batch_size, 6], [batch_size, 6]]

I only want to compute the categorical cross-entropy loss for the 3rd output, so I defined a simple custom function:

def my_loss_fn(y_true, y_pred):
    out = y_pred[-1]
    return tf.keras.losses.CategoricalCrossentropy()(y_true, out)

However, TensorFlow complains: ValueError: Shapes (96, 6) and (5,) are incompatible.

It seems as though y_pred[-1] only returns the elements at the final index of the model's first output.

How do I ignore the first two model outputs and consider only the last output when computing the loss?

Answer 1

Score: 1

We can define a loss function for each output of a multi-output model. To do that, give the model's output layers names and use those names as keys in the loss dictionary passed to compile. One way to do this is as follows.

import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers, utils

(xtrain, ytrain), (_, _) = keras.datasets.mnist.load_data()
y_out_a = utils.to_categorical(ytrain, num_classes=10) 
y_out_b = (ytrain % 2 == 0).astype('float32')
y_out_c = tf.square(tf.cast(ytrain, tf.float32))
batch_size = 32
data_image = tf.data.Dataset.from_tensor_slices(
     xtrain[..., None]
)
data_label = tf.data.Dataset.from_tensor_slices(
     (y_out_a, y_out_b, y_out_c)
)
dataset = tf.data.Dataset.zip((data_image, data_label))
dataset = dataset.shuffle(buffer_size=8 * batch_size)
dataset = dataset.batch(batch_size).prefetch(tf.data.AUTOTUNE)

# Inspect one batch: the labels arrive as a tuple, one entry per model output.
x, y = next(iter(dataset))
y[0].shape, y[1].shape, y[2].shape
# Output: (TensorShape([32, 10]), TensorShape([32]), TensorShape([32]))
inputs = keras.Input(shape=(28, 28, 1))  # renamed to avoid shadowing the built-in input()
x = layers.Flatten()(inputs)
x = layers.Dense(128, activation='relu')(x)
# Name each output layer; these names are the keys used in the loss dictionary.
out_a = layers.Dense(10, activation='softmax', name='10cls')(x)
out_b = layers.Dense(1, activation='sigmoid', name='2cls')(x)
out_c = layers.Dense(1, activation='linear', name='1rg')(x)
func_model = keras.Model(
    inputs=[inputs], outputs=[out_a, out_b, out_c]
)
def categorical(y_true, y_pred):
    return keras.losses.CategoricalCrossentropy()(y_true, y_pred) 

def binary(y_true, y_pred):
    return keras.losses.BinaryCrossentropy()(y_true, y_pred) 

def mse(y_true, y_pred):
    return keras.losses.MeanSquaredError()(y_true, y_pred) 

# Compile the model with a loss for the target output only.
func_model.compile(
    # Keys are output-layer names; uncomment the other entries to train those outputs too.
    loss={
        "10cls": categorical,
        # "2cls": binary,
        # "1rg": mse,
    },
    optimizer=keras.optimizers.Adam()
)

func_model.fit(
    dataset.take(100),
)
# Output: 4ms/step - loss: 17.5582 - 10cls_loss: 17.5582
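
The training log reports a total loss equal to 10cls_loss. That is what one would expect when only one output has a loss, because Keras minimizes the weighted sum of the per-output losses and an output without a loss contributes nothing. Below is a minimal pure-Python sketch of that bookkeeping. The function total_loss and its arguments are hypothetical names used for illustration, not part of the Keras API, and the values in the second call are made up.

```python
def total_loss(per_output_losses, loss_weights=None):
    """Combine per-output losses into one scalar as a weighted sum.

    per_output_losses: dict mapping output name -> loss value; an output
        with no loss configured is simply absent from the dict.
    loss_weights: optional dict mapping output name -> weight (default 1.0).
    """
    loss_weights = loss_weights or {}
    return sum(
        loss_weights.get(name, 1.0) * value
        for name, value in per_output_losses.items()
    )


# Only "10cls" has a loss, so the total equals that output's loss.
only_cls = total_loss({"10cls": 17.5582})

# With all three outputs enabled (made-up values), the total is the weighted sum.
all_three = total_loss(
    {"10cls": 2.0, "2cls": 0.5, "1rg": 4.0},
    loss_weights={"1rg": 0.25},
)
print(only_cls, all_three)
```

In Keras itself the per-output weights are set through the loss_weights argument of compile.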

Some resources that may also help.
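
On the error in the question: when a single loss function is passed to compile, Keras applies it to each output separately, so inside my_loss_fn the argument y_pred is one output tensor rather than the list of three. Indexing it with [-1] then selects the last row of the batch, which would explain the (5,) shape in the error message. The sketch below reproduces only that shape arithmetic, with NumPy arrays standing in for tensors. The array names are illustrative and the batch size of 96 is taken from the error message.

```python
import numpy as np

batch_size = 96

# Stand-ins for the three model outputs described in the question.
outputs = [
    np.zeros((batch_size, 5)),
    np.zeros((batch_size, 6)),
    np.zeros((batch_size, 6)),
]
y_true = np.zeros((batch_size, 6))

# Inside the loss, y_pred is a single output (here the first one),
# so [-1] picks the last sample's row, not the last model output.
y_pred = outputs[0]
last_row = y_pred[-1]

print(y_true.shape, last_row.shape)
```

With the per-output loss dictionary from the answer no indexing is needed, since each loss function only ever receives its own output.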

huangapple
  • Posted on March 1, 2023, 10:47:17
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75599143.html