Weighted Pixel Wise Categorical Cross Entropy for Semantic Segmentation
Question
I have recently started learning about semantic segmentation and am trying to train a UNet for it. My inputs are 128x128x3 RGB images. My masks are made up of 4 classes (0, 1, 2, 3) and are one-hot encoded with dimension 128x128x4.
def weighted_cce(y_true, y_pred):
    weights = []
    t_inf = tf.convert_to_tensor(1e9, dtype = 'float32')
    t_zero = tf.convert_to_tensor(0, dtype = 'int64')
    for i in range(0, 4):
        l = tf.argmax(y_true, axis = -1) == i
        n = tf.cast(tf.math.count_nonzero(l), 'float32') + K.epsilon()
        weights.append(n)
    weights = [batch_size/j for j in weights]
    y_pred /= K.sum(y_pred, axis=-1, keepdims=True)
    # clip to prevent NaN's and Inf's
    y_pred = K.clip(y_pred, K.epsilon(), 1 - K.epsilon())
    # calc
    loss = y_true * K.log(y_pred) * weights
    loss = -K.sum(loss, -1)
    return loss
This is the loss function that I am using, but it classifies every pixel as class 2. What am I doing wrong?
Answer 1

Score: 0
You should compute the weights from your entire dataset (unless your batch size is big enough that the per-batch weights are reasonably stable). If some class is underrepresented, a small batch size will give it a near-infinite weight.

If your target data is a numpy array:
shp = y_train.shape
totalPixels = shp[0] * shp[1] * shp[2]
weights = np.sum(y_train, axis=(0, 1, 2)) #final shape (4,)
weights = totalPixels/weights
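To see what this produces, here is a minimal, self-contained sanity check on a tiny made-up one-hot mask (the shapes and pixel values below are only for illustration):

import numpy as np

# toy targets: 4 "images" of 2x2 pixels, 4 classes, one-hot encoded
y_train = np.zeros((4, 2, 2, 4))
y_train[..., 2] = 1               # every pixel starts as class 2
y_train[0, 0, 0] = [1, 0, 0, 0]   # one pixel of class 0
y_train[1, 0, 0] = [0, 1, 0, 0]   # one pixel of class 1
y_train[2, 0, 0] = [0, 0, 0, 1]   # one pixel of class 3

shp = y_train.shape
totalPixels = shp[0] * shp[1] * shp[2]    # 16 pixels in total
counts = np.sum(y_train, axis=(0, 1, 2))  # pixels per class: [1, 1, 13, 1]
weights = totalPixels / counts            # [16, 16, ~1.23, 16]

Rare classes end up with large weights while the dominant class gets a weight close to 1, which is the inverse-frequency behaviour you want.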
If your data is in a Sequence generator:
totalPixels = 0
counts = np.zeros((4,))
for i in range(len(generator)):
    x, y = generator[i]
    shp = y.shape
    totalPixels += shp[0] * shp[1] * shp[2]
    counts = counts + np.sum(y, axis=(0,1,2))
weights = totalPixels / counts
If your data is in a yield generator (you must know how many batches you have in an epoch):
for i in range(batches_per_epoch):
    x, y = next(generator)
    # the rest is the same as the Sequence example above
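Spelled out, assuming batches_per_epoch and the generator are already defined, the yield version mirrors the Sequence example above:

totalPixels = 0
counts = np.zeros((4,))
for i in range(batches_per_epoch):
    x, y = next(generator)
    shp = y.shape
    totalPixels += shp[0] * shp[1] * shp[2]
    counts = counts + np.sum(y, axis=(0, 1, 2))
weights = totalPixels / counts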
Attempt 1
I don't know if newer versions of Keras are able to handle this, but you can try the simplest approach first: simply call fit or fit_generator with the class_weight argument:
model.fit(...., class_weight = {0: weights[0], 1: weights[1], 2: weights[2], 3: weights[3]})
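If you prefer not to write the dictionary by hand, it can be built from the weights array computed earlier (assuming weights has shape (4,), one entry per class) and passed in the same way:

# build the class_weight dictionary from the weights array above
class_weights = {i: float(w) for i, w in enumerate(weights)}
model.fit(...., class_weight = class_weights)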
Attempt 2
Make a healthier loss function:
weights = weights.reshape((1,1,1,4))
kWeights = K.constant(weights)

def weighted_cce(y_true, y_pred):
    yWeights = kWeights * y_pred        #shape (batch, 128, 128, 4)
    yWeights = K.sum(yWeights, axis=-1) #shape (batch, 128, 128)

    loss = K.categorical_crossentropy(y_true, y_pred) #shape (batch, 128, 128)
    wLoss = yWeights * loss

    return K.sum(wLoss, axis=(1,2))
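To use it, pass the function to compile like any other Keras loss. The optimizer, batch size, epochs, and the x_train/y_train names below are only placeholders for whatever you are already doing:

# the custom loss is passed to compile like a built-in loss
model.compile(optimizer = 'adam', loss = weighted_cce)
# x_train, y_train, batch_size and epochs are placeholders for your own setup
model.fit(x_train, y_train, batch_size = 16, epochs = 50)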
Comments