TensorFlow: the right loss function for multi-class, multi-label classification with ranking

Question

I have 10 classes, numbered 0 to 9.

A target label looks something like this:

> [0.0, 0.75, 0.0, 1.0, 0.0, 0.875, 0.0, 0.0, 0.0, 0.0]

The label above is constructed so that index 3 (class 3) is rank 1, class 5 is rank 2, class 1 is rank 3, and all other classes are irrelevant, so they get rank zero. In other words, the largest value has the highest rank, and so on. What matters to me is the ranking itself, not the specific values (0.75 and so on) attached to each rank.
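To make the labeling scheme concrete, here is a minimal NumPy sketch that turns such a label vector into ranks. The helper name label_to_ranks is hypothetical, and the example label is padded with a trailing zero so that it covers all 10 classes, which is an assumption.

```python
import numpy as np

def label_to_ranks(label):
    """Map a score vector to ranks: 1 for the largest positive score,
    2 for the next one, and 0 for every irrelevant (zero) class."""
    label = np.asarray(label, dtype=float)
    ranks = np.zeros(label.shape, dtype=int)
    relevant = np.flatnonzero(label > 0)  # indices of relevant classes
    # Sort the relevant classes by descending score.
    order = relevant[np.argsort(-label[relevant], kind="stable")]
    ranks[order] = np.arange(1, len(order) + 1)
    return ranks

# Assumption: trailing zero added so the vector has 10 entries.
label = [0.0, 0.75, 0.0, 1.0, 0.0, 0.875, 0.0, 0.0, 0.0, 0.0]
print(label_to_ranks(label))  # [0 3 0 1 0 2 0 0 0 0]
```

Comparing predictions and labels in rank space like this, rather than by raw values, matches the stated goal that only the ordering matters.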

Approach 1 - Regression

The last Dense layer has 10 neurons with a linear activation, and the loss is keras.losses.MeanSquaredError().
The model predicts mostly zeros, since zero is the majority rank.

Approach 2 - Multiclass classification

The last Dense layer has 10 neurons with a softmax activation, and the loss is keras.losses.CategoricalCrossentropy(). With this approach, I can sort the normalized predictions and apply a threshold, assigning rank zero to everything below it. I only get rank 1 right; the other ranks are squashed down.
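The sort-and-threshold decoding described in this approach could look roughly like the following NumPy sketch. The function names, the example logits, and the threshold of 0.1 are all made up for illustration; they are not taken from the question.

```python
import numpy as np

def softmax(logits):
    """Numerically stable softmax over a 1-D vector."""
    z = np.asarray(logits, dtype=float)
    e = np.exp(z - z.max())
    return e / e.sum()

def decode_ranks(probs, threshold):
    """Rank classes by descending probability; anything below
    the threshold gets rank 0."""
    probs = np.asarray(probs, dtype=float)
    ranks = np.zeros(len(probs), dtype=int)
    next_rank = 1
    for idx in np.argsort(-probs, kind="stable"):
        if probs[idx] >= threshold:
            ranks[idx] = next_rank
            next_rank += 1
    return ranks

# Hypothetical logits in which classes 3, 5 and 1 stand out.
logits = [0.0, 2.0, 0.0, 3.0, 0.0, 2.5, 0.0, 0.0, 0.0, 0.0]
probs = softmax(logits)
print(decode_ranks(probs, threshold=0.1))  # [0 3 0 1 0 2 0 0 0 0]
```

Because softmax forces the outputs to sum to 1, the decoded ranks depend heavily on where the threshold sits, so it would need to be tuned on validation data.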

Approach 3 - Linear + cosine similarity

I use a linear activation with cosine similarity as the loss function. During training, the cosine similarity on both the training and validation sets looks very good (all above 0.9), which suggests gradient descent is optimizing it well. However, my downstream ranking task does not work: I only get rank 1 and the last rank right, and all the other ranks are wrong.
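A small NumPy check illustrates how a high cosine similarity can coexist with a wrong ranking. The prediction vector below is invented for illustration, and the label is padded with a trailing zero to cover 10 classes (an assumption).

```python
import numpy as np

def cosine_similarity(a, b):
    """Plain cosine similarity between two 1-D vectors."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

y_true = np.array([0.0, 0.75, 0.0, 1.0, 0.0, 0.875, 0.0, 0.0, 0.0, 0.0])
# Hypothetical prediction: classes 1 and 5 swap places in the ordering.
y_pred = np.array([0.0, 0.90, 0.0, 1.0, 0.0, 0.700, 0.0, 0.0, 0.0, 0.0])

print(cosine_similarity(y_true, y_pred) > 0.98)  # True
print(np.argsort(-y_true, kind="stable")[:3])    # [3 5 1]
print(np.argsort(-y_pred, kind="stable")[:3])    # [3 1 5]
```

Cosine similarity rewards getting the overall direction of the vector right, so small value changes that flip the order of two close scores barely affect it.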

What are the right activation function and loss function for this problem? Is there a custom loss function for ranking?

Answer 1

Score: 1

There is a whole TensorFlow module devoted to ranking problems (TensorFlow Ranking) that you might draw inspiration from. For your examples, perhaps look at:

  1. Mean Squared Loss

  2. Sigmoid Cross Entropy Loss

(There doesn't seem to be an analogous loss for your cosine similarity metric.)
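The two losses listed above score each class on its own. Since the question also asks for a custom ranking loss, here is a sketch of a different technique, a pairwise logistic loss written with plain TensorFlow ops. It is not one of the losses named in this answer, the function name is hypothetical, and it assumes the last layer outputs raw scores (linear activation).

```python
import tensorflow as tf

def pairwise_logistic_loss(y_true, y_pred):
    """For every pair (i, j) where the label says i outranks j,
    penalize the model when score_i is not above score_j."""
    y_pred = tf.convert_to_tensor(y_pred)
    y_true = tf.cast(y_true, y_pred.dtype)
    # diff[b, i, j] = value_i - value_j for each example b.
    true_diff = tf.expand_dims(y_true, 2) - tf.expand_dims(y_true, 1)
    pred_diff = tf.expand_dims(y_pred, 2) - tf.expand_dims(y_pred, 1)
    # 1.0 where class i should be ranked above class j, else 0.0.
    mask = tf.cast(true_diff > 0, y_pred.dtype)
    # softplus(-d) = log(1 + exp(-d)): small when score_i >> score_j.
    pair_losses = mask * tf.math.softplus(-pred_diff)
    # Average over the valid pairs; 0 if an example has none.
    return tf.math.divide_no_nan(
        tf.reduce_sum(pair_losses, axis=[1, 2]),
        tf.reduce_sum(mask, axis=[1, 2]),
    )

y_true = tf.constant([[0.0, 0.75, 0.0, 1.0, 0.0, 0.875, 0.0, 0.0, 0.0, 0.0]])
good = 5.0 * y_true   # same ordering as the label
bad = -5.0 * y_true   # reversed ordering
print(float(pairwise_logistic_loss(y_true, good)[0]))
print(float(pairwise_logistic_loss(y_true, bad)[0]))
```

The function returns one loss value per example, so it should be usable as the loss argument of model.compile. It depends only on score differences, which fits the stated goal that the ordering matters and the exact values do not.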

Answer 2

Score: 0

When you use softmax, you force the model to select one label and penalize the others. The assumption is that exactly one of the labels is correct.
In your case, you have a multi-label, multi-class classification problem. You can remove the softmax and use cross-entropy for each label; the loss is then the sum of the losses over all classes. If some labels should be ignored, you can handle that there as well.
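A minimal sketch of this suggestion, assuming the last layer outputs raw logits (no softmax or sigmoid) and using tf.nn.sigmoid_cross_entropy_with_logits. The function name and the optional label_mask argument for ignored labels are hypothetical additions.

```python
import tensorflow as tf

def summed_sigmoid_cross_entropy(y_true, y_pred, label_mask=None):
    """Per-label sigmoid cross-entropy, summed over the classes.
    y_pred holds raw logits; label_mask (optional) is 1 for labels
    to keep and 0 for labels to ignore."""
    y_pred = tf.convert_to_tensor(y_pred)
    y_true = tf.cast(y_true, y_pred.dtype)
    per_label = tf.nn.sigmoid_cross_entropy_with_logits(
        labels=y_true, logits=y_pred
    )
    if label_mask is not None:
        per_label = per_label * tf.cast(label_mask, y_pred.dtype)
    return tf.reduce_sum(per_label, axis=-1)  # one value per example

y_true = tf.constant([[0.0, 0.75, 0.0, 1.0, 0.0, 0.875, 0.0, 0.0, 0.0, 0.0]])
logits = tf.zeros([1, 10])
print(float(summed_sigmoid_cross_entropy(y_true, logits)[0]))
```

Keras calls a loss with only y_true and y_pred, so label_mask stays None when the function is passed to model.compile directly. Using the mask would need a small wrapper or a custom training step.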

huangapple
  • Posted on 2023-02-14 18:57:09
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75446844.html