Explicit gradient for custom loss function in Keras


Question

I'm working in R and trying to get started with neural networks, using the keras package.

I'd like to use a custom loss function for training my NN. It's possible to do this by writing the custom loss function as lossFn <- function(y_true, y_pred) { ... } and passing it to the compile method as model %>% compile(loss = lossFn, ...).

Now, in order to train the NN by gradient descent, the loss function needs to be differentiable. I understand that you'd usually accomplish this by restricting yourself to backend functions in your loss function, e.g.

lossFn <- function(y_true, y_pred) {
   K <- backend()
   K$mean(K$square(y_true - y_pred), axis = 1L)
}

or something like that.
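For context, a loss written this way is handed straight to compile() and used by fit() as usual. A minimal sketch, in which the model architecture is arbitrary and x_train/y_train stand in for your own data:

library(keras)

model <- keras_model_sequential() %>%
  layer_dense(units = 16, activation = "relu", input_shape = 4) %>%
  layer_dense(units = 1)

# pass the backend-based custom loss to compile()
model %>% compile(optimizer = "rmsprop", loss = lossFn)
model %>% fit(x_train, y_train, epochs = 10)  # x_train/y_train: your data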

Now, my problem is that I cannot express my loss function this way; I need to use functions that aren't available in the backend.

So my idea was that I'd work out the gradient myself on paper, and then provide it to compile as another argument, say compile(loss = lossFn, gradient = gradientFn, ...), with gradientFn suitably defined.

The documentation for keras (the R package!) does not indicate that this is possible. At the same time, it does not suggest it's not. And googling has turned up little that is relevant.

So my question is, is it possible?

An addendum: since Google has suggested that there are other training methods for NNs that do not rely on the gradient of the loss function, I should add I'm not too hung up on the specific training method. My ultimate goal isn't to manually supply the gradient of a custom loss function, it's to use a custom loss function to train the NN. The gradient is just a technical obstacle for me right now.

Thanks!


Answer 1

Score: 1


This is certainly possible in Keras; you'll just have to move up the stack a little, implement a train_step method, and call optimizer$apply_gradients() yourself.

Chapter 7 in the Deep Learning with R book covers this use case:
https://github.com/t-kalinowski/deep-learning-with-R-2nd-edition-code/blob/9f8b6d08dbb8d6565e4f5396e509aaea3e242b84/ch07.R#L608

Also, this keras guide may be useful, even though it's in Python and you're working in R. (The Python interface is very similar to the R interface).
https://keras.io/guides/writing_a_training_loop_from_scratch/
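To make the shape of this concrete, here is a minimal sketch of the train_step approach, loosely following the pattern from the book chapter linked above. The names CustomModel and lossGradFn are hypothetical, and the hand-derived gradient shown (MSE's dL/dy_pred) is just a stand-in for whatever formula you worked out on paper:

library(keras)
library(tensorflow)

# Loss used only for reporting; training uses the hand-derived gradient below.
lossFn <- function(y_true, y_pred) {
  tf$reduce_mean(tf$square(y_true - y_pred))
}

# Hand-derived dL/dy_pred. Here: the gradient of MSE, standing in for
# your own pen-and-paper formula.
lossGradFn <- function(y_true, y_pred) {
  n <- tf$cast(tf$size(y_pred), y_pred$dtype)
  2 * (y_pred - y_true) / n
}

CustomModel <- new_model_class(
  classname = "CustomModel",
  train_step = function(data) {
    c(x, y) %<-% data
    with(tf$GradientTape() %as% tape, {
      y_pred <- self(x, training = TRUE)  # forward pass, recorded on the tape
    })
    # Chain rule: we supply dL/dy_pred ourselves; the tape supplies dy_pred/dw
    # via the output_gradients argument (a vector-Jacobian product).
    dL_dy <- lossGradFn(y, y_pred)
    grads <- tape$gradient(y_pred, self$trainable_variables,
                           output_gradients = dL_dy)
    self$optimizer$apply_gradients(
      zip_lists(grads, self$trainable_variables))
    list(loss = lossFn(y, y_pred))
  }
)

# Usage: build the model through the custom class, then compile and fit as usual.
inputs  <- layer_input(shape = 4)
outputs <- inputs %>% layer_dense(units = 1)
model   <- CustomModel(inputs = inputs, outputs = outputs)
model %>% compile(optimizer = optimizer_rmsprop())
model %>% fit(x_train, y_train, epochs = 10)  # x_train/y_train: your data

The design point is that once train_step is overridden, fit() still drives batching, callbacks, and progress reporting; only the gradient computation is replaced with your own.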
