Explicit gradient for custom loss function in keras
Question
I'm working in R and trying to get started with neural networks, using the keras package.
I'd like to use a custom loss function for training my NN. It's possible to do this by writing the custom loss function as lossFn <- function(y_true, y_pred) { ... } and passing it to the compile method as model %>% compile(loss = lossFn, ...).
Now in order to use the gradient descent method of training the NN, the loss function needs to be differentiable. I understand that you'd usually accomplish this by restricting yourself to using backend functions in your loss function, e.g.
lossFn <- function(y_true, y_pred) {
  K <- backend()
  K$mean(K$square(y_true - y_pred), axis = 1L)
}
or something like that.
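For completeness, here is how I understand the pieces fit together end to end; the tiny model below is just for illustration (the layer sizes and input shape are made up):

library(keras)

# Custom loss built from backend ops, as above.
lossFn <- function(y_true, y_pred) {
  K <- backend()
  K$mean(K$square(y_true - y_pred), axis = 1L)
}

# A made-up toy model, just to show where the loss plugs in.
model <- keras_model_sequential() %>%
  layer_dense(units = 16, activation = "relu", input_shape = c(4)) %>%
  layer_dense(units = 1)

# The custom R function is passed directly as the loss.
model %>% compile(optimizer = "adam", loss = lossFn)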
Now, my problem is that I cannot express my loss function this way; I need to use functions that aren't available in the backend.
So my idea was that I'd work out the gradient myself on paper, and then provide it to compile as another argument, say compile(loss = lossFn, gradient = gradientFn, ...), with gradientFn suitably defined.
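For concreteness, if my loss were just MSE, the hand-derived gradient I have in mind would look something like this (to be clear, the gradient argument above is purely my imagined interface; nothing like it exists in compile() as far as I can tell):

# Purely hypothetical: the gradient of mean((y_true - y_pred)^2)
# with respect to y_pred, worked out on paper.
gradientFn <- function(y_true, y_pred) {
  2 * (y_pred - y_true) / length(y_pred)
}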
The documentation for keras (the R package!) does not indicate that this is possible. At the same time, it does not suggest it's not. And googling has turned up little that is relevant.
So my question is, is it possible?
An addendum: since Google has suggested that there are other training methods for NNs that do not rely on the gradient of the loss function, I should add I'm not too hung up on the specific training method. My ultimate goal isn't to manually supply the gradient of a custom loss function, it's to use a custom loss function to train the NN. The gradient is just a technical obstacle for me right now.
Thanks!
Answer 1
Score: 1
This is certainly possible in Keras; you'll just have to move up the stack a little, implement a train_step method, and then call optimizer$apply_gradients().
Chapter 7 in the Deep Learning with R book covers this use case:
https://github.com/t-kalinowski/deep-learning-with-R-2nd-edition-code/blob/9f8b6d08dbb8d6565e4f5396e509aaea3e242b84/ch07.R#L608
Also, this Keras guide may be useful even though it's in Python and you're working in R (the Python interface is very similar to the R interface):
https://keras.io/guides/writing_a_training_loop_from_scratch/
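For reference, here is a minimal sketch of the pattern the answer describes, using new_model_class() from the keras R package and loosely following the book chapter linked above. lossFn stands in for your custom loss, and the architecture and data (x_train, y_train) are assumptions for illustration:

library(keras)
library(tensorflow)

CustomModel <- new_model_class(
  classname = "CustomModel",
  train_step = function(data) {
    c(x, y) %<-% data
    # Record the forward pass so the tape can differentiate through it.
    with(tf$GradientTape() %as% tape, {
      y_pred <- self(x, training = TRUE)
      loss <- lossFn(y, y_pred)  # your custom loss function
    })
    # This is the point where hand-derived gradients could be
    # substituted for the tape-computed ones before applying them.
    gradients <- tape$gradient(loss, self$trainable_weights)
    self$optimizer$apply_gradients(
      zip_lists(gradients, self$trainable_weights))
    list(loss = loss)
  }
)

# Build the network with the functional API, then wrap it in the
# custom class (layer sizes here are made up):
inputs <- layer_input(shape = c(4))
outputs <- inputs %>%
  layer_dense(units = 16, activation = "relu") %>%
  layer_dense(units = 1)
model <- CustomModel(inputs = inputs, outputs = outputs)

# No loss argument is needed in compile(): train_step computes it.
model %>% compile(optimizer = optimizer_adam())
model %>% fit(x_train, y_train, epochs = 10)

Because train_step owns both the loss computation and the call to apply_gradients(), the loss no longer has to be expressible in backend ops for compile(); whatever gradients you can produce inside train_step, by tape or by hand, can be applied.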