Can Pytorch autograd compute gradient with respect to only one parameter in neural network?
Question
I am trying to find the gradient of the loss function with respect to a neural network's parameters (specifically, the gradient with respect to one node in the network).
Due to the particular nature of my problem, I only want to compute the gradient with respect to one node in the neural network. For example, say we have the loss L and neural network parameters theta. Then I only want to find dL/dTheta_1, where theta_1 is any ONE node in the neural net.
We typically use grad = torch.autograd.grad(Loss, parameters()) to find the gradient dL/dTheta = [dL/dTheta_1, dL/dTheta_2, ..., dL/dTheta_n], but I only want dL/dTheta_1, to reduce the computational cost.
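For reference, the usual call looks something like the following minimal sketch (made-up toy model and shapes). Note that torch.autograd.grad already accepts any subset of leaf tensors as its inputs argument, e.g. a single weight tensor, but not a single entry inside one:

```python
import torch
import torch.nn as nn

# Toy setup for illustration only (made-up model and shapes).
model = nn.Linear(4, 1)
x = torch.randn(8, 4)
y = torch.randn(8, 1)
loss = nn.functional.mse_loss(model(x), y)

# Full gradient: one tensor per parameter (weight and bias here).
full_grads = torch.autograd.grad(loss, list(model.parameters()), retain_graph=True)

# Gradient with respect to a single parameter *tensor* only.
(weight_grad,) = torch.autograd.grad(loss, [model.weight])
print(weight_grad.shape)  # torch.Size([1, 4]) -- still the whole tensor, not one entry
```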
Would it be possible to code this in PyTorch?
Theoretically, I think it is possible to compute only one gradient component, but I am not sure whether PyTorch has an option for that.
Does anyone have an idea on this?
Answer 1
Score: 0
If by a node you mean a single component of a weight tensor, I think it can't be done. PyTorch uses the forward graph to perform the backward pass, so it can only calculate gradients for graph elements, which theta[0] is not. You could try using autograd to get the gradients of the outputs of all operations that use theta directly, and compute dL/dTheta_1 yourself from them by the chain rule. Or you could "chip off" Theta_1 from the other components during the forward pass and process it separately, which would reduce the efficiency of the forward pass. But keep in mind that if there are several layers between Theta and L, you can't avoid calculating their full gradients, so it's unlikely you'll save much time.
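For what it's worth, here is a rough sketch of how that "chip off" idea could look (hypothetical names and shapes, not a drop-in implementation): the bulk of the weight matrix is kept as a frozen tensor and only the single scalar of interest is a leaf that requires gradients, spliced back in during the forward pass.

```python
import torch

torch.manual_seed(0)
in_dim, out_dim = 4, 3

# Frozen bulk of the weight matrix: no gradients are tracked for it.
weight_rest = torch.randn(out_dim, in_dim, requires_grad=False)
# The single "node" we care about, kept as its own leaf tensor.
theta_1 = torch.randn((), requires_grad=True)

def forward(x):
    # Rebuild the full weight matrix with theta_1 spliced in at position (0, 0).
    # The mask-and-add construction keeps the result differentiable w.r.t. theta_1.
    mask = torch.zeros(out_dim, in_dim)
    mask[0, 0] = 1.0
    weight = weight_rest * (1 - mask) + theta_1 * mask
    return x @ weight.t()

x = torch.randn(8, in_dim)
y = torch.randn(8, out_dim)
loss = torch.nn.functional.mse_loss(forward(x), y)

# Only dL/dtheta_1 is returned; nothing is accumulated for weight_rest.
(grad_theta_1,) = torch.autograd.grad(loss, [theta_1])
print(grad_theta_1)
```

As noted above, the backward pass still has to propagate through everything between that weight and the loss, so the saving is limited to not materializing gradients for the other parameters.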