
"derivative for aten::linear_backward is not implemented" when calling backward() on mps in torch

Question


I'm working on a GAN that generates sounds; I copied most of the code from the wavegan-pytorch GitHub repo. I'm working on a MacBook with an M2 chip, so I wanted to shift the processing from the CPU to the GPU with MPS. But when I call torch.Tensor.backward() on my loss, I get an error saying the derivative for linear_backward is not implemented. I'm still pretty new to programming: is there a simple mistake that I'm overlooking, or is it just not possible to run this code on the GPU? Here's my code:

real_signal = next(self.train_loader)

# need to add mixed signal and flag
noise = sample_noise(batch_size * generator_batch_size_factor)
generated = self.generator(noise)
#############################
# Calculating discriminator loss and updating discriminator
#############################
self.apply_zero_grad()
disc_cost, disc_wd = self.calculate_discriminator_loss(
    real_signal.data, generated.data
)
assert not (torch.isnan(disc_cost))
disc_cost.backward()
self.optimizer_d.step()
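
For context, the device used in the code here is selected roughly like this (a simplified sketch; my actual setup may differ slightly):

import torch

# Simplified sketch: pick the Apple-GPU (MPS) backend when available, else CPU.
device = torch.device("mps" if torch.backends.mps.is_available() else "cpu")

# The generator and discriminator (defined elsewhere) are then moved over:
# self.generator.to(device)
# self.discriminator.to(device)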

I would be very glad for any help. Let me know if you need more info. Apologies in advance if there's a simple solution I'm not seeing; I'm new to this.

Here is the code for the calculate_discriminator_loss() function:

import torch
from torch.autograd import Variable, grad

def calculate_discriminator_loss(self, real, generated):
    disc_out_gen = self.discriminator(generated)
    disc_out_real = self.discriminator(real)

    alpha = torch.FloatTensor(batch_size * 2, 1, 1).uniform_(0, 1).to(device)
    alpha = alpha.expand(batch_size * 2, real.size(1), real.size(2))

    interpolated = (1 - alpha) * real.data + (alpha) * generated.data[:batch_size * 2]
    interpolated = Variable(interpolated, requires_grad=True)

    # calculate probability of interpolated examples
    prob_interpolated = self.discriminator(interpolated)
    grad_inputs = interpolated
    ones = torch.ones(prob_interpolated.size()).to(device)
    gradients = grad(
        outputs=prob_interpolated,
        inputs=grad_inputs,
        grad_outputs=ones,
        create_graph=True,
        retain_graph=True,
        only_inputs=True,
    )[0]
    # calculate gradient penalty
    grad_penalty = (
        p_coeff
        * ((gradients.view(gradients.size(0), -1).norm(2, dim=1) - 1) ** 2).mean()
    )
    assert not (torch.isnan(grad_penalty))
    assert not (torch.isnan(disc_out_gen.mean()))
    assert not (torch.isnan(disc_out_real.mean()))
    cost_wd = disc_out_gen.mean() - disc_out_real.mean()
    cost = cost_wd + grad_penalty
    return cost, cost_wd

Answer 1

Score: 1


Seeing that you are implementing the discriminator loss calculation from WGAN-GP, I thought I'd work out what was going wrong and improve what you have.

First, you were doing absolutely great, with some slight flaws here and there. The problem is indeed in the calculate_discriminator_loss function. Things to improve:

  1. Variable is deprecated in recent versions of PyTorch. I'd recommend not using it, since it is no longer supported; a plain tensor with requires_grad set does the same job (see the sketch after this list).
  2. You can index the generated and real tensors without accessing the data attribute, like so: generated[:batch_size * 2].
  3. I am not sure what you are trying to do with batch_size * 2. Is the generated batch bigger than the real batch of data? I would advise keeping them the same size.
  4. PyTorch has an expand_as function, which is really useful here (instead of expand, which makes you spell out the target size explicitly).
  5. When computing the gradients, you do not need retain_graph=True, as you don't backpropagate through the same graph twice; create_graph=True already retains the graph by default.
  6. When computing the gradients, you do not need only_inputs=True. It is deprecated, and the default setting is True anyway.
  7. p_coeff and device are not variables defined in the function. Make sure to define them on the class, and then access them through self.p_coeff and self.device (see the sketch after this list).
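
To make points 1 and 7 concrete, here is a minimal sketch; the class name, constructor signature, and default penalty coefficient are illustrative assumptions, not taken from your repo:

import torch

class Trainer:
    """Illustrative skeleton; the attribute names follow the points above."""

    def __init__(self, discriminator, batch_size, p_coeff=10.0):
        self.discriminator = discriminator
        self.batch_size = batch_size
        # Point 7: keep the penalty coefficient and device on the class.
        self.p_coeff = p_coeff  # lambda in the WGAN-GP paper; 10 is the usual choice
        self.device = torch.device(
            "mps" if torch.backends.mps.is_available() else "cpu"
        )

# Point 1: the modern replacement for the deprecated Variable wrapper is
#   interpolated = interpolated.detach().requires_grad_(True)
# rather than
#   interpolated = Variable(interpolated, requires_grad=True)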

The following works when I run it:

import torch
from torch.autograd import grad

def calculate_discriminator_loss(self, real, generated):
    assert real.shape == generated.shape
    disc_out_gen = self.discriminator(generated)
    disc_out_real = self.discriminator(real)

    # one coefficient per sample, broadcast across channels and time via expand_as
    alpha = torch.rand(self.batch_size, 1, 1).to(self.device)
    alpha = alpha.expand_as(real)

    interpolated = (1 - alpha) * real + alpha * generated
    # the penalty needs gradients w.r.t. the interpolates, so they must require grad
    interpolated = interpolated.detach().requires_grad_(True)

    # calculate probability of interpolated examples
    prob_interpolated = self.discriminator(interpolated)
    ones = torch.ones(prob_interpolated.size()).to(self.device)
    gradients = grad(
        outputs=prob_interpolated,
        inputs=interpolated,
        grad_outputs=ones,
        create_graph=True)[0]

    # calculate gradient penalty, scaled by the coefficient from point 7
    grad_penalty = (
        self.p_coeff
        * torch.mean((gradients.view(gradients.size(0), -1).norm(2, dim=1) - 1) ** 2)
    )

    cost_wd = disc_out_gen.mean() - disc_out_real.mean()
    cost = cost_wd + grad_penalty
    return cost, cost_wd

I also cleaned up your code a bit to make it more readable and removed the isnan asserts.
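
For completeness, a hedged usage sketch: it assumes the fixed function above is attached as a method of the illustrative Trainer class from earlier, and the stand-in discriminator and tensor shapes are made up for the example:

import torch
import torch.nn as nn

# Stand-in discriminator, illustrative only: maps (B, 1, 16384) audio to one score.
stub_disc = nn.Sequential(nn.Flatten(), nn.Linear(16384, 1))

trainer = Trainer(discriminator=stub_disc, batch_size=16)
stub_disc.to(trainer.device)

real = torch.randn(16, 1, 16384, device=trainer.device)
fake = torch.randn(16, 1, 16384, device=trainer.device)

cost, cost_wd = trainer.calculate_discriminator_loss(real, fake)
cost.backward()  # the gradient penalty makes this a double backward, which is
                 # exactly where the mps error from the question surfaced
print(cost.item(), cost_wd.item())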

Hope this helps.

Posted by huangapple on 2023-03-31 18:10:41. Source: https://go.coder-hub.com/75897313.html