PyTorch Conv1d produces more channels than expected
Question
I have the following neural network:
import torch.nn as nn

class Discriminator(nn.Module):
    def __init__(self, input_size, output_size):
        super(Discriminator, self).__init__()
        self.main = nn.Sequential(
            nn.Conv1d(1, 3000, 1),      # in_channels=1, out_channels=3000, kernel_size=1
            nn.LeakyReLU(0.2),
            nn.Conv1d(3000, 1, 1),      # in_channels=3000, out_channels=1, kernel_size=1
            nn.Sigmoid()
        )

    def forward(self, x):
        return self.main(x.float())

Discriminator = Discriminator()
Then I train this discriminator as follows:
for d_index in range(d_steps):
    Discriminator.zero_grad()
    prediction = Discriminator(d_real_data).view(-1)
The shape of real_data is [60, 1, 3000] where
- batch size: 60
- number of channels: 1
- sequence length: 3000
This input follows the documentation for PyTorch Conv1d. What I'm expecting is for prediction to have shape [60, 1, 1], so that flattening it with .view(-1) gives a 1D array with 60 values. However, the shape of prediction I actually get is [60, 1, 3000], and flattening it gives a 1D array with 180000 values. Why is Conv1d followed by Sigmoid returning an output with the same shape as my input?
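A minimal, self-contained repro of the shapes described above; random data stands in for d_real_data, and the layers are instantiated directly (rather than through the class) just to keep the snippet short:

import torch
import torch.nn as nn

# the same layer stack as in the Discriminator above
net = nn.Sequential(
    nn.Conv1d(1, 3000, 1),
    nn.LeakyReLU(0.2),
    nn.Conv1d(3000, 1, 1),
    nn.Sigmoid(),
)

d_real_data = torch.rand(60, 1, 3000)   # batch=60, channels=1, sequence length=3000
with torch.no_grad():                   # the intermediate (60, 3000, 3000) activation is ~2 GB
    prediction = net(d_real_data).view(-1)
print(prediction.shape)                 # torch.Size([180000]) instead of the expected torch.Size([60])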
Answer 1
Score: 1
The arguments to Conv1d are in_channels, out_channels, and kernel_size.
in_channels is the depth of your data; in a colour picture with RGB it would be 3. out_channels is the depth of the output, not the length of the sequence. You are projecting your one input channel into 3000 output channels. You should also increase the kernel size.
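A quick shape check (sizes mirror the question, with the batch shrunk to 4 to keep the tensors small) illustrates the point: out_channels only sets the depth of the output, while the kernel size and stride are what shorten the sequence:

import torch
import torch.nn as nn

x = torch.randn(4, 1, 3000)   # (batch, in_channels, sequence length)

# kernel_size=1: the depth becomes out_channels, the length is untouched
print(nn.Conv1d(in_channels=1, out_channels=3000, kernel_size=1)(x).shape)
# torch.Size([4, 3000, 3000])

# a larger kernel and stride are what shrink the sequence dimension
print(nn.Conv1d(in_channels=1, out_channels=16, kernel_size=5, stride=5)(x).shape)
# torch.Size([4, 16, 600])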
Answer 2
Score: 1
The first Conv1d will have an output of (60, 3000, 3000): since its kernel size is 1, it doesn't change the length.
The second Conv1d will have an output of (60, 1, 3000): again, the kernel size is 1, so it won't change the sequence length.
Each time you are only specifying the number of channels, 3000 and then 1.
The sigmoid function is an element-wise function, so it doesn't change the size at all.
As Kilian said, you need to increase the kernel size (and/or stride) to change the size of the third dimension.
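As a sanity check, here is a layer-by-layer trace of the network from the question (the batch is reduced from 60 to 4 only to keep the intermediate (batch, 3000, 3000) tensor manageable). With the default padding and dilation, a Conv1d's output length is floor((L_in - kernel_size) / stride) + 1, so with kernel_size=1 and stride=1 the length never changes:

import torch
import torch.nn as nn

x = torch.randn(4, 1, 3000)      # stand-in for d_real_data, with a smaller batch

conv1 = nn.Conv1d(1, 3000, 1)
conv2 = nn.Conv1d(3000, 1, 1)

h = conv1(x)
print(h.shape)          # torch.Size([4, 3000, 3000]): 3000 channels, length still 3000
h = nn.LeakyReLU(0.2)(h)
print(h.shape)          # unchanged: LeakyReLU is element-wise
h = conv2(h)
print(h.shape)          # torch.Size([4, 1, 3000]): back to 1 channel, length still 3000
h = torch.sigmoid(h)
print(h.shape)          # unchanged: Sigmoid is element-wise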
Answer 3
Score: 1
Your usage of the convolutional layers in your discriminator network seems incorrect. You're using nn.Conv1d with kernel size 1, which results in each output element being computed independently, without taking into account any neighborhood information. Try updating your code to:
import torch.nn as nn

class Discriminator(nn.Module):
    def __init__(self, input_size, output_size):
        super(Discriminator, self).__init__()
        self.main = nn.Sequential(
            nn.Conv1d(1, 64, kernel_size=3, stride=3),    # 1 -> 64 channels; kernel 3, stride 3 shortens the sequence
            nn.LeakyReLU(0.2),
            nn.Conv1d(64, 1, kernel_size=3, stride=3),    # 64 -> 1 channel; shortens the sequence again
            nn.Sigmoid()
        )

    def forward(self, x):
        return self.main(x.float())

discriminator = Discriminator(input_size=3000, output_size=1)
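For reference, a quick check of the shape this revised network produces on the question's [60, 1, 3000] input (each conv follows L_out = floor((L_in - kernel_size) / stride) + 1):

import torch

x = torch.rand(60, 1, 3000)
print(discriminator(x).shape)
# torch.Size([60, 1, 333]): 3000 -> floor((3000-3)/3)+1 = 1000 -> floor((1000-3)/3)+1 = 333

If the goal is a single value per sample ([60, 1, 1]), further downsampling would still be needed, for example more strided convolutions, pooling such as nn.AdaptiveAvgPool1d(1), or a final linear layer.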