PyTorch Conv1d produces more channels than expected
Question
I have the following neural network:
import torch.nn as nn

class Discriminator(nn.Module):
    def __init__(self, input_size, output_size):
        super(Discriminator, self).__init__()
        self.main = nn.Sequential(
            nn.Conv1d(1, 3000, 1),      # in_channels=1, out_channels=3000, kernel_size=1
            nn.LeakyReLU(0.2),
            nn.Conv1d(3000, 1, 1),      # in_channels=3000, out_channels=1, kernel_size=1
            nn.Sigmoid()
        )

    def forward(self, x):
        return self.main(x.float())

Discriminator = Discriminator()
Then I train this discriminator as follows:
for d_index in range(d_steps):
    Discriminator.zero_grad()
    prediction = Discriminator(d_real_data).view(-1)
The shape of real_data is [60, 1, 3000] where
- batch size: 60
- number of channels: 1
- sequence length: 3000
This input follows the documentation for PyTorch Conv1d. What I'm expecting is for prediction to have shape [60, 1, 1], so that flattening it with .view(-1) gives a 1D array with 60 values. However, the shape of prediction I actually get is [60, 1, 3000], and flattening it gives a 1D array with 180000 values. Why is Conv1d followed by Sigmoid returning an output with the same shape as my input?
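A minimal, self-contained repro of the shapes described above; random data stands in for d_real_data, and the layers are instantiated directly (rather than through the class) just to keep the snippet short:

import torch
import torch.nn as nn

# the same layer stack as in the Discriminator above
net = nn.Sequential(
    nn.Conv1d(1, 3000, 1),
    nn.LeakyReLU(0.2),
    nn.Conv1d(3000, 1, 1),
    nn.Sigmoid(),
)

d_real_data = torch.rand(60, 1, 3000)   # batch=60, channels=1, sequence length=3000
with torch.no_grad():                   # the intermediate (60, 3000, 3000) activation is ~2 GB
    prediction = net(d_real_data).view(-1)
print(prediction.shape)                 # torch.Size([180000]) instead of the expected torch.Size([60])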
Answer 1
Score: 1
The arguments to Conv1d are in_channels, out_channels, and kernel_size.
in_channels is the depth of your data; in a colour picture with RGB it would be 3. out_channels is the depth of the output, not the length of the sequence. You are projecting your one input channel into 3000 output channels. You should also increase the kernel size.
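A quick shape check (sizes mirror the question, with the batch shrunk to 4 to keep the tensors small) illustrates the point: out_channels only sets the depth of the output, while the kernel size and stride are what shorten the sequence:

import torch
import torch.nn as nn

x = torch.randn(4, 1, 3000)   # (batch, in_channels, sequence length)

# kernel_size=1: the depth becomes out_channels, the length is untouched
print(nn.Conv1d(in_channels=1, out_channels=3000, kernel_size=1)(x).shape)
# torch.Size([4, 3000, 3000])

# a larger kernel and stride are what shrink the sequence dimension
print(nn.Conv1d(in_channels=1, out_channels=16, kernel_size=5, stride=5)(x).shape)
# torch.Size([4, 16, 600])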
Answer 2
Score: 1
The first Conv1d will have an output of (60, 3000, 3000): since its kernel size is 1, it doesn't change the length.
The second Conv1d will have an output of (60, 1, 3000): again, the kernel size is 1, so it won't change the sequence length.
Each time you are only specifying the number of channels, 3000 and then 1.
The sigmoid function is an element-wise function, so it doesn't change the size at all.
As Kilian said, you need to increase the kernel size (and/or stride) to change the size of the third dimension.
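As a sanity check, here is a layer-by-layer trace of the network from the question (the batch is reduced from 60 to 4 only to keep the intermediate (batch, 3000, 3000) tensor manageable). With the default padding and dilation, a Conv1d's output length is floor((L_in - kernel_size) / stride) + 1, so with kernel_size=1 and stride=1 the length never changes:

import torch
import torch.nn as nn

x = torch.randn(4, 1, 3000)      # stand-in for d_real_data, with a smaller batch

conv1 = nn.Conv1d(1, 3000, 1)
conv2 = nn.Conv1d(3000, 1, 1)

h = conv1(x)
print(h.shape)          # torch.Size([4, 3000, 3000]): 3000 channels, length still 3000
h = nn.LeakyReLU(0.2)(h)
print(h.shape)          # unchanged: LeakyReLU is element-wise
h = conv2(h)
print(h.shape)          # torch.Size([4, 1, 3000]): back to 1 channel, length still 3000
h = torch.sigmoid(h)
print(h.shape)          # unchanged: Sigmoid is element-wise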
Answer 3
Score: 1
Your usage of the convolutional layers in your discriminator network seems incorrect. You're using nn.Conv1d with kernel size 1, which results in each output element being computed independently, without taking into account any neighborhood information. Try updating your code to:
import torch.nn as nn

class Discriminator(nn.Module):
    def __init__(self, input_size, output_size):
        super(Discriminator, self).__init__()
        self.main = nn.Sequential(
            nn.Conv1d(1, 64, kernel_size=3, stride=3),    # 1 -> 64 channels; kernel 3, stride 3 shortens the sequence
            nn.LeakyReLU(0.2),
            nn.Conv1d(64, 1, kernel_size=3, stride=3),    # 64 -> 1 channel; shortens the sequence again
            nn.Sigmoid()
        )

    def forward(self, x):
        return self.main(x.float())

discriminator = Discriminator(input_size=3000, output_size=1)
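For reference, a quick check of the shape this revised network produces on the question's [60, 1, 3000] input (each conv follows L_out = floor((L_in - kernel_size) / stride) + 1):

import torch

x = torch.rand(60, 1, 3000)
print(discriminator(x).shape)
# torch.Size([60, 1, 333]): 3000 -> floor((3000-3)/3)+1 = 1000 -> floor((1000-3)/3)+1 = 333

If the goal is a single value per sample ([60, 1, 1]), further downsampling would still be needed, for example more strided convolutions, pooling such as nn.AdaptiveAvgPool1d(1), or a final linear layer.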