Why are the weights not updating when splitting the model into two `class` in pytorch and torch-geometric?
Question
I tried two different ways to build my model:
- First approach: split the model into two classes, `MainModel()` and `GinEncoder()`, so that calling `MainModel()` also calls `GinEncoder()`.
- Second approach: create a single class, `MainModel2()`, by merging the two classes `MainModel()` and `GinEncoder()`.
So the layer structure of `MainModel2()` is the same as that of `MainModel()` + `GinEncoder()`, but:
- In the first approach, the weights of `GinEncoder()` cannot be updated, while the weights of `MainModel()` can be updated.
- In the second approach, all weights of `MainModel2()` can be updated.
My question is:
Why are the weights not updating when the model is split into two classes in PyTorch and torch-geometric, while all weights can be updated when I merge the layers of these two classes?
Here is part of the code:
- First approach: split the model into two classes, `MainModel` and `GinEncoder`, as shown below:

    import torch
    from torch.nn import BatchNorm1d, Linear, ReLU, Sequential
    from torch_geometric.nn import GINConv, global_add_pool

    # dim_h and gin_layer are globals defined elsewhere in the full code.

    class GinEncoder(torch.nn.Module):
        def __init__(self):
            super(GinEncoder, self).__init__()
            self.gin_convs = torch.nn.ModuleList()
            self.gin_convs.append(GINConv(Sequential(Linear(1, dim_h), ReLU(), Linear(dim_h, dim_h), ReLU(), BatchNorm1d(dim_h))))
            for _ in range(gin_layer-1):
                self.gin_convs.append(GINConv(Sequential(Linear(dim_h, dim_h), ReLU(), Linear(dim_h, dim_h), ReLU(), BatchNorm1d(dim_h))))

        def forward(self, x, edge_index, batch_node_id):
            # Node embeddings
            nodes_emb_layers = []
            for i in range(gin_layer):
                x = self.gin_convs[i](x, edge_index)
                nodes_emb_layers.append(x)
            # Graph-level readout
            nodes_emb_pools = [global_add_pool(nodes_emb, batch_node_id) for nodes_emb in nodes_emb_layers]
            # Concatenate and form the graph embeddings
            graph_embeds = torch.cat(nodes_emb_pools, dim=1)
            return graph_embeds

    class MainModel(torch.nn.Module):
        def __init__(self, graph_encoder:torch.nn.Module):
            super(MainModel, self).__init__()
            self.graph_encoder = graph_encoder
            self.lin1 = Linear(dim_h*gin_layer, 4)
            self.lin2 = Linear(4, dim_h*gin_layer)

        def forward(self, x, edge_index, batch_node_id):
            graph_embeds = self.graph_encoder(x, edge_index, batch_node_id)
            out_lin1 = self.lin1(graph_embeds)
            pred = self.lin2(out_lin1)[-1]
            return pred
- Second approach: create `MainModel2()` by merging the layers of the two classes `MainModel()` and `GinEncoder()`:

    class MainModel2(torch.nn.Module):
        def __init__(self):
            super(MainModel2, self).__init__()
            self.gin_convs = torch.nn.ModuleList()
            self.gin_convs.append(GINConv(Sequential(Linear(1, dim_h), ReLU(), Linear(dim_h, dim_h), ReLU(), BatchNorm1d(dim_h))))
            self.gin_convs.append(GINConv(Sequential(Linear(dim_h, dim_h), ReLU(), Linear(dim_h, dim_h), ReLU(), BatchNorm1d(dim_h))))
            self.lin1 = Linear(dim_h*gin_layer, 4)
            self.lin2 = Linear(4, dim_h*gin_layer)

        def forward(self, x, edge_index, batch_node_id):
            # Node embeddings
            nodes_emb_layers = []
            for i in range(2):
                x = self.gin_convs[i](x, edge_index)
                nodes_emb_layers.append(x)
            # Graph-level readout
            nodes_emb_pools = [global_add_pool(nodes_emb, batch_node_id) for nodes_emb in nodes_emb_layers]
            # Concatenate and form the graph embeddings
            graph_embeds = torch.cat(nodes_emb_pools, dim=1)
            out_lin1 = self.lin1(graph_embeds)
            pred = self.lin2(out_lin1)[-1]
            return pred
PS.
- The complete code is here: https://gist.github.com/theabc50111/8a38b88713f494be1d92d4ea2bdecc5e
- The training data is on Google Drive: https://drive.google.com/drive/folders/1_KMwCzf1diwS4gGNdSSxG7bnemqQkFxI?usp=sharing
- I asked a related question: https://stackoverflow.com/questions/75444625/how-to-update-the-weights-of-a-composite-model-composed-of-pytorch-and-torch-geo
Answer 1
Score: 1
I checked the attached code. It seems that you only include the parameters of `model` in the optimizer.
Make sure you pass the weights of both models to the optimizers. In your case, for example:

    gin_encoder = GinEncoder().to("cuda")
    model = MainModel(gin_encoder).to("cuda")
    opt_enc = torch.optim.Adam(gin_encoder.parameters())
    opt_model = torch.optim.Adam(model.parameters())

In addition, make sure you run both optimizers during training, i.e.:

    opt_enc.zero_grad()
    opt_model.zero_grad()
    loss.backward()
    opt_enc.step()
    opt_model.step()

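One way to confirm that a training step really moves both parts of the model is to snapshot the weights before the step and compare afterwards. The sketch below is only an illustration under stated assumptions: `TinyEncoder` and `TinyMain` are hypothetical stand-ins built from plain `Linear` layers (not the `GINConv` layers from the question), the data is random, and `changed_parameters` is a helper invented here. The two optimizers are deliberately given disjoint parameter sets so that no tensor is stepped twice.

```python
import torch
from torch.nn import Linear, Module, ReLU, Sequential


class TinyEncoder(Module):
    """Hypothetical stand-in for GinEncoder, built from plain Linear layers."""

    def __init__(self, dim_h: int = 8):
        super().__init__()
        self.net = Sequential(Linear(1, dim_h), ReLU(), Linear(dim_h, dim_h))

    def forward(self, x):
        return self.net(x)


class TinyMain(Module):
    """Hypothetical stand-in for MainModel: wraps an encoder and adds a head."""

    def __init__(self, encoder: Module, dim_h: int = 8):
        super().__init__()
        self.graph_encoder = encoder  # attribute assignment registers the submodule
        self.lin1 = Linear(dim_h, 4)
        self.lin2 = Linear(4, 1)

    def forward(self, x):
        return self.lin2(self.lin1(self.graph_encoder(x)))


def changed_parameters(model, run_step):
    """Run one training step and return the names of parameters that moved."""
    before = {name: p.detach().clone() for name, p in model.named_parameters()}
    run_step()
    return {name for name, p in model.named_parameters()
            if not torch.equal(before[name], p.detach())}


torch.manual_seed(0)
encoder = TinyEncoder()
model = TinyMain(encoder)

# Disjoint parameter sets: encoder weights in one optimizer, head weights in the other.
head_params = list(model.lin1.parameters()) + list(model.lin2.parameters())
opt_enc = torch.optim.Adam(encoder.parameters())
opt_model = torch.optim.Adam(head_params)

x = torch.randn(16, 1)  # random inputs, for illustration only
y = torch.randn(16, 1)  # random targets, for illustration only


def train_step():
    opt_enc.zero_grad()
    opt_model.zero_grad()
    loss = torch.nn.functional.mse_loss(model(x), y)
    loss.backward()
    opt_enc.step()
    opt_model.step()


changed = changed_parameters(model, train_step)
all_names = {name for name, _ in model.named_parameters()}
print("did not move:", sorted(all_names - changed))
```

If any `graph_encoder.*` name shows up in the "did not move" list for the real model, the encoder is either missing from the optimizers or not receiving gradients.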
Alternatively, you can build a list that contains the parameters of both models and pass it to a single optimizer:

    opt_merge = torch.optim.Adam(list(model.parameters()) + list(gin_encoder.parameters()))
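A caveat when merging the lists: in the `MainModel` shown in the question, the encoder is assigned to `self.graph_encoder`, which registers it as a submodule, so `model.parameters()` would be expected to yield the encoder's weights already. Concatenating both lists can then contain the same tensors twice, which PyTorch may warn about. The sketch below is a hedged illustration, not the asker's code: `TinyMain` is a hypothetical stand-in, and `unique_parameters` and `uncovered_parameters` are helpers invented here to deduplicate the list and to report which named parameters an optimizer does not hold.

```python
import torch
from torch.nn import Linear, Module


class TinyMain(Module):
    """Hypothetical stand-in for MainModel: wraps an encoder and adds a head."""

    def __init__(self, encoder: Module):
        super().__init__()
        self.graph_encoder = encoder  # attribute assignment registers the submodule
        self.lin1 = Linear(8, 4)
        self.lin2 = Linear(4, 1)

    def forward(self, x):
        return self.lin2(self.lin1(self.graph_encoder(x)))


def unique_parameters(*modules):
    """Collect parameters from several modules, keeping each tensor only once."""
    seen, unique = set(), []
    for module in modules:
        for p in module.parameters():
            if id(p) not in seen:
                seen.add(id(p))
                unique.append(p)
    return unique


def uncovered_parameters(module, optimizer):
    """Return the names of the module's parameters that the optimizer does not hold."""
    covered = {id(p) for group in optimizer.param_groups for p in group["params"]}
    return [name for name, p in module.named_parameters() if id(p) not in covered]


encoder = Linear(1, 8)  # stand-in for GinEncoder (assumption: a single plain layer)
model = TinyMain(encoder)

# Naive concatenation repeats the encoder's tensors; the deduplicated list does not.
naive = list(model.parameters()) + list(encoder.parameters())
merged = unique_parameters(model, encoder)
print(len(naive), len(merged))

opt_merge = torch.optim.Adam(merged)
print(uncovered_parameters(model, opt_merge))

# An optimizer built from the head alone leaves the encoder uncovered.
opt_head = torch.optim.Adam(list(model.lin1.parameters()) + list(model.lin2.parameters()))
print(uncovered_parameters(model, opt_head))
```

Running `uncovered_parameters` against the real model and optimizer is a quick way to test the diagnosis above before changing the training loop.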