Why are the weights not updating when splitting the model into two classes in PyTorch and torch-geometric?


I tried two different ways to build my model:

  • First approach: split the model into two classes, one is MainModel() and the other is GinEncoder(); when I call MainModel(), it also calls GinEncoder().
  • Second approach: create a single class, MainModel2(), by merging the two classes MainModel() and GinEncoder().

So the layer structure of MainModel2() is the same as that of MainModel() + GinEncoder(), but:

  • In the first approach, the weights of GinEncoder() cannot be updated, while the weights of MainModel() can be.
  • In the second approach, all weights of MainModel2() can be updated.

My question is:
Why are the weights not updating when splitting the model into two classes in PyTorch and torch-geometric, yet when I merge the layers of these two classes, all weights can be updated?
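
For reference, one way to verify whether a module's weights actually change across a training step is to snapshot its parameters and compare them afterwards. Below is a minimal sketch; the train_step callable is a hypothetical stand-in for one forward/backward/optimizer-step pass:

    import torch

    def weights_changed(module: torch.nn.Module, train_step) -> bool:
        # Snapshot every parameter tensor before the step.
        before = [p.detach().clone() for p in module.parameters()]
        train_step()  # hypothetical callable: one forward/backward/optimizer.step()
        # True if any parameter tensor changed after the step.
        return any(not torch.equal(b, p.detach())
                   for b, p in zip(before, module.parameters()))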

Here is part of the code:

  • First approach: split the model into two classes, one is MainModel and the other is GinEncoder, as shown below:

    # Note: the globals dim_h (hidden width) and gin_layer (number of GIN layers)
    # are defined elsewhere in the full code.
    import torch
    from torch.nn import Sequential, Linear, ReLU, BatchNorm1d
    from torch_geometric.nn import GINConv, global_add_pool

    class GinEncoder(torch.nn.Module):
        def __init__(self):
            super(GinEncoder, self).__init__()
            self.gin_convs = torch.nn.ModuleList()
            self.gin_convs.append(GINConv(Sequential(Linear(1, dim_h), ReLU(),
                                                     Linear(dim_h, dim_h), ReLU(),
                                                     BatchNorm1d(dim_h))))
            for _ in range(gin_layer-1):
                self.gin_convs.append(GINConv(Sequential(Linear(dim_h, dim_h), ReLU(),
                                                         Linear(dim_h, dim_h), ReLU(),
                                                         BatchNorm1d(dim_h))))
    
    
        def forward(self, x, edge_index, batch_node_id):
            # Node embeddings
            nodes_emb_layers = []
            for i in range(gin_layer):
                x = self.gin_convs[i](x, edge_index)
                nodes_emb_layers.append(x)
    
            # Graph-level readout
            nodes_emb_pools = [global_add_pool(nodes_emb, batch_node_id) for nodes_emb in nodes_emb_layers]
    
            # Concatenate and form the graph embeddings
            graph_embeds = torch.cat(nodes_emb_pools, dim=1)
            return graph_embeds
    
    
    class MainModel(torch.nn.Module):
        def __init__(self, graph_encoder:torch.nn.Module):
            super(MainModel, self).__init__()
            self.graph_encoder = graph_encoder
            self.lin1 = Linear(dim_h*gin_layer, 4)
            self.lin2 = Linear(4, dim_h*gin_layer)
    
    
        def forward(self, x, edge_index, batch_node_id):
            graph_embeds = self.graph_encoder(x, edge_index, batch_node_id)
            out_lin1 = self.lin1(graph_embeds)
            pred = self.lin2(out_lin1)[-1]
    
            return pred
    
  • Second approach: create MainModel2() by merging the layers of the two classes MainModel() and GinEncoder() (a wiring sketch comparing both approaches follows this listing):

    # Same imports as the first approach; this version hardcodes two GIN layers,
    # i.e. it assumes gin_layer == 2.
    class MainModel2(torch.nn.Module):
        def __init__(self):
            super(MainModel2, self).__init__()
            self.gin_convs = torch.nn.ModuleList()
            self.gin_convs.append(GINConv(Sequential(Linear(1, dim_h), ReLU(),
                                                     Linear(dim_h, dim_h), ReLU(),
                                                     BatchNorm1d(dim_h))))
            self.gin_convs.append(GINConv(Sequential(Linear(dim_h, dim_h), ReLU(),
                                                     Linear(dim_h, dim_h), ReLU(),
                                                     BatchNorm1d(dim_h))))
            self.lin1 = Linear(dim_h*gin_layer, 4)
            self.lin2 = Linear(4, dim_h*gin_layer)
    
    
        def forward(self, x, edge_index, batch_node_id):
            # Node embeddings
            nodes_emb_layers = []
            for i in range(2):
                x = self.gin_convs[i](x, edge_index)
                nodes_emb_layers.append(x)
    
            # Graph-level readout
            nodes_emb_pools = [global_add_pool(nodes_emb, batch_node_id) for nodes_emb in nodes_emb_layers]
    
            # Concatenate and form the graph embeddings
            graph_embeds = torch.cat(nodes_emb_pools, dim=1)
            out_lin1 = self.lin1(graph_embeds)
            pred = self.lin2(out_lin1)[-1]
    
            return pred
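
To make the comparison concrete, here is a hedged sketch of how the two designs could be instantiated and sanity-checked side by side. The hyperparameter values and the toy graph are my assumptions, not from the original post, and the check relies on gin_layer == 2 (which MainModel2 hardcodes):

    # Assumed hyperparameter values; the originals live elsewhere in the full code.
    dim_h, gin_layer = 32, 2

    # First approach: GinEncoder becomes a registered submodule of MainModel.
    gin_encoder = GinEncoder()
    model = MainModel(gin_encoder)

    # Second approach: everything lives in a single module.
    model2 = MainModel2()

    # Sanity check: with gin_layer == 2, both designs should expose the same
    # number of trainable parameters.
    n_split = sum(p.numel() for p in model.parameters())
    n_merged = sum(p.numel() for p in model2.parameters())
    assert n_split == n_merged

    # Toy input: 4 nodes with 1 feature each, a tiny edge list, one graph in the batch.
    x = torch.randn(4, 1)
    edge_index = torch.tensor([[0, 1, 2, 3], [1, 0, 3, 2]])
    batch_node_id = torch.zeros(4, dtype=torch.long)
    pred = model(x, edge_index, batch_node_id)  # internally calls the encoder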
    

PS. The complete code and training data can be found at the following link:

Answer 1

Score: 1


I checked the attached code. It seems that you only include the parameters of model in the optimizer.

Make sure you pass the weights of both models to the optimizer(s). In your case, for example:

gin_encoder = GinEncoder().to("cuda")
model = MainModel(gin_encoder).to("cuda")

opt_enc = torch.optim.Adam(gin_encoder.parameters())
opt_model = torch.optim.Adam(model.parameters())

In addition, make sure you run both optimizers during training, i.e.:

# Clear the gradients tracked by both optimizers
opt_enc.zero_grad()
opt_model.zero_grad()

# Backpropagate once through the whole computation graph
loss.backward()

# Apply the update for each model's parameters
opt_enc.step()
opt_model.step()

Alternatively, you can compose a list containing the parameters of both models and pass it to a single optimizer:

opt_merge = torch.optim.Adam(list(model.parameters()) + list(gin_encoder.parameters()))
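
Putting the single-optimizer variant into a complete training step, a minimal sketch might look like this; the loss function, data loader, and targets are assumptions, since the original training loop is not shown:

criterion = torch.nn.MSELoss()  # assumed loss; not shown in the original post

for x, edge_index, batch_node_id, target in loader:  # hypothetical data loader
    opt_merge.zero_grad()
    pred = model(x, edge_index, batch_node_id)  # also runs gin_encoder internally
    loss = criterion(pred, target)
    loss.backward()
    opt_merge.step()  # one step now updates MainModel and GinEncoder together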
