Question

Let's say I have a Keras model like this:

import tensorflow as tf

with tf.device("/CPU"):
    model = tf.keras.Sequential([
        # Adds a densely-connected layer with 64 units to the model:
        tf.keras.layers.Dense(64, activation='relu', input_shape=(32,)),
        # Add another:
        tf.keras.layers.Dense(64, activation='relu'),
        # Add a softmax layer with 10 output units:
        tf.keras.layers.Dense(10, activation='softmax'),
    ])

I would like to move this model to the GPU.

I tried doing this:

with tf.device("/GPU:0"):
    gpu_model = tf.keras.models.clone_model(model)

The problem is that the variable names change. For example, the name of the first layer's first weight in model, obtained from model.layers[0].weights[0].name, is:

> 'dense/kernel:0'

But the name of the same weight in gpu_model, obtained from gpu_model.layers[0].weights[0].name, is:

> 'dense_3/kernel:0'

How can I move the model to the GPU while preserving the variable names?

I don't want to save the model to disk and load it again.

Answer 1

Score: 0

I am answering my own question. If someone has a better solution, please post it.

This is a workaround I found:

1. Create a state_dict, as in PyTorch, mapping each weight name to its value.
2. Get the model architecture as JSON.
3. Clear the Keras session and delete the model instance.
4. Create a new model from the JSON within a tf.device context.
5. Load the previous weights from state_dict.

state_dict = {}
for layer in model.layers:
    for weight in layer.weights:
        state_dict[weight.name] = weight.numpy()

model_json_config = model.to_json()
tf.keras.backend.clear_session() # this is crucial to get previous names again
del model

with tf.device("/GPU:0"):
    new_model = tf.keras.models.model_from_json(model_json_config)

for layer in new_model.layers:
    current_layer_weights = []
    for weight in layer.weights:
        current_layer_weights.append(state_dict[weight.name])
    layer.set_weights(current_layer_weights)
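
A variant of the same idea, as a minimal sketch rather than a definitive implementation: instead of keying the weights by name, it copies them by position with get_weights() and set_weights(). This may be more robust if your Keras version reports weight names without the layer prefix (some newer releases appear to), since a name-keyed dictionary could then contain colliding keys. The helper name rebuild_on_device is hypothetical, and the sketch falls back to the CPU when no GPU is visible so that it still runs. It also uses tf.keras.Input instead of input_shape, which is an assumption made for compatibility across versions.

```python
import numpy as np
import tensorflow as tf


def rebuild_on_device(model, device):
    """Hypothetical helper: rebuild `model` under `device` and copy its weights."""
    weights = model.get_weights()          # list of NumPy arrays, in layer order
    config_json = model.to_json()          # architecture only, no weights
    tf.keras.backend.clear_session()       # reset Keras' global naming state
    with tf.device(device):
        new_model = tf.keras.models.model_from_json(config_json)
        new_model.set_weights(weights)     # positional copy, no name lookup
    return new_model


# Build the original model on the CPU.
with tf.device("/CPU:0"):
    model = tf.keras.Sequential([
        tf.keras.Input(shape=(32,)),
        tf.keras.layers.Dense(64, activation="relu"),
        tf.keras.layers.Dense(64, activation="relu"),
        tf.keras.layers.Dense(10, activation="softmax"),
    ])

# Record names and values before the session is cleared.
old_layer_names = [layer.name for layer in model.layers]
old_weight_names = [w.name for w in model.weights]
old_weights = model.get_weights()

# Assumption: use the first GPU if one is visible, otherwise stay on the CPU.
device = "/GPU:0" if tf.config.list_physical_devices("GPU") else "/CPU:0"
new_model = rebuild_on_device(model, device)

new_layer_names = [layer.name for layer in new_model.layers]
new_weight_names = [w.name for w in new_model.weights]
new_weights = new_model.get_weights()
```

Comparing old_weight_names with new_weight_names, and old_weights with new_weights, is a quick way to check that both the names and the values survived the move.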

