How to move a tensorflow.keras model to GPU

Question

Let's say I have a keras model like this:

import tensorflow as tf

with tf.device("/CPU"):
    model = tf.keras.Sequential([
        # Adds a densely-connected layer with 64 units to the model:
        tf.keras.layers.Dense(64, activation='relu', input_shape=(32,)),
        # Add another:
        tf.keras.layers.Dense(64, activation='relu'),
        # Add a softmax layer with 10 output units:
        tf.keras.layers.Dense(10, activation='softmax'),
    ])

I would like to move this model to GPU.

I tried doing this:

with tf.device("/GPU:0"):
    gpu_model = tf.keras.models.clone_model(model)

The problem is that the variable names change. For example, the name of the first weight of the first layer of model, obtained from model.layers[0].weights[0].name, is:

> 'dense/kernel:0'

whereas the name of the same weight in gpu_model, obtained from gpu_model.layers[0].weights[0].name, is:

> 'dense_3/kernel:0'

How can I move the model to the GPU while preserving the variable names?

I don't want to save the model to disk and load it again.
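
The jump from 'dense' to 'dense_3' is what one would expect if names are handed out from a running counter shared by every layer created in the same session: the original model uses up 'dense', 'dense_1' and 'dense_2', so the clone starts at 'dense_3'. The sketch below is a toy model of that idea in plain Python. It is not Keras's implementation, and NameRegistry and its methods are made-up names used only for illustration.

```python
from collections import defaultdict


class NameRegistry:
    """Toy stand-in for a session-wide name counter (hypothetical, not a Keras class)."""

    def __init__(self):
        self._counts = defaultdict(int)

    def unique_name(self, base):
        # The first request returns the bare name; later ones get a numeric suffix.
        n = self._counts[base]
        self._counts[base] += 1
        return base if n == 0 else f"{base}_{n}"

    def clear(self):
        # Forget all counters, so numbering starts over.
        self._counts.clear()


registry = NameRegistry()

# The original three-layer model takes the first three names.
original = [registry.unique_name("dense") for _ in range(3)]
# A clone built in the same session keeps counting upward.
clone = [registry.unique_name("dense") for _ in range(3)]

# After a reset, a rebuilt model starts from the bare name again.
registry.clear()
rebuilt = [registry.unique_name("dense") for _ in range(3)]

print(original)  # ['dense', 'dense_1', 'dense_2']
print(clone)     # ['dense_3', 'dense_4', 'dense_5']
print(rebuilt)   # ['dense', 'dense_1', 'dense_2']
```

Under this reading, getting the old names back requires resetting the counter before the model is rebuilt, rather than cloning it alongside the original.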

Answer 1

Score: 0

I am answering my own question. If someone has a better solution, kindly post it.

This is a workaround I found:

  1. Create a state_dict, as in PyTorch.
  2. Get the model architecture as JSON.
  3. Clear the Keras session and delete the model instance.
  4. Create a new model from the JSON within a tf.device context.
  5. Load the previous weights from state_dict.

# Step 1: snapshot every weight as a NumPy array, keyed by variable name.
state_dict = {}
for layer in model.layers:
    for weight in layer.weights:
        state_dict[weight.name] = weight.numpy()

# Steps 2 and 3: keep the architecture, then reset the session and drop the old model.
model_json_config = model.to_json()
tf.keras.backend.clear_session()  # this is crucial to get the previous names again
del model

# Step 4: rebuild the same architecture under the GPU device scope.
with tf.device("/GPU:0"):
    new_model = tf.keras.models.model_from_json(model_json_config)

# Step 5: restore the saved values, looking each one up by variable name.
for layer in new_model.layers:
    current_layer_weights = []
    for weight in layer.weights:
        current_layer_weights.append(state_dict[weight.name])
    layer.set_weights(current_layer_weights)
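
The five steps above can be wrapped into one helper, sketched here under a few assumptions. rebuild_on_device is a hypothetical name, not a Keras function. Instead of the name-keyed state_dict it uses model.get_weights() and new_model.set_weights(), which relies on the rebuilt model listing its weights in the same order; that should hold when it is built from the same JSON config. The device string falls back to the CPU when no GPU is visible, so the sketch also runs on a CPU-only machine. How weight names are formatted can differ between TensorFlow/Keras versions, so the check compares the names before and after the move rather than against fixed strings.

```python
import numpy as np
import tensorflow as tf


def rebuild_on_device(model, device):
    """Recreate `model` under `device` and copy its weights (hypothetical helper)."""
    weights = model.get_weights()      # list of NumPy arrays, in model.weights order
    config_json = model.to_json()      # architecture only, no weight values
    tf.keras.backend.clear_session()   # reset Keras' global state, as in the answer
    with tf.device(device):
        new_model = tf.keras.models.model_from_json(config_json)
    new_model.set_weights(weights)     # same architecture, so same order and shapes
    return new_model


# Assumption: use the first GPU if one is visible, otherwise stay on the CPU.
device = "/GPU:0" if tf.config.list_physical_devices("GPU") else "/CPU:0"

with tf.device("/CPU:0"):
    model = tf.keras.Sequential([
        tf.keras.layers.Dense(64, activation="relu", input_shape=(32,)),
        tf.keras.layers.Dense(10, activation="softmax"),
    ])

# Record names and values before the move.
old_names = [w.name for w in model.weights]
old_values = model.get_weights()

new_model = rebuild_on_device(model, device)

# Record them again afterwards and compare.
new_names = [w.name for w in new_model.weights]
new_values = new_model.get_weights()

print("names preserved:", new_names == old_names)
print("values preserved:",
      all(np.array_equal(a, b) for a, b in zip(old_values, new_values)))
```

Unlike the answer's code, the helper does not delete the original model; the caller can drop its own reference with del once the new model has been checked.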

huangapple
  • Posted on January 7, 2020, 01:58:48
  • When reposting, please keep the link to this article: https://go.coder-hub.com/59616788.html