How to move a tensorflow.keras model to GPU
Question
Let's say I have a Keras model like this:
import tensorflow as tf

with tf.device("/CPU"):
    model = tf.keras.Sequential([
        # Adds a densely-connected layer with 64 units to the model:
        tf.keras.layers.Dense(64, activation='relu', input_shape=(32,)),
        # Add another:
        tf.keras.layers.Dense(64, activation='relu'),
        # Add a softmax layer with 10 output units:
        tf.keras.layers.Dense(10, activation='softmax')])
I would like to move this model to GPU.
I tried doing this:
with tf.device("/GPU:0"):
    gpu_model = tf.keras.models.clone_model(model)
The problem is that the variable names change. For example, the name of the first layer's first weight in model, obtained from model.layers[0].weights[0].name, is:
> 'dense/kernel:0'
But the name of the same weight in gpu_model, obtained from gpu_model.layers[0].weights[0].name, is:
> 'dense_3/kernel:0'
How can I move the model to the GPU while also preserving the names of the variables?
I don't want to save the model to disk and load it again.
Answer 1
Score: 0
I am answering my own question. If someone has a better solution, kindly post it.
This is a workaround I found:
- Create a state_dict, as in PyTorch.
- Get the model architecture as JSON.
- Clear the Keras session and delete the model instance.
- Create a new model from the JSON within a tf.device context.
- Load the previous weights from state_dict.
state_dict = {}
for layer in model.layers:
    for weight in layer.weights:
        state_dict[weight.name] = weight.numpy()

model_json_config = model.to_json()
tf.keras.backend.clear_session()  # this is crucial to get previous names again
del model

with tf.device("/GPU:0"):
    new_model = tf.keras.models.model_from_json(model_json_config)

for layer in new_model.layers:
    current_layer_weights = []
    for weight in layer.weights:
        current_layer_weights.append(state_dict[weight.name])
    layer.set_weights(current_layer_weights)
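The same steps can be wrapped in a small helper. The following is only a sketch under a few assumptions: rebuild_on_device is a hypothetical name (not a Keras API), the weights are keyed by layer name via get_weights() rather than by variable name as in the snippet above (the format of weight.name may differ between Keras versions), and the example falls back to the CPU when no GPU is visible so that it still runs.

```python
import tensorflow as tf


def rebuild_on_device(model, device):
    """Hypothetical helper: rebuild `model` under `device`, keeping layer
    names and weight values. Nothing is written to disk."""
    # Snapshot every layer's weights as NumPy arrays, keyed by layer name.
    weights_by_layer = {layer.name: layer.get_weights() for layer in model.layers}
    config_json = model.to_json()

    # Reset Keras' global naming state before rebuilding, as in the
    # workaround above.
    tf.keras.backend.clear_session()

    with tf.device(device):
        new_model = tf.keras.models.model_from_json(config_json)
        for layer in new_model.layers:
            layer.set_weights(weights_by_layer[layer.name])
    return new_model


# Example model, similar to the one in the question (assumption: an explicit
# Input is used here instead of the input_shape argument).
with tf.device("/CPU:0"):
    model = tf.keras.Sequential([
        tf.keras.Input(shape=(32,)),
        tf.keras.layers.Dense(64, activation="relu"),
        tf.keras.layers.Dense(64, activation="relu"),
        tf.keras.layers.Dense(10, activation="softmax"),
    ])

original_layer_names = [layer.name for layer in model.layers]
original_weights = model.get_weights()

# Use the first GPU if one is visible, otherwise stay on the CPU.
target = "/GPU:0" if tf.config.list_physical_devices("GPU") else "/CPU:0"
new_model = rebuild_on_device(model, target)
```

Unlike the snippet above, the helper does not delete the original model. The caller can run del model afterwards if the CPU copy is no longer needed.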