Need GPU (CUDA) access while deploying the model


Question

I need assistance with deploying a pre-trained model. I have created a custom score.py file for the deployment process. However, the Docker container created on the CPU instance does not provide access to a GPU, which is a problem when predicting with PyTorch or TensorFlow models, as the inputs have to be converted to tensors and moved to the GPU. Can you suggest a solution?
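For context, the failure is easy to reproduce: with a CUDA-enabled PyTorch build on a host that has no NVIDIA driver, any call that moves a tensor to CUDA raises the same RuntimeError shown in the log further below.

```python
import torch

# On a machine without an NVIDIA GPU/driver this raises:
# RuntimeError: Found no NVIDIA driver on your system. ...
x = torch.zeros(1).to("cuda")
```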

My score.py script -

```python
import os
import json
import logging

import mlflow.pytorch
import torch
from pytorch_transformers import BertTokenizer

# Earlier attempt, kept for reference: patch torch.load so the pyfunc
# flavor also loads on CPU.
# original = torch.load
# def load(*args):
#     return torch.load(*args, map_location=torch.device("cpu"), pickle_module=None)
#
# def init():
#     global model
#     model_path = os.path.join(os.getenv("AZUREML_MODEL_DIR"), "use-case1-model")
#     # "model" is the path of the mlflow artifacts when the model was registered.
#     # For automl models, this is generally "mlflow-model".
#     with mock.patch("torch.load", load):
#         model = mlflow.pyfunc.load_model(model_path)
#     logging.info("Init complete")

def init():
    global model
    model_path = os.path.join(os.getenv("AZUREML_MODEL_DIR"), "use-case1-model")
    model = mlflow.pytorch.load_model(model_path, map_location=torch.device('cpu'))
    logging.info("Init complete")

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")

# nobert4token, X_padding, tag_padding and words_p are defined elsewhere
# in the full script and are omitted here.
def run(data):
    json_data = json.loads(data)
    title = json_data["input_data"]["title"]
    att = json_data["input_data"]["attributes"]
    result = {}
    for i in range(len(title)):
        my_dict = {}
        for j in range(len(att)):
            attr = att[i][j]
            t, a = nobert4token(tokenizer, title[i].lower(), attr)
            x = X_padding(t)
            y = tag_padding(a)
            tensor_a = torch.tensor(y, dtype=torch.int32)
            tensor_a = torch.unsqueeze(tensor_a, dim=0).to("cuda")  # fails on a CPU-only instance
            tensor_t = torch.tensor(x, dtype=torch.int32)
            tensor_t = torch.unsqueeze(tensor_t, dim=0).to("cuda")
            output = model([tensor_t, tensor_a])
            predict_list = output.tolist()[0]
            my_dict[attr] = " ".join(words_p)
        result[title[i]] = my_dict
    return result
```
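As a sanity check (not part of the original script), logging CUDA availability in init() surfaces the device mismatch at container startup instead of at the first scoring request. A minimal sketch:

```python
import logging
import torch

def init():
    ...
    # On a CPU-only instance this logs False, which means every later
    # .to("cuda") call in run() is bound to fail.
    logging.info("CUDA available: %s", torch.cuda.is_available())
```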

My invoke script-

```python
ml_client.online_endpoints.invoke(
    endpoint_name=endpoint_result.name,
    deployment_name=green_deployment_uc1.name,
    request_file=os.path.join("./dependencies", "sample.json"),
)
```
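For reference, run() indexes the payload as json_data["input_data"]["title"] and json_data["input_data"]["attributes"] (a list of attribute lists, one per title), so a sample.json along these lines should parse; the values here are made-up placeholders:

```json
{
  "input_data": {
    "title": ["some product title", "another product title"],
    "attributes": [["brand", "color"], ["brand", "size"]]
  }
}
```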

My conda.yaml-

```yaml
channels:
  - conda-forge
dependencies:
  - python=3.8
  - pip=22.1.2
  - numpy=1.21.2
  - scikit-learn=0.24.2
  - scipy=1.7.1
  - 'pandas>=1.1,<1.2'
  - pytorch=1.10.0
  - pip:
      - 'inference-schema[numpy-support]==1.5.0'
      - xlrd==2.0.1
      - mlflow==1.26.1
      - azureml-mlflow==1.42.0
      - tqdm==4.63.0
      - pytorch-transformers==1.2.0
      - pytorch-lightning==2.0.2
      - seqeval==1.2.2
      - azureml-inference-server-http==0.8.0
name: model-env
```

Error that I am getting -

```
127.0.0.1 - - [29/May/2023:10:03:32 +0000] "GET / HTTP/1.0" 200 7 "-" "kube-probe/1.18"
2023-05-29 10:03:34,291 E [70] azmlinfsrv - Encountered Exception: Traceback (most recent call last):
  File "/azureml-envs/azureml_d587e0800be72e17d773ddca63762cd1/lib/python3.8/site-packages/azureml_inference_server_http/server/user_script.py", line 130, in invoke_run
    run_output = self._wrapped_user_run(**run_parameters, request_headers=dict(request.headers))
  File "/azureml-envs/azureml_d587e0800be72e17d773ddca63762cd1/lib/python3.8/site-packages/azureml_inference_server_http/server/user_script.py", line 154, in <lambda>
    self._wrapped_user_run = lambda request_headers, **kwargs: self._user_run(**kwargs)
  File "/var/azureml-app/dependencies/score.py", line 129, in run
    tensor_a = torch.unsqueeze(tensor_a, dim=0).to("cuda")
  File "/azureml-envs/azureml_d587e0800be72e17d773ddca63762cd1/lib/python3.8/site-packages/torch/cuda/__init__.py", line 247, in _lazy_init
    torch._C._cuda_init()
RuntimeError: Found no NVIDIA driver on your system. Please check that you have an NVIDIA GPU and installed a driver from http://www.nvidia.com/Download/index.aspx
The above exception was the direct cause of the following exception:
```

If you are wondering why I used `model = mlflow.pytorch.load_model(model_path, map_location=torch.device('cpu'))`, please refer to this forum thread: https://learn.microsoft.com/en-us/answers/questions/1291498/facing-problem-while-deploying-model-on-azure-ml-a

Documentation - https://learn.microsoft.com/en-us/azure/machine-learning/how-to-deploy-mlflow-models-online-endpoints?view=azureml-api-2&tabs=sdk


Answer 1

Score: 0

To solve this issue, you can modify your code so that the tensors are loaded onto the CPU when no GPU is available.
Add a device variable to your code:

```python
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
```

Then replace the tensor-placement code in the **run()** function with:

```python
tensor_a = torch.tensor(y, dtype=torch.int32)
tensor_a = torch.unsqueeze(tensor_a, dim=0).to(device)
tensor_t = torch.tensor(x, dtype=torch.int32)
tensor_t = torch.unsqueeze(tensor_t, dim=0).to(device)
```

Below is an example of the error and the fix.

Error reproduced:

[![error screenshot][1]][1]

Fix:

[![fix screenshot][2]][2]

[1]: https://i.stack.imgur.com/kSi5E.png
[2]: https://i.stack.imgur.com/CzwnO.png
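For completeness, here is a minimal sketch of how the pieces fit together after this change; it assumes the rest of init() and run() stay exactly as in the question:

```python
import torch

# Resolve the device once; on the CPU instance this evaluates to "cpu".
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

def init():
    global model
    model_path = os.path.join(os.getenv("AZUREML_MODEL_DIR"), "use-case1-model")
    # Loading with map_location=device keeps the weights on the same
    # device that the input tensors will be moved to in run().
    model = mlflow.pytorch.load_model(model_path, map_location=device)
    logging.info("Init complete, using device: %s", device)
```

With the model weights and the input tensors on the same device, the forward pass in run() no longer triggers the lazy CUDA initialization that produced the RuntimeError above.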
