Load Saved XGBoost Model (.bst) from Google Cloud Storage

Question

I trained an XGBoost model and saved it to Google Cloud Storage as a "model.bst" file from a Vertex AI (Kubeflow) pipeline component. I am now trying to load it from a notebook in Vertex AI.

import xgboost as xgb
model = xgb.train(param, dtrain, num_boost_round=boost_rounds)
model.save_model(model_path)

I tried several approaches, and each one ended in a different error. For example:

fs = gcsfs.GCSFileSystem()
with fs.open(model_path, "rb") as f:
    model = model.load_model(f)
f.close()

The error with this solution is:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
/tmp/ipykernel_542318/3872730142.py in <module>
     14 
     15 with fs.open(model_path, "rb") as f:
---> 16     model = model.load_model(f)
     17 f.close()
     18 

/opt/conda/lib/python3.7/site-packages/xgboost/core.py in load_model(self, fname)
   2301                                                           length))
   2302         else:
-> 2303             raise TypeError('Unknown file type: ', fname)
   2304 
   2305         if self.attr("best_iteration") is not None:

TypeError: ('Unknown file type: ', <File-like object GCSFileSystem, bucket/model.bst>)
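
A note on this error: the `else` branch in the traceback suggests that `load_model` in this xgboost version accepts only a path (a string or `os.PathLike`) or a `bytearray`, not an open file object. A minimal sketch of one possible workaround, assuming that `bytearray` support, is to read the bytes first and pass them on. The helper name `load_booster_from_fileobj` is hypothetical, and the booster is passed in as an argument so the snippet does not depend on a particular xgboost installation.

```python
def load_booster_from_fileobj(fileobj, booster):
    """Read a binary file-like object and load its contents into `booster`.

    `booster` is expected to behave like xgboost.Booster, whose load_model
    is assumed here to accept a bytearray as well as a file path.
    """
    raw = fileobj.read()
    if not isinstance(raw, (bytes, bytearray)):
        raise TypeError("open the model file in binary mode ('rb')")
    # load_model is called for its side effect on `booster`, so the booster
    # itself is returned rather than load_model's return value.
    booster.load_model(bytearray(raw))
    return booster
```

With gcsfs this might be called as `model = load_booster_from_fileobj(f, xgb.Booster())` inside the `with fs.open(model_path, "rb") as f:` block. The `with` statement already closes the file, so the trailing `f.close()` is not needed.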

Another attempt:

from pathlib import Path

path_to_model = 'gs://...../model.bst'
path = Path(path_to_model)
booster_from_file = xgb.Booster(path)

Error: TypeError: 'PosixPath' object does not support item assignment
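
Two things seem to go wrong in this attempt, as far as I can tell. The first positional parameter of `xgb.Booster` is `params` rather than the model file, which would explain the item-assignment error. In addition, `pathlib` treats a `gs://...` string as a local filesystem path and collapses the double slash, so the result is no longer a valid URI. The sketch below illustrates the second point and shows one way to split such a URI into a bucket and an object name. `split_gcs_uri` and the example bucket are hypothetical, introduced only for illustration.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse


def split_gcs_uri(uri):
    """Split 'gs://bucket/path/to/object' into (bucket, object_name)."""
    parsed = urlparse(uri)
    if parsed.scheme != "gs" or not parsed.netloc:
        raise ValueError(f"not a gs:// URI: {uri!r}")
    return parsed.netloc, parsed.path.lstrip("/")


# pathlib normalizes the URI as if it were a filesystem path.
mangled = str(PurePosixPath("gs://example-bucket/models/model.bst"))
print(mangled)  # gs:/example-bucket/models/model.bst

bucket_name, blob_name = split_gcs_uri("gs://example-bucket/models/model.bst")
print(bucket_name, blob_name)  # example-bucket models/model.bst
```

The resulting bucket and object names are the kind of values that `storage.Client().bucket(...)` and `bucket.blob(...)` expect.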

Another attempt raises no error, but the loaded object has type NoneType:

from google.cloud import storage
storage_client = storage.Client()

bucket = 'bucket_name'
bucket_obj=storage_client.bucket(bucket)

path = '../model/.../md-5-lr-0.05-br-300/model.bst'
blob=bucket_obj.blob(path)

# Download the blob to a local file
model_file = 'model.bst'
blob.download_to_filename(model_file)

# Load the model from the local file
from_file = xgb.Booster() 
model_name = "model.bst"
model = from_file.load_model(model_name)
print(type(model))

# NoneType!
<class 'NoneType'>
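
The `None` here is most likely not a failed load. As far as I know, `Booster.load_model` fills the booster in place and returns nothing, so the loaded model would be `from_file` itself rather than the return value. A minimal sketch of that pattern follows; `load_local_booster` is a hypothetical helper, and the `booster_cls` parameter is an assumption added only so the snippet can be exercised without xgboost installed.

```python
def load_local_booster(model_file, booster_cls=None):
    """Create a booster, load `model_file` into it in place, and return it."""
    if booster_cls is None:
        import xgboost as xgb  # imported lazily; requires xgboost

        booster_cls = xgb.Booster
    booster = booster_cls()
    # load_model mutates `booster`; its return value is not the model.
    booster.load_model(model_file)
    return booster
```

Calling `model = load_local_booster("model.bst")` after the download step should then give a `Booster` object instead of `None`.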

Answer 1

Score: 2

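import xgboost as xgb
from google.cloud import storage

# bucket_name, blob_name, and local_file_name are placeholders to fill in
client = storage.Client()
bucket = client.get_bucket(bucket_name)
blob = bucket.blob(blob_name)
# Copy the model from Cloud Storage to the local disk
blob.download_to_filename(local_file_name)

# Load the booster from the local copy
bst = xgb.Booster(model_file=local_file_name)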


Also, per this documentation, "If your Cloud Storage bucket is in the same project you're using for AI Platform Training, then AI Platform Training can read from and write to your bucket. If not, you need to make sure that the project you are using to run AI Platform Training can access your Cloud Storage bucket..."

huangapple
  • Posted on April 19, 2023 at 22:31:58
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76055728.html