Saving H2O GridSearch as CSV
Question
I have the following code:
import h2o
from h2o.estimators.gbm import H2OGradientBoostingEstimator
from h2o.grid.grid_search import H2OGridSearch
h2o.init()
data = h2o.import_file('dataset.csv')
# 80/20 train/test split of the imported frame
train, test = data.split_frame(ratios=[0.8])
n_trees = [50, 100, 200, 300]
max_depth = [5, 6, 7]
learn_rate = [0.01, 0.05, 0.1]
min_rows = [10,15,20]
min_split_improvement = [0.00001, 0.0001]
hyper_parameters = {"ntrees": n_trees,
                    "max_depth": max_depth,
                    "learn_rate": learn_rate,
                    "min_rows": min_rows}
gs = H2OGridSearch(model=H2OGradientBoostingEstimator, hyper_params=hyper_parameters)
gs.train(x=train.columns, y=target_column, training_frame=train, validation_frame=test, distribution='bernoulli')
grid_perf = gs.get_grid(sort_by='auc', decreasing=True)
This runs a grid search of GBMs on the dataset.
I want to save the result of the grid search, grid_perf, as a CSV file, with something along the lines of:
h2o.export_file(grid_perf, 'grid_search_results.csv')
Note: the grid search code above works, so no debugging is necessary, thanks.
I tried the export_file line, but it fails with this error:
Argument python_obj should be a None | list | tuple | dict | numpy.ndarray | pandas.DataFrame | scipy.sparse.issparse, got H2OGridSearch
Answer 1
Score: 1
Would grid_perf._grid_json work for your case? Maybe _grid_json["summary_table"]?
Answer 2
Score: 0
Thanks to Adam Valenta for the suggestion.
Using that, the solution is:
grid_perf = gs.get_grid(sort_by='auc', decreasing=True)
table = grid_perf._grid_json['summary_table'].as_data_frame()
table.to_csv('GridSearch1.csv', index=False)
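For reuse, the two lines above could be wrapped in a small helper. This is only a sketch: save_summary_table and FakeSummaryTable are hypothetical names, not part of H2O, and the rows in the demo are made-up placeholder values so the snippet runs without an H2O cluster. It assumes only that the table object exposes as_data_frame(), as in the answer. Since _grid_json is a private attribute, it may change between H2O versions.

```python
import os
import tempfile

import pandas as pd


def save_summary_table(table, path):
    """Write a summary table to CSV and return it as a pandas DataFrame.

    `table` is any object exposing as_data_frame(), for example
    grid_perf._grid_json['summary_table'] from the answer above.
    """
    df = table.as_data_frame()
    df.to_csv(path, index=False)
    return df


class FakeSummaryTable:
    """Hypothetical stand-in for the H2O table, used only for this demo."""

    def __init__(self, rows):
        self._rows = rows

    def as_data_frame(self):
        return pd.DataFrame(self._rows)


# Placeholder rows (not real grid search results).
demo_table = FakeSummaryTable([
    {"model_ids": "gbm_model_1", "auc": 0.91},
    {"model_ids": "gbm_model_2", "auc": 0.88},
])
demo_path = os.path.join(tempfile.gettempdir(), "GridSearch_demo.csv")
saved = save_summary_table(demo_table, demo_path)
```

With a real grid, the call would presumably be save_summary_table(grid_perf._grid_json['summary_table'], 'GridSearch1.csv').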