How can I implement RandomizedSearchCV for GradientBoostingRegressor in scikit-learn instead of GridSearchCV?
I am trying to run a regression model using sklearn's GradientBoostingRegressor. I have seen some GridSearchCV implementations for hyperparameter tuning, but to reduce computation time I would like to use RandomizedSearchCV instead. Unfortunately, I could not get the two to work together. Could you please help me implement it?
My GridSearchCV script is below; unfortunately, I could not manage to convert it to RandomizedSearchCV with the gradient boosting estimator.
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import RandomizedSearchCV

print("Optimizing Hyperparameters..")

LR = {"learning_rate": [0.001],
      # "n_estimators": [10, 50, 100, 150, 500, 1000, 15000],
      "n_estimators": [1000, 3000, 5000, 7000, 10000],
      "max_depth": [1, 2, 3, 5, 7, 10]}

tuning = RandomizedSearchCV(estimator=GradientBoostingRegressor(), param_distributions=LR)
tuning.fit(X_train, y_train)
print("Best Parameters found: ", tuning.best_params_)

n_parameter = tuning.best_params_["n_estimators"]
lr_parameter = tuning.best_params_["learning_rate"]
md_parameter = tuning.best_params_["max_depth"]
Answer 1
Score: 1
I ran the following code snippet and everything worked.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import RandomizedSearchCV

# basic proof of concept dataset
X_val = np.random.randn(10, 4)
y_val = np.random.randn(10)

LR = {
    "learning_rate": [0.001],
    "n_estimators": [1000, 3000, 5000, 7000, 10000],
    "max_depth": [1, 2, 3, 5, 7, 10],
}

tuning = RandomizedSearchCV(
    estimator=GradientBoostingRegressor(), param_distributions=LR, scoring="r2"
)
tuning.fit(X_val, y_val)
print("Best Parameters found: ", tuning.best_params_)
And it printed out:
Best Parameters found:
{'n_estimators': 3000, 'max_depth': 1, 'learning_rate': 0.001}
Since the code snippet above works, the error must be caused by something else in your script/notebook. Perhaps it has something to do with your dataset?
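As a side note: since the original motivation was reducing computation time, it may also help to pass scipy distributions (rather than fixed lists) together with an explicit `n_iter`, so RandomizedSearchCV samples a small number of candidates from a continuous space. A minimal sketch, using a synthetic dataset for illustration only:

```python
import numpy as np
from scipy.stats import randint, uniform
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import RandomizedSearchCV

# Synthetic data, for illustration only.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 4))
y = rng.normal(size=50)

# Distributions instead of fixed lists: each of the n_iter candidates
# samples a fresh value, so the search is not limited to a grid.
param_dist = {
    "learning_rate": uniform(0.001, 0.1),  # uniform over [0.001, 0.101]
    "n_estimators": randint(100, 2000),    # integers in [100, 2000)
    "max_depth": randint(1, 11),           # integers in [1, 11)
}

tuning = RandomizedSearchCV(
    estimator=GradientBoostingRegressor(),
    param_distributions=param_dist,
    n_iter=10,       # number of sampled candidates (10 is the default)
    cv=3,
    random_state=0,  # reproducible sampling
    n_jobs=-1,       # fit candidates in parallel
)
tuning.fit(X, y)
print("Best Parameters found: ", tuning.best_params_)
```

With fixed lists, `n_iter` caps how many of the possible combinations are tried; with distributions, it directly controls the total number of fits, which is the main lever for cutting computation time.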