How can I implement RandomizedSearchCV for GradientBoostingRegressor in scikit-learn instead of GridSearchCV?
I am trying to run a regression model using sklearn's GradientBoostingRegressor. I have seen some GridSearchCV implementations for hyperparameter tuning, but to reduce computation time I would like to use RandomizedSearchCV instead. Unfortunately, I could not get the two to work together. Could you please help me implement it?
My GridSearchCV script is below; unfortunately, I could not manage to convert it to RandomizedSearchCV with the gradient boosting estimator.
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import RandomizedSearchCV

print("Optimizing Hyperparameters..")

LR = {"learning_rate": [0.001],
      # "n_estimators": [10, 50, 100, 150, 500, 1000, 15000],
      "n_estimators": [1000, 3000, 5000, 7000, 10000],
      "max_depth": [1, 2, 3, 5, 7, 10]}

tuning = RandomizedSearchCV(estimator=GradientBoostingRegressor(), param_distributions=LR)
tuning.fit(X_train, y_train)
print("Best Parameters found: ", tuning.best_params_)

n_parameter = tuning.best_params_["n_estimators"]
lr_parameter = tuning.best_params_["learning_rate"]
md_parameter = tuning.best_params_["max_depth"]
Answer 1
Score: 1
I ran the following code snippet and everything worked.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import RandomizedSearchCV

# basic proof of concept dataset
X_val = np.random.randn(10, 4)
y_val = np.random.randn(10)

LR = {
    "learning_rate": [0.001],
    "n_estimators": [1000, 3000, 5000, 7000, 10000],
    "max_depth": [1, 2, 3, 5, 7, 10],
}

tuning = RandomizedSearchCV(
    estimator=GradientBoostingRegressor(), param_distributions=LR, scoring="r2"
)
tuning.fit(X_val, y_val)
print("Best Parameters found: ", tuning.best_params_)
And it printed out:
Best Parameters found:
{'n_estimators': 3000, 'max_depth': 1, 'learning_rate': 0.001}
Since the code snippet above works, the error must be caused by something else in your script/notebook. Perhaps it has something to do with your dataset?
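As a side note: since the original motivation was reducing computation time, it may also help to pass scipy distributions (rather than fixed lists) together with an explicit `n_iter`, so RandomizedSearchCV samples a small number of candidates from a continuous space. A minimal sketch, using a synthetic dataset for illustration only:

```python
import numpy as np
from scipy.stats import randint, uniform
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import RandomizedSearchCV

# Synthetic data, for illustration only.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 4))
y = rng.normal(size=50)

# Distributions instead of fixed lists: each of the n_iter candidates
# samples a fresh value, so the search is not limited to a grid.
param_dist = {
    "learning_rate": uniform(0.001, 0.1),  # uniform over [0.001, 0.101]
    "n_estimators": randint(100, 2000),    # integers in [100, 2000)
    "max_depth": randint(1, 11),           # integers in [1, 11)
}

tuning = RandomizedSearchCV(
    estimator=GradientBoostingRegressor(),
    param_distributions=param_dist,
    n_iter=10,       # number of sampled candidates (10 is the default)
    cv=3,
    random_state=0,  # reproducible sampling
    n_jobs=-1,       # fit candidates in parallel
)
tuning.fit(X, y)
print("Best Parameters found: ", tuning.best_params_)
```

With fixed lists, `n_iter` caps how many of the possible combinations are tried; with distributions, it directly controls the total number of fits, which is the main lever for cutting computation time.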