Question

I've been taking the Applied Machine Learning in Python course on Coursera, and in the week 4 assignment I found something interesting. On my first attempt I used RandomForestClassifier from sklearn to predict labels, but the model overfit and showed poor test accuracy. As an experiment I switched to RandomForestRegressor and, surprisingly, not only did it not overfit, but the test accuracy was also a lot higher. So why does RandomForestRegressor perform so much better on a binary classification problem?

Answer 1

Score: 2

The Random Forest regressor differs somewhat from the Random Forest classifier in how it combines the decision trees:

The classifier uses the mode of the classes predicted by the decision trees
The regressor uses the mean of the values predicted by the decision trees

(Strictly speaking, scikit-learn's RandomForestClassifier averages the trees' class-probability estimates and predicts the class with the highest mean probability, rather than taking a plain majority vote.)

Because of this difference, the two models can give different results, and in some cases the regressor may perform better than the classifier.

That said, if you tune your hyperparameters properly, the classifier should perform better on a classification problem than the regressor.
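The difference between the two aggregation rules can be illustrated with a small sketch. It does not use sklearn. The per-tree outputs below are made-up numbers, and the function names `majority_vote` and `mean_then_threshold` are hypothetical helpers written only for this illustration. It compares a textbook majority vote (mode) with averaging the raw tree outputs and thresholding the mean at 0.5, which is one way a regressor's output could be turned into a binary label.

```python
from statistics import mean, mode

# Hypothetical per-tree outputs for one sample: each value stands for the
# fraction of positive training labels in the leaf the sample lands in.
# These numbers are invented for illustration only.
leaf_fractions = [0.9, 0.4, 0.4]


def majority_vote(fractions, threshold=0.5):
    """Each tree casts a hard 0/1 vote; the ensemble returns the mode."""
    votes = [int(f >= threshold) for f in fractions]
    return mode(votes)


def mean_then_threshold(fractions, threshold=0.5):
    """Average the raw tree outputs, then convert the mean to a 0/1 label."""
    return int(mean(fractions) >= threshold)


# The hard votes are [1, 0, 0], so the mode is 0.
print(majority_vote(leaf_fractions))
# The mean is about 0.567, which is above the 0.5 threshold.
print(mean_then_threshold(leaf_fractions))
```

Here one confident tree outweighs two mildly negative ones under averaging but is outvoted under the mode, so the two rules disagree on the same trees. Whether this explains the gap seen in the assignment depends on the data and on how the regressor's continuous output was scored.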
