February 8, 2023, 23:36:59

Using sample_weight param with XGBoost through a pipeline

Question

I want to use the sample_weight parameter with XGBClassifier from the xgboost package.

The problem occurs when I try to use it inside a Pipeline from sklearn.pipeline.

from sklearn.preprocessing import MinMaxScaler
from sklearn.pipeline import Pipeline
from xgboost import XGBClassifier

clf = XGBClassifier(**params)
steps = [('scaler', MinMaxScaler()), ('classifier', clf)]
pipeline = Pipeline(steps)

When I run pipeline.fit(x, y, sample_weight=sample_weight), where sample_weight is just a dictionary of integer weights, I get the following error:

> ValueError: Pipeline.fit does not accept the sample_weight parameter.

How can I solve this problem? Is there a workaround? I have seen that an issue already exists.

Answer 1

Score: 2

The ValueError message is accurate: the Pipeline class itself contains no logic for handling sample weights.

However, your pipeline has two steps, and one of them, the XGBoost classifier, does support sample weights.

So the solution is to address the sample weight parameter directly to the classifier step. Following scikit-learn conventions, you do this by prepending the classifier__ prefix (the step name "classifier" plus two underscore characters) to the fit parameter name. The parameter name after the prefix must match the keyword that the step's fit method expects, which for XGBClassifier is sample_weight (singular).

In short:

pipeline = Pipeline(steps)
pipeline.fit(X, y, classifier__sample_weight=weights)
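For a fuller picture, here is a minimal end-to-end sketch of the prefix convention. It rests on a few assumptions that are not in the question: the toy data is made up, the weights dictionary is assumed to map each class label to an integer weight, and n_estimators=10 is an arbitrary choice. Note that the keyword XGBClassifier.fit expects is sample_weight (singular), and it takes one weight per training row rather than a dictionary, so the sketch first expands the dictionary with scikit-learn's compute_sample_weight helper.

```python
import numpy as np
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler
from sklearn.utils.class_weight import compute_sample_weight

try:
    from xgboost import XGBClassifier as Classifier
except ImportError:
    # Stand-in so the sketch still runs without xgboost installed;
    # its fit() also accepts sample_weight and the routing is identical.
    from sklearn.ensemble import GradientBoostingClassifier as Classifier

# Toy data (made up for illustration): 20 rows, 3 features, binary labels.
rng = np.random.default_rng(0)
X = rng.random((20, 3))
y = np.array([0] * 15 + [1] * 5)

# Assumption: the dictionary maps class label -> integer weight.
class_weights = {0: 1, 1: 3}

# Expand the dictionary into one weight per row, aligned with y.
weights = compute_sample_weight(class_weight=class_weights, y=y)

pipeline = Pipeline([
    ("scaler", MinMaxScaler()),
    ("classifier", Classifier(n_estimators=10)),
])

# "<step name>__<parameter>" forwards the keyword to that step's fit().
pipeline.fit(X, y, classifier__sample_weight=weights)
predictions = pipeline.predict(X)
```

If your dictionary is keyed by something other than the class label (for example by row index), you would need to build the per-row array differently; the only requirement on the fit side is an array-like with one entry per row of X.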
