June 15, 2023 19:22:35

Custom loss in XGBoost is not updating

Context

I am trying to use a custom loss function for an XGBoost binary classifier.

The idea was to implement in XGBoost the soft-Fbeta loss, which I read about here. Simply put: instead of using the standard logloss, use a loss function that directly optimises the Fbeta score.

Caveat

Of course, the Fbeta itself is not differentiable, so it can't be used straight out of the box. However, the idea is to use the probabilities (hence, before thresholding) to create some sort of continuous TP, FP and FN. Find more details in the referenced Medium article.
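To make the caveat concrete, here is a minimal sketch of what a "soft" Fbeta could look like, assuming the usual construction where probabilities replace hard 0/1 predictions in the confusion-matrix counts. The function name `soft_fbeta` and the toy arrays are illustrative and do not come from the referenced article.

```python
import numpy as np


def soft_fbeta(y: np.ndarray, p: np.ndarray, beta: float) -> float:
    """Soft F-beta: confusion-matrix counts built from probabilities."""
    tp = np.sum(p * y)          # soft true positives
    fp = np.sum(p * (1 - y))    # soft false positives
    fn = np.sum((1 - p) * y)    # soft false negatives
    return (1 + beta**2) * tp / ((1 + beta**2) * tp + beta**2 * fn + fp)


# Toy labels and predicted probabilities (made up for illustration).
y = np.array([1, 0, 1, 1, 0])
p = np.array([0.9, 0.2, 0.6, 0.4, 0.1])
beta = 5.0

score = soft_fbeta(y, p, beta)

# The denominator simplifies to p.sum() + beta**2 * y.sum(),
# which is the quantity called D in the attempt below.
D = p.sum() + beta**2 * y.sum()
same_score = (1 + beta**2) * np.dot(p, y) / D
```

Because every term is a smooth function of `p`, this score can be differentiated, which is what the gradient and Hessian in the attempt rely on.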

Attempt

My attempt was the following (inspired by a few different people).

import numpy as np
import xgboost as xgb
def gradient(y: np.array, p: np.array, beta: float):
    """Compute the gradient of the loss function. y is the true label, p
    the probability predicted by the model """
    
    # Define the denominator
    D = p.sum() + beta**2 * y.sum() 
    
    # Compute the gradient
    grad = (1 + beta**2) * y / D - (1 + beta**2) * (np.dot(p, y)) / D**2 
        
    return grad
def hessian(y: np.array, p: np.array, beta: float):
    """Compute the Hessian of the loss function. y is the true label, p
    the probability predicted by the model """
    
    # Define the denominator
    D = p.sum() + beta**2 * y.sum() 
    
    # Tensor sum y_i + y_j
    tensor_sum = y + y[:, None]
    
    # Compute the hessian
    hess = (1 + beta**2) / D**2 * (-tensor_sum + 2*np.dot(p, y) / D)
    
    return hess
def f_smooth_loss(beta: float):
    
    """ Custom loss function for maximising F score"""
    def custom_loss(y: np.array, p: np.array):
                
        # Actual custom loss
        b = beta
        
        # Compute grad
        grad = - gradient(y, p, b)
        
        # Compute hessian
        hess = - hessian(y, p, b)
                  
        return grad, hess
        
    return custom_loss
# Random train dataset
X_train = np.random.rand(100, 100)
y_train = np.random.randint(0, 2, 100)
# Random validation dataset
X_validation = np.random.rand(1000, 100)
y_validation = np.random.randint(0, 2, 1000)
# Define a classifier trying to maximise F5 score
model = xgb.XGBClassifier(objective=f_smooth_loss(5))
# Fit
model.fit(X_train, y_train,  eval_set=[(X_train, y_train), (X_validation, y_validation)])

Output

The model runs, but the output is apparently stuck, no matter what:

[0]	validation_0-logloss:0.69315	validation_1-logloss:0.69315
[1]	validation_0-logloss:0.69315	validation_1-logloss:0.69315
[2]	validation_0-logloss:0.69315	validation_1-logloss:0.69315
[3]	validation_0-logloss:0.69315	validation_1-logloss:0.69315

Comments

1. It is possible my derivatives are not correct, even though I double checked them. However, even changing the grad and hess to constant numbers, nothing changes.
2. The Hessian here is a matrix (which would be its mathematical definition), but I think XGBoost expects a 1D array (I think it is the diagonal). However, because of point 1, nothing changes even if I change it to a 1D array.
3. Essentially, this model always predicts zeros, and does not update at all.
4. Changing the size of the (fake) dataset does not lead to any change in the logloss (even more, the numbers are exactly the same).
5. Curiously, the logloss is the same in the validation and train sets, which is yet another signal that there is something deeply wrong somewhere.
6. If I switch to the standard logloss (built-in), it updates (outputs are random, as the dataset is random).
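One way to test the first two points in isolation is a finite-difference check. The sketch below assumes the score being differentiated is F = (1 + beta^2) * (p . y) / D, which is inferred from the denominator in the attempt rather than stated in the question. The helper `numeric_gradient` is hypothetical, and `gradient` and `hessian` are copied from the attempt. It only checks the derivatives with respect to the probabilities `p`; it says nothing about what XGBoost actually passes to the objective.

```python
import numpy as np


def soft_fbeta(y, p, beta):
    """Assumed score: (1 + beta^2) * (p . y) / D, with D as in the attempt."""
    D = p.sum() + beta**2 * y.sum()
    return (1 + beta**2) * np.dot(p, y) / D


def gradient(y, p, beta):
    """Analytic dF/dp_i, copied from the attempt."""
    D = p.sum() + beta**2 * y.sum()
    return (1 + beta**2) * y / D - (1 + beta**2) * np.dot(p, y) / D**2


def hessian(y, p, beta):
    """Analytic d2F/(dp_i dp_j), copied from the attempt: an (n, n) matrix."""
    D = p.sum() + beta**2 * y.sum()
    tensor_sum = y + y[:, None]
    return (1 + beta**2) / D**2 * (-tensor_sum + 2 * np.dot(p, y) / D)


def numeric_gradient(y, p, beta, eps=1e-6):
    """Central finite differences of soft_fbeta with respect to each p_i."""
    num = np.zeros_like(p)
    for i in range(p.size):
        step = np.zeros_like(p)
        step[i] = eps
        upper = soft_fbeta(y, p + step, beta)
        lower = soft_fbeta(y, p - step, beta)
        num[i] = (upper - lower) / (2 * eps)
    return num


rng = np.random.default_rng(0)
y = rng.integers(0, 2, 20).astype(float)
p = rng.uniform(0.05, 0.95, 20)

grad = gradient(y, p, beta=5.0)
hess_matrix = hessian(y, p, beta=5.0)
hess_diag = np.diag(hess_matrix)  # one second derivative per sample

gradient_matches = np.allclose(grad, numeric_gradient(y, p, 5.0), atol=1e-6)
```

If `gradient_matches` is true, the formula in point 1 is consistent with this assumed score. As far as I know, a custom objective has to return one gradient and one Hessian value per sample, so the `(n, n)` matrix would need to be reduced to something like `hess_diag` before being returned.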

Question

What is wrong in my implementation? XGB docs are pretty hard to decipher, and I can't really tell if I am missing a simple building block here.

Answer 1

Score: 2

The problem is that, according to the docs, the custom loss function needs the following parameters as input:


....
from typing import Tuple

def f_smooth_loss(beta: float):
    
    """ Custom loss function for maximising F score"""
    def custom_loss(
        predt: np.ndarray,
        dtrain: xgb.DMatrix
    ) -> Tuple[np.ndarray, np.ndarray]:
                
        # Labels are stored in the DMatrix; gradient() and hessian() expect an array
        y = dtrain.get_label()
        
        # Actual custom loss
        b = beta
        
        # Compute grad
        grad = - gradient(y, predt, b)
        
        # Compute hessian
        hess = - hessian(y, predt, b)
                  
        return grad, hess
        
    return custom_loss

Update: according to the documentation referenced above, it seems that you need to pass the function to xgb.train() through the obj argument, not when initializing the model, e.g.:

dtrain = xgb.DMatrix(X_train, label=y_train)  # wrap the training data from the question
xgb.train({'tree_method': 'hist', 'seed': 1994},  # any other tree method is fine.
           dtrain=dtrain,
           num_boost_round=10,
           obj=f_smooth_loss(5))

Also, note that the .fit() method belongs to the wrapper that XGBoost provides as an interface to other sklearn objects (e.g. sklearn.pipeline), so it might lack this functionality. It is safer to use the native xgb.train().
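Putting the pieces together, here is a hedged sketch of an objective with the `(predt, dtrain)` signature. It rests on several assumptions that are not in the answer above: that `predt` arrives as the raw margin (my reading of the XGBoost docs for custom objectives), that the loss is `1 - F` with F the soft Fbeta score, and that only the diagonal of the Hessian is returned. The name `make_soft_fbeta_objective` and the `hess_floor` safeguard are hypothetical.

```python
import numpy as np


def make_soft_fbeta_objective(beta: float, hess_floor: float = 1e-6):
    """Build an objective with the (predt, dtrain) signature.

    Assumptions in this sketch: predt is the raw margin, dtrain exposes
    get_label(), and the loss is 1 - soft F-beta.
    """

    def objective(predt, dtrain):
        y = np.asarray(dtrain.get_label(), dtype=float)
        z = np.asarray(predt, dtype=float).reshape(-1)
        p = 1.0 / (1.0 + np.exp(-z))  # margin -> probability

        D = p.sum() + beta**2 * y.sum()
        py = np.dot(p, y)

        # First and (diagonal) second derivatives of the score w.r.t. p_i.
        dF = (1 + beta**2) * (y / D - py / D**2)
        d2F = (1 + beta**2) / D**2 * (-2 * y + 2 * py / D)

        # Chain rule through the sigmoid:
        # dp/dz = p(1-p), d2p/dz2 = p(1-p)(1-2p).
        dp = p * (1 - p)
        d2p = dp * (1 - 2 * p)

        grad = -dF * dp  # loss = 1 - F, so the sign flips
        hess = -(d2F * dp**2 + dF * d2p)

        # Hypothetical safeguard: clip so every per-sample value is positive.
        hess = np.maximum(hess, hess_floor)
        return grad, hess

    return objective
```

Given a `dtrain` DMatrix, this could be passed as `obj=make_soft_fbeta_objective(5)` to `xgb.train`. Both returned arrays are 1D with one entry per sample. The clipping is a crude workaround for negative second derivatives, not something the XGBoost docs prescribe, and I have not verified that it trains well.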

Answer 2

Score: 0

Please change the classifier from objective=f_smooth_loss(5) to scoring=f_smooth_loss(5):

model = xgb.XGBClassifier(scoring = f_smooth_loss(5))
