Why are the SHAP values for some features in my model always negative (or positive)?

Question

I am explaining a `HistGradientBoostingRegressor` from scikit-learn. I use the `TreeExplainer` from the `shap` package to get the SHAP values. My model has 48 features, some of which are strongly correlated. The beeswarm summary plot of the model looks like this:

[Figure: beeswarm summary plot of the model's SHAP values]

How is it possible that the SHAP values of some features are consistently negative (or positive) and not centered at 0?

I have never seen this anywhere else. Intuitively, it does not make sense to me, since the model could simply add a constant to avoid the shift.

Answer 1

Score: 1

As is often the case, formulating the question helped me discover the answer. The problem arose because I accidentally scaled the data the model was trained on differently from the data I passed to the explanation algorithm.

A similar question has been answered on GitHub: https://github.com/slundberg/shap/issues/553
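
A quick sanity check along the following lines might catch this kind of mismatch before any SHAP values are computed. It is only a minimal sketch under stated assumptions: `find_scale_mismatch`, its `tol` threshold, and the synthetic data are hypothetical illustrations, not part of `shap` or scikit-learn. The idea is to compare each feature's mean in the data passed to the explainer against the mean and standard deviation seen in training.

```python
import numpy as np


def find_scale_mismatch(X_train, X_explain, tol=1.0):
    """Hypothetical helper: return indices of features whose mean in
    X_explain lies more than `tol` training standard deviations away
    from the training mean."""
    X_train = np.asarray(X_train, dtype=float)
    X_explain = np.asarray(X_explain, dtype=float)
    train_mean = X_train.mean(axis=0)
    train_std = X_train.std(axis=0)
    # Guard against constant features to avoid division by zero.
    safe_std = np.where(train_std > 0, train_std, 1.0)
    shift = np.abs(X_explain.mean(axis=0) - train_mean) / safe_std
    return np.flatnonzero(shift > tol)


# Synthetic data, purely for illustration.
rng = np.random.default_rng(0)
X_raw = rng.normal(loc=50.0, scale=10.0, size=(500, 3))

# The model is assumed to have been trained on standardized features.
mu, sigma = X_raw.mean(axis=0), X_raw.std(axis=0)
X_train = (X_raw - mu) / sigma

# Mistake being illustrated: feature 1 reaches the explainer unscaled.
X_explain_bad = X_train.copy()
X_explain_bad[:, 1] = X_raw[:, 1]

# Correct case: the explainer sees data scaled exactly like the training data.
X_explain_ok = X_train.copy()

mismatched = find_scale_mismatch(X_train, X_explain_bad)
consistent = find_scale_mismatch(X_train, X_explain_ok)
print("features with a suspicious shift:", mismatched)
print("features flagged in the consistent case:", consistent)
```

If the check flags any features, it is worth confirming that the same fitted scaler (for example, one kept alongside the model in a scikit-learn `Pipeline`) is applied to the data before it is passed to the explainer.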

huangapple
  • Posted on May 17, 2023, at 17:40:22
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76270672.html