Why are the SHAP values for some features in my model always negative (or positive)?
Question
I am explaining a HistGradientBoostingRegressor from scikit-learn. I use the TreeExplainer from the shap package to get the SHAP values. My model has 48 features, some of which are strongly correlated. The beeswarm summary plot of the model looks like this:
[Figure: SHAP beeswarm summary plot]
How is it possible that the SHAP values of some features are consistently negative (or positive) and not centered at 0?
I have never seen this anywhere else. Intuitively, it does not make sense to me, since the model could just add a constant to avoid the shift.
Answer 1
Score: 1
As is often the case, formulating the question helped me discover the answer. The problem arose because I accidentally scaled the data the model was trained on differently from the data I passed to the explanation algorithm.
A similar question has been answered on GitHub: https://github.com/slundberg/shap/issues/553
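To illustrate why a scaling mismatch produces one-sided SHAP values, here is a minimal sketch that uses only the Python standard library. It is not the original model or the shap API: it swaps in a toy one-feature linear model, for which the SHAP value has the closed form w * (z - E[z]), with E[z] taken over the training data. The feature values, the weight `w`, and the helper names `scale` and `linear_shap` are all hypothetical and chosen only for illustration.

```python
import random
import statistics

random.seed(0)

# Hypothetical raw feature with a large offset (assumed values, for illustration).
raw_train = [random.gauss(300.0, 10.0) for _ in range(1000)]

# Standardization parameters fitted on the training data.
mu = statistics.fmean(raw_train)
sigma = statistics.pstdev(raw_train)


def scale(values):
    """Apply the same standardization that was used at training time."""
    return [(v - mu) / sigma for v in values]


scaled_train = scale(raw_train)

# Toy linear model f(z) = w * z, assumed to be trained on the *scaled* feature.
w = 2.0
background_mean = statistics.fmean(scaled_train)  # approximately 0


def linear_shap(values):
    """Closed-form SHAP value of a one-feature linear model: w * (z - E[z])."""
    return [w * (v - background_mean) for v in values]


raw_explain = [random.gauss(300.0, 10.0) for _ in range(200)]

# Mistake: explaining unscaled data with a model trained on scaled data.
shap_mismatched = linear_shap(raw_explain)
# Fix: apply the training-time scaling before explaining.
shap_consistent = linear_shap(scale(raw_explain))

print("mismatched range:", min(shap_mismatched), max(shap_mismatched))
print("consistent range:", min(shap_consistent), max(shap_consistent))
```

With mismatched scaling, every explained row lies far from the background mean the model saw during training, so all SHAP values for that feature share the same sign. With consistent scaling they spread around 0. In a real workflow, one way to avoid the mismatch is to keep the fitted scaler next to the model and transform the data with it before passing it to the explainer.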