Sensitive feature in Fairlearn
Question
I am using Fairlearn functions like this:
eor = fairlearn.metrics.equalized_odds_ratio(y_true, y_pred, sensitive_features=sensitive_feature)
dpd = fairlearn.metrics.demographic_parity_difference(y_true, y_pred, sensitive_features=sensitive_feature)
di = fairlearn.metrics.demographic_parity_ratio(y_true, y_pred, sensitive_features=sensitive_feature)
where y_pred is binary and holds the computed predictions, y_true is also binary and holds the truth labels, and sensitive_feature is a dataframe consisting of one column of 1s and 0s. For example, when measuring the metrics for the groups young and old, 1 would represent young and 0 would represent old, so old is the protected group. What if young is the protected group? Do I then have to invert the column in my dataframe sensitive_feature and supply it again to the Fairlearn functions?
Answer 1
Score: 0
Fairlearn maintainer here!
No, you don't need to change anything. It doesn't matter for the outcome of these functions. For example, demographic parity only looks at y_pred and ignores y_true. Let's say "young" has a selection rate (fraction of 1s) of 0.8 and "old" has a selection rate of 0.6. The demographic parity difference is always max - min, that is 0.8 - 0.6 = 0.2. If the other group were the max, the outcome would be the same. For the ratio, it's min/max, so 0.6/0.8 = 0.75.
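To make the arithmetic concrete, here is a minimal hand-rolled sketch in plain Python. It is not Fairlearn's implementation; the helper names (selection_rates, dp_difference, dp_ratio) and the toy data are made up for illustration, and it only mirrors the max - min and min/max description above.

```python
def selection_rates(y_pred, sensitive):
    """Fraction of positive predictions (1s) within each group."""
    totals, positives = {}, {}
    for pred, group in zip(y_pred, sensitive):
        totals[group] = totals.get(group, 0) + 1
        positives[group] = positives.get(group, 0) + pred
    return {g: positives[g] / totals[g] for g in totals}


def dp_difference(y_pred, sensitive):
    # Largest selection rate minus smallest; group labels play no role.
    rates = list(selection_rates(y_pred, sensitive).values())
    return max(rates) - min(rates)


def dp_ratio(y_pred, sensitive):
    # Smallest selection rate divided by largest.
    # Assumes at least one group has a nonzero selection rate.
    rates = list(selection_rates(y_pred, sensitive).values())
    return min(rates) / max(rates)


# Hypothetical toy data: 5 "young" (1) with 4 selected, 5 "old" (0) with 3 selected.
y_pred = [1, 1, 1, 1, 0, 1, 1, 1, 0, 0]
sensitive = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
flipped = [1 - s for s in sensitive]  # swap which group is coded as 1

# Roughly 0.2 and 0.75, matching the numbers in the answer.
print(round(dp_difference(y_pred, sensitive), 2), round(dp_ratio(y_pred, sensitive), 2))

# Inverting the encoding leaves both metrics unchanged.
print(dp_difference(y_pred, sensitive) == dp_difference(y_pred, flipped))
print(dp_ratio(y_pred, sensitive) == dp_ratio(y_pred, flipped))
```

Swapping the 0/1 coding only renames the groups; the set of per-group selection rates is the same, so the max and min are the same too.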
If you have more than 2 groups it'll still work, but it only considers the max and min groups. Any groups in between won't be represented by this particular measure.
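A small sketch of the multi-group point, again in plain Python rather than Fairlearn itself. The third group "middle" and all the rates are hypothetical; the spread helper is just the max - min and min/max rule applied to a dict of selection rates.

```python
def spread(group_rates):
    """Return (max - min, min / max) over per-group selection rates."""
    values = list(group_rates.values())
    return max(values) - min(values), min(values) / max(values)


# Hypothetical selection rates for three groups.
rates = {"young": 0.8, "middle": 0.7, "old": 0.6}
diff_a, ratio_a = spread(rates)

# Move the in-between group almost down to the minimum.
rates_moved = dict(rates, middle=0.61)
diff_b, ratio_b = spread(rates_moved)

# The in-between group does not affect either number.
print(diff_a == diff_b, ratio_a == ratio_b)  # True True
```

As long as "middle" stays between the extremes, its selection rate can change freely without moving the difference or the ratio.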