Pandas - Scale within a group using a value from the group

Question

My data consist of groups which have received a variety of treatments and then had some result measured, similar to this:

import pandas as pd

X = pd.DataFrame({
    'group':['A','A','A','B','B','B'],
    'treatment':['control', 'high_dose', 'low_dose', 'control', 'high_dose', 'low_dose'],
    'result':[2, 6, 4, 3, 12, 15]})

  group  treatment  result
0     A    control       2
1     A  high_dose       6
2     A   low_dose       4
3     B    control       3
4     B  high_dose      12
5     B   low_dose      15

I would like to scale the results within each group using the control value within each group to achieve a result like this:

  group  treatment  result  result_group_stand
0     A    control       2                    1
1     A  high_dose       6                    3
2     A   low_dose       4                    2
3     B    control       3                    1
4     B  high_dose      12                    4
5     B   low_dose      15                    5

Here every result in group "A" has been divided by that group's control value of 2, and every result in group "B" by its control value of 3. All of the examples I have seen use groupby to scale by a summary measurement (sum, max, min, etc.), but I can't find an example that uses the value of a specific treatment within the group. Thanks for any help.

Answer 1

Score: 2

Build a group-to-control lookup with boolean indexing and set_index, then apply it to the group column with map:

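# Select the control rows, index them by group, and map each row's group
# to its control result before dividing.
X['result_group_stand'] = (X['result']
                           .div(X['group']
                                .map(X[X['treatment'].eq('control')]
                                      .set_index('group')['result'])
                               )
                          )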

Or with groupby.transform:

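# Keep only the control results (all others become NaN), then broadcast the
# first non-null value of each group to every row of that group.
X['result_group_stand'] = (X['result']
                           .div(X['result'].where(X['treatment'].eq('control'))
                                .groupby(X['group']).transform('first')
                               )
                          )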

Output:

  group  treatment  result  result_group_stand
0     A    control       2                 1.0
1     A  high_dose       6                 3.0
2     A   low_dose       4                 2.0
3     B    control       3                 1.0
4     B  high_dose      12                 4.0
5     B   low_dose      15                 5.0
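
To make the intermediate steps easier to follow, here is a minimal sketch that splits both approaches into named pieces and checks that they agree on the sample data. The names is_control, control_by_group, control_per_row, scaled_map and scaled_transform are illustrative and not part of the original answer. It assumes each group has exactly one control row.

```python
import pandas as pd

X = pd.DataFrame({
    'group': ['A', 'A', 'A', 'B', 'B', 'B'],
    'treatment': ['control', 'high_dose', 'low_dose',
                  'control', 'high_dose', 'low_dose'],
    'result': [2, 6, 4, 3, 12, 15],
})

# Boolean mask marking the control row of each group.
is_control = X['treatment'].eq('control')

# Approach 1: build a group -> control result lookup, then map it onto each row.
control_by_group = X[is_control].set_index('group')['result']
scaled_map = X['result'].div(X['group'].map(control_by_group))

# Approach 2: blank out non-control results, then broadcast the first
# non-null value of each group to all rows of that group.
control_per_row = (X['result'].where(is_control)
                   .groupby(X['group']).transform('first'))
scaled_transform = X['result'].div(control_per_row)

print(control_by_group.to_dict())                       # {'A': 2, 'B': 3}
print(scaled_map.tolist())                              # [1.0, 3.0, 2.0, 1.0, 4.0, 5.0]
print(scaled_map.tolist() == scaled_transform.tolist()) # True
```

If a group could contain more than one control row, the lookup in the first approach would no longer have a unique index, so that case is better handled by aggregating the controls first or by using the groupby.transform version, which simply takes the first control value it finds.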

huangapple
  • Posted on June 1, 2023, 20:43:35
  • Please retain this link when reposting: https://go.coder-hub.com/76381996.html