Julia: Remove or add key from GroupedDataFrame
Question
Imagine I have a df on which I need to perform operations based on grouped columns, but I need to perform them for two different groupings. Having cols A, B, C, I need to apply operation x to df grouped by A, B and operation y to df grouped only by B. Do I need to group the dataframe twice?
Case 1
using DataFrames
df = DataFrame(rand(160, 3), :auto)
rename!(df, [:A, :B, :Z])
@. df.B = ifelse(rand() < 0.5, 1, 2)
@. df.A = ifelse(rand() < 0.5, 1, 2)
# I group here by A and B
gd = groupby(df, [:A, :B])
#=
My operations with df grouped by A and B.
... But now I need to perform only with B
=#
How can I remove key A? I am looking for something along these lines:
gd.removegroup([:A])
gd.removekey([:A])
gd.ungroup([:A])
Case 2
df = DataFrame(rand(160, 3), :auto)
rename!(df, [:A, :B, :Z])
@. df.B = ifelse(rand() < 0.5, 1, 2)
@. df.A = ifelse(rand() < 0.5, 1, 2)
# I group here by B
gd = groupby(df, [:B])
#=
My operations with df grouped by B.
... But now I need to perform with B and A
=#
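How can I add key A? Calling groupby on the grouped object fails (marked ❌), and I am looking for something along these lines:
groupby(gd, [:A])  # ❌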
gd.addkey([:A])
gd.addgroup([:A])
Answer 1
Score: 0
> Do I need to group the dataframe twice?
Yes. Grouping twice takes the same amount of code as adding or removing a grouping key would. Just do, for example:
gd1 = groupby(df, [:A, :B])
gd2 = groupby(df, :B)
Since a grouped data frame is a view of the source df, if you mutate gd1 the changes will automatically be reflected in gd2.
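To illustrate the view behaviour, here is a minimal sketch. The small integer data frame and the column names Z_sum_AB and Z_sum_B are made up for the example, and summing Z merely stands in for the unspecified operations x and y from the question.

```julia
using DataFrames

# Small deterministic frame (an assumption, replacing the random data in the question)
df = DataFrame(A = [1, 1, 2, 2, 1, 2],
               B = [1, 2, 1, 2, 1, 2],
               Z = [10, 20, 30, 40, 50, 60])

gd1 = groupby(df, [:A, :B])   # grouping used for operation x
gd2 = groupby(df, :B)         # grouping used for operation y

# Stand-in for operation x: per-(A, B) sum of Z, written back into df in place
transform!(gd1, :Z => sum => :Z_sum_AB)

# gd2 wraps the same df, so the new column is visible through it as well
has_new_col = :Z_sum_AB in propertynames(parent(gd2))

# Stand-in for operation y: per-B sum of Z, returned as a new data frame
res_y = combine(gd2, :Z => sum => :Z_sum_B)
```

Only a non-grouping column is added here, and the number of rows stays the same, so both groupings remain usable afterwards.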
The only thing you need to keep in mind is that when mutating df you should not mutate columns :A and :B, as mutating grouping columns could invalidate the groupings.
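If only the grouped object is at hand, the two cases from the question can be handled by regrouping its parent. This is a sketch rather than a dedicated add/remove-key API: it relies on parent and groupcols from DataFrames.jl, and the tiny data frame and the variable names are assumptions made for the example.

```julia
using DataFrames

# Illustrative data (an assumption, not the random frame from the question)
df = DataFrame(A = [1, 1, 2, 2], B = [1, 2, 1, 2], Z = [10, 20, 30, 40])

# Case 1: grouped by A and B, regroup by B only ("remove" key A)
gd_ab = groupby(df, [:A, :B])
gd_b = groupby(parent(gd_ab), setdiff(groupcols(gd_ab), [:A]))

# Case 2: grouped by B, regroup by B and A ("add" key A)
gd_b_only = groupby(df, :B)
gd_ba = groupby(parent(gd_b_only), vcat(groupcols(gd_b_only), :A))

groupcols(gd_b)    # [:B]
groupcols(gd_ba)   # [:B, :A]
```

Each call still performs a fresh grouping of the parent data frame, so this is a convenience for reusing the existing key list, not a way to avoid grouping twice.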