Mutate DataFrames in Julia

Question

I'm looking for a function that works like by but doesn't collapse my DataFrame. In R I would use dplyr's group_by(b) %>% mutate(x1 = sum(a)). I don't want to lose information from the table, such as the values in column :c.

mydf = DataFrame(a = 1:4, b = repeat(1:2,2), c=4:-1:1)
bypreserve(mydf, :b,  x -> sum(x.a))

Desired result:

│ Row │ a     │ b     │ c     │ x1    │
│     │ Int64 │ Int64 │ Int64 │ Int64 │
├─────┼───────┼───────┼───────┼───────┤
│ 1   │ 1     │ 1     │ 4     │ 4     │
│ 2   │ 2     │ 2     │ 3     │ 6     │
│ 3   │ 3     │ 1     │ 2     │ 4     │
│ 4   │ 4     │ 2     │ 1     │ 6     │

Answer 1

Score: 4

Adding this functionality has been discussed, but I expect it will take several months to ship. The general idea is to let select accept a groupby keyword argument and to add a transform function that works like select but preserves the columns of the source data frame.

For now, the solution is to use join after by:

join(mydf, by(mydf, :b, x1 = :a => sum), on=:b)
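
For readers on a newer DataFrames.jl, here is a minimal sketch of the same idea. It assumes DataFrames.jl 1.x, where transform accepts a grouped data frame and where by and join are no longer available (combine and innerjoin take their place). The variable names result, sums, and joined are illustrative only.

```julia
using DataFrames

mydf = DataFrame(a = 1:4, b = repeat(1:2, 2), c = 4:-1:1)

# Grouped transform: compute sum(a) within each group of b and
# write it to every row of that group, keeping columns a, b, and c.
result = transform(groupby(mydf, :b), :a => sum => :x1)

# The join workaround from the answer, spelled with the newer names:
# combine collapses to one row per group, innerjoin attaches the sums back.
sums = combine(groupby(mydf, :b), :a => sum => :x1)
joined = innerjoin(mydf, sums, on = :b)
```

The transform call keeps the rows in their original order. The join-based variant should not be relied on for row order, so sort by a key column if the order matters.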

Posted on 2020-01-06 23:54:34