Mutate DataFrames in Julia

Question

I'm looking for a function that works like by but doesn't collapse my DataFrame. In R I would use dplyr's group_by(b) %>% mutate(x1 = sum(a)). I don't want to lose information from the table, such as the values in variable :c.

    using DataFrames
    mydf = DataFrame(a = 1:4, b = repeat(1:2,2), c=4:-1:1)
    # bypreserve is the hypothetical function I am looking for
    bypreserve(mydf, :b, x -> sum(x.a))

The result would be:

    │ Row │ a     │ b     │ c     │ x1    │
    │     │ Int64 │ Int64 │ Int64 │ Int64 │
    ├─────┼───────┼───────┼───────┼───────┤
    │ 1   │ 1     │ 1     │ 4     │ 4     │
    │ 2   │ 2     │ 2     │ 3     │ 6     │
    │ 3   │ 3     │ 1     │ 2     │ 4     │
    │ 4   │ 4     │ 2     │ 1     │ 6     │
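
To make the requested semantics concrete, here is a minimal plain-Julia sketch of what such a function would compute for the x1 column: a per-group sum that is written back to every row instead of collapsing the table. The name grouped_sum_per_row is hypothetical and used only for illustration; it is not part of DataFrames.jl.

```julia
# Hypothetical helper (not a DataFrames.jl function): for each row, return the
# sum of `values` over all rows that share that row's group key.
function grouped_sum_per_row(values::AbstractVector, groups::AbstractVector)
    # First pass: accumulate one total per distinct group key.
    totals = Dict{eltype(groups), eltype(values)}()
    for (v, g) in zip(values, groups)
        totals[g] = get(totals, g, zero(v)) + v
    end
    # Second pass: look up each row's group total, keeping the original row order.
    return [totals[g] for g in groups]
end

a = 1:4
b = repeat(1:2, 2)
x1 = grouped_sum_per_row(a, b)
println(x1)  # prints [4, 6, 4, 6]
```

The output has one entry per input row, so it can sit next to a, b and c without dropping any of them.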

Answer 1

Score: 4

Adding this functionality is under discussion, but I expect it will take several months to ship. The general idea is to allow select to take a groupby keyword argument, and also to add a transform function that works like select but preserves the columns of the source data frame.

For now, the solution is to use join after by:

    join(mydf, by(mydf, :b, x1 = :a => sum), on=:b)
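
For readers on newer versions: to my knowledge, later DataFrames.jl releases (0.21 onward) did ship transform, and by and join were later replaced by combine(groupby(...)) and by innerjoin/leftjoin. The sketch below assumes a recent 1.x release and shows both the grouped transform and the same aggregate-then-join idea written with the newer function names. The variable names with_transform, group_sums and with_join are illustrative only.

```julia
using DataFrames

mydf = DataFrame(a = 1:4, b = repeat(1:2, 2), c = 4:-1:1)

# Grouped transform: adds x1 while keeping every original row and column.
with_transform = transform(groupby(mydf, :b), :a => sum => :x1)

# Aggregate-then-join, as in the answer, using the newer function names.
group_sums = combine(groupby(mydf, :b), :a => sum => :x1)  # one row per group
with_join = leftjoin(mydf, group_sums, on = :b)            # copy x1 back to each row
```

On recent releases, transform returns rows in the order of the source data frame. The row order of a join result may differ between versions, so sort it (for example with sort(with_join, :a)) if the order matters.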

huangapple
  • Posted on January 6, 2020, 23:54:34
  • When reposting, please retain the link to this article: https://go.coder-hub.com/59615138.html