R data.table dynamic column name of group by returning new table

Question

By default, a group-by operation on a data.table returns a new data.table with an automatically named column, V1:

library(data.table)
dt <- data.table(a = sample(1:100, 100), b = sample(1:100, 100), id = rep(1:10, 10))
dt[, mean(a), by = id]

#     id V1
# 1:  1 48.2
# 2:  2 47.9
# 3:  3 46.8
# 4:  4 54.7
# 5:  5 63.7
# 6:  6 50.6
# 7:  7 43.3
# 8:  8 52.7
# 9:  9 45.4
# 10: 10 51.7

Following this post, I can set the name of the result column like so:

dt[, list(mean = mean(a)), by = id]

Is it possible to use a variable for the column name? E.g., instead of setting mean explicitly, I would like to do something like:

column_name <- "mean"
dt[, list(column_name = mean(a)), by = id]  # resulting column name is column_name (and not mean)

Answer 1

Score: 1

We can use setNames() to name the list returned in j from the variable:

library(data.table)
dt[, setNames(list(mean(a)), column_name), by = id]

#    id mean
# 1:  1 56.8
# 2:  2 50.5
# 3:  3 50.5
# 4:  4 42.4
# 5:  5 49.9
# 6:  6 47.8
# 7:  7 60.6
# 8:  8 57.4
# 9:  9 54.6
#10: 10 34.5

Data

set.seed(123)
dt <- data.table(a = sample(1:100, 100), b = sample(1:100, 100), id = rep(1:10, 10))
column_name <- "mean"
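The same idea should carry over to more than one result column. Below is a minimal sketch, assuming a character vector of names; `column_names` and the names `mean_a` and `sd_a` are hypothetical and not part of the original question.

```r
library(data.table)

set.seed(123)
dt <- data.table(a = sample(1:100, 100), b = sample(1:100, 100), id = rep(1:10, 10))

# Hypothetical names, chosen only for illustration
column_names <- c("mean_a", "sd_a")

# setNames() attaches the names to the list that j returns for each group,
# so the result columns are named from the variable rather than V1, V2
res <- dt[, setNames(list(mean(a), sd(a)), column_names), by = id]
print(names(res))
```

The names vector has to be the same length as the list built in j; otherwise some columns end up with missing names.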

Answer 2

Score: 1

We can use setnames() from data.table to rename the default V1 column after grouping:

library(data.table)
setnames(dt[, .(mean(a)), by = id], 'V1', column_name)[]
#    id mean
# 1:  1 56.8
# 2:  2 50.5
# 3:  3 50.5
# 4:  4 42.4
# 5:  5 49.9
# 6:  6 47.8
# 7:  7 60.6
# 8:  8 57.4
# 9:  9 54.6
#10: 10 34.5

Data

set.seed(123)
dt <- data.table(a = sample(1:100, 100), b = sample(1:100, 100), id = rep(1:10, 10))
column_name <- "mean"
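Since setnames() renames by reference, it may help to see that it only touches the aggregated table here, not the original dt. This is a small sketch under that assumption, extended to two columns; `column_names`, `mean_a`, and `mean_b` are hypothetical names used for illustration.

```r
library(data.table)

set.seed(123)
dt <- data.table(a = sample(1:100, 100), b = sample(1:100, 100), id = rep(1:10, 10))

# Hypothetical target names, chosen only for illustration
column_names <- c("mean_a", "mean_b")

# Unnamed expressions in .() get the default names V1 and V2
res <- dt[, .(mean(a), mean(b)), by = id]

# Rename the defaults in place; res is a new table, so dt is not affected
setnames(res, c("V1", "V2"), column_names)

print(names(res))
print(names(dt))
```

This keeps the grouping expression free of naming logic, at the cost of relying on the default V1, V2 names that data.table assigns.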

Answer 3

Score: 1

For the sake of completeness, you could also use a loop that returns a named list. For example, using Map(), which takes the result names from its first argument when that argument is a character vector:

dt[
  , Map(
    function(i) {
      mean(a)
    }
    , i = "Mean"
  )
  , by = id
]

Or for 2+ function calls/columns:

dt[
  , Map(
    function(i, fun) {
      do.call(
        fun
        , list(a)
      )
    }
    , i = c("Mean", "SD")
    , fun = c(mean, sd)
  )
  , by = id
]
#     id Mean       SD
#  1:  1 56.8 29.23012
#  2:  2 50.5 26.18842
#  3:  3 50.5 24.82047
#  4:  4 42.4 34.72495
#  5:  5 49.9 26.99979
#  6:  6 47.8 28.35411
#  7:  7 60.6 31.52142
#  8:  8 57.4 32.22904
#  9:  9 54.6 27.90141
# 10: 10 34.5 30.94529
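A slightly shorter variant of the same loop idea is to iterate over a named list of functions with lapply(), which keeps the list names as column names. This is only a sketch; `column_names` and `funs` are hypothetical helper names, not something from the answer above.

```r
library(data.table)

set.seed(123)
dt <- data.table(a = sample(1:100, 100), b = sample(1:100, 100), id = rep(1:10, 10))

# Hypothetical names held in a variable, as in the question
column_names <- c("Mean", "SD")

# Named list of functions; the list names become the result column names
funs <- setNames(list(mean, sd), column_names)

# lapply() preserves the names of funs, so j returns a named list per group
res <- dt[, lapply(funs, function(f) f(a)), by = id]
print(names(res))
```

Compared with Map(), this avoids passing the names and the functions as two parallel vectors that have to stay in sync.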

Posted by huangapple on January 3, 2020 at 17:44:16
When reposting, please keep the link to this article: https://go.coder-hub.com/59576235.html