Return multiple rows per group in data.table
Question
Is it possible to return multiple rows per group in a grouped command in data.table? In dplyr, this is done with reframe:
library(dplyr)

y <- c("a", "b", "d", "f")
df <- tibble(
  g = c(1, 1, 1, 2, 2, 2, 2),
  x = c("e", "a", "b", "e", "f", "c", "a")
)

# Keep, per group, the values of x that are not in y
df %>%
  reframe(x = setdiff(x, y), .by = g)
# g x
# 1 e
# 2 e
# 2 c
In data.table, this returns an error:
library(data.table)
dt <- setDT(df)
dt[, x := setdiff(x, y), g]
> Error in `[.data.table`(df, , `:=`(x, intersect(x, y)), g) :
> Supplied 2 items to be assigned to group 1 of size 3 in column 'x'.
> The RHS length must either be 1 (single values are ok) or match the
> LHS length exactly. If you wish to 'recycle' the RHS please use rep()
> explicitly to make this intent clear to readers of your code.
Is there any way to get a data.table equivalent of reframe?
Answer 1
Score: 6
Wrap in .(...) and use = in place of := (because it's within .(..)).
as.data.table(df)[, .(x = setdiff(x, y)), by = g]
#        g      x
#    <num> <char>
# 1:     1      e
# 2:     2      e
# 3:     2      c
Note that under the hood, .(.) is really just list(.), so we could also use anything that returns list-like objects, including:
as.data.table(df)[, list(x = setdiff(x, y)), by = g]
as.data.table(df)[, data.table(x = setdiff(x, y)), by = g]
as.data.table(df)[, data.frame(x = setdiff(x, y)), by = g]
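To check that these spellings really are interchangeable, here is a minimal self-contained sketch. It rebuilds the question's data directly as a data.table, and the res_* names are only illustrative. It assumes R 4.0 or later, where data.frame() no longer converts strings to factors by default; on older versions the data.frame variant would likely return a factor column instead.

```r
library(data.table)

# Same toy data as in the question, built directly as a data.table
y  <- c("a", "b", "d", "f")
dt <- data.table(
  g = c(1, 1, 1, 2, 2, 2, 2),
  x = c("e", "a", "b", "e", "f", "c", "a")
)

# j returns a list-like object per group, so each group may
# contribute any number of rows to the result
res_dot  <- dt[, .(x = setdiff(x, y)), by = g]
res_list <- dt[, list(x = setdiff(x, y)), by = g]
res_dt   <- dt[, data.table(x = setdiff(x, y)), by = g]
res_df   <- dt[, data.frame(x = setdiff(x, y)), by = g]

# Compare the columns of each variant against the .() version
stopifnot(
  identical(res_dot$g, res_list$g), identical(res_dot$x, res_list$x),
  identical(res_dot$g, res_dt$g),   identical(res_dot$x, res_dt$x),
  identical(res_dot$g, res_df$g),   identical(res_dot$x, res_df$x)
)

# Unlike :=, none of these modify dt; each builds a new table
nrow(dt)       # 7
nrow(res_dot)  # 3
```

This is also why the := attempt fails: := assigns into the existing rows of dt, so the right-hand side must have length 1 or match the group size, whereas a list in j creates a new table whose row count per group is free.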