Use Value in Column as Upper Limit of Summation to Create a New Column

Question

I'm trying to create a new column in an R data.table that holds the sum of the integers from 1 to the value in a different column. Here's an example:

| x (existing column) | y (new column) |
|---|---|
| 2 | 3 |
| 3 | 6 |
| 4 | 10 |
| 5 | 15 |

In the first row of the table we get y by summing 1 through 2 (1+2=3), in the second we sum 1 through 3 (1+2+3=6), and so on. In the real problem the values in x can be any integer (i.e. they're not in order and not predictable).

I've tried a few different things, starting with the very naive approach, `dt[,y:=sum(c(1:x))]`, which, unsurprisingly, gives warnings about only using some elements of x, since it's (I believe) trying to use x as a whole column instead of using the value of x in each row. I tried using `lapply` to work on each row, but I couldn't get that to work either.

The obvious solution, which I've implemented for now, is to loop through each row, in which case the problem is trivial. However, I feel like there must be a more streamlined data.table solution somewhere.

Answer 1

Score: 2

What you are trying to do is effectively `1:(2:5)`, but `:` expects a single number on each side, so it just takes the first element (here, 2).

Here are alternatives that should work:

    # Apply sum(1:y) to each element of x in turn
    dt[, y := sapply(x, \(y) sum(1:y))]

    # Closed form for 1 + 2 + ... + x, fully vectorised
    dt[, y := (x*(x+1L)) %/% 2L]

Reproducible data:

    library(data.table)
    dt <- data.table(x = 2L:5L)
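As a quick sanity check of the two alternatives above, here is a minimal base R sketch (no data.table required). The helper names `sum_to_loop` and `sum_to_closed` are made up for illustration. Two deliberate deviations from the answer's code, both assumptions on my part about what is wanted: the loop uses `seq_len(n)` instead of `1:n`, so an x of 0 gives 0 rather than `sum(1:0)`, which is 1; and the closed form works in double precision, because `x * (x + 1L)` in integer arithmetic overflows to `NA` once x exceeds 46340.

```r
# Hypothetical helper: sum 1..n separately for each element of x.
sum_to_loop <- function(x) {
  vapply(x, function(n) sum(as.numeric(seq_len(n))), numeric(1))
}

# Hypothetical helper: closed form n * (n + 1) / 2, computed in double
# precision to sidestep integer overflow for large n.
sum_to_closed <- function(x) {
  x <- as.numeric(x)
  x * (x + 1) / 2
}

# Values deliberately out of order, as in the real problem.
x <- c(5L, 2L, 4L, 3L)

sum_to_loop(x)    # 15 3 10 6
sum_to_closed(x)  # 15 3 10 6

# Both approaches agree element by element.
stopifnot(isTRUE(all.equal(sum_to_loop(x), sum_to_closed(x))))

# The naive expression only looks at x[1] (with a warning), so it
# produces a single number that would be recycled down the whole column.
naive <- suppressWarnings(sum(1:x))
naive             # 15
```

With the data.table from the answer, the closed-form helper could be used as `dt[, y := sum_to_closed(x)]`; note that the resulting column is double rather than integer.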

Answer 2

Score: 0

A tidyverse solution:

    library(tidyverse)

    df <- tibble(x = 2:5)

    df %>%
      mutate(y = map_dbl(x, ~sum(1:.x)))

    #> # A tibble: 4 × 2
    #>       x     y
    #>   <int> <dbl>
    #> 1     2     3
    #> 2     3     6
    #> 3     4    10
    #> 4     5    15
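`map_dbl()` calls the function once per row. Since the sum 1 + 2 + ... + x has the closed form x(x + 1)/2, the same column can also be computed with plain vectorised arithmetic inside `mutate()`. The following is only a sketch, assuming the dplyr and tibble packages are installed; the name `df_closed` is made up for illustration.

```r
library(dplyr)
library(tibble)

df <- tibble(x = 2:5)

# x is integer, but x + 1 is double, so y comes out as a double column,
# matching the <dbl> type that map_dbl() returns.
df_closed <- df %>%
  mutate(y = x * (x + 1) / 2)

df_closed$y  # 3 6 10 15
```

For a handful of rows the difference is negligible; the vectorised form mainly pays off on large tables, where it avoids one function call per row.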

huangapple
  • Published on July 4, 2023 at 22:03:22
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76613441.html