How to consecutively concatenate a vector of integers in R

Question

    vec <- c(1, 3, 2, 37)

I want to consecutively concatenate this vector so that the output looks something like this:

    > output
    [[1]]
    [1] 1

    [[2]]
    [1] 1 3

    [[3]]
    [1] 1 3 2

    [[4]]
    [1] 1 3 2 37

I wrote a function to do this, but it didn't give me the correct output:

    myfun <- function(vec){
      output = vector("list", length(vec))
      output[[1]] = vec[1]
      for(i in 2:length(vec)){
        output[[i]] = paste(output[[i - 1]], vec[i])
        output[[i]] = as.numeric(strsplit(output[[i]], " ")[[1]])
      }
      return(output)
    }
    > myfun(c(1, 3, 2, 37))
    [[1]]
    [1] 1

    [[2]]
    [1] 1 3

    [[3]]
    [1] 1 2

    [[4]]
    [1]  1 37

Answer 1

Score: 5

A direct way to do this is `Reduce`:

    Reduce(f = c, x = vec, accumulate = TRUE)

The purrr package has an `accumulate` function that accomplishes the same thing:

    purrr::accumulate(.x = vec, .f = c, .simplify = FALSE)

(Edited to incorporate the comment suggesting `c()` as the function, which is much simpler.)
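
As a rough sketch of why the loop in the question goes wrong and how `Reduce` relates to it: `paste()` is vectorised, so once `output[[i - 1]]` has more than one element it returns several strings, and the `[[1]]` after `strsplit()` keeps only the first of them. Growing the vector with `c()` directly avoids the round trip through strings. The function name `prefixes_loop` below is hypothetical, chosen only for illustration.

```r
vec <- c(1, 3, 2, 37)

# Hypothetical helper: the question's loop, with paste()/strsplit() replaced by c()
prefixes_loop <- function(vec) {
  output <- vector("list", length(vec))
  for (i in seq_along(vec)) {
    # for i == 1 there is no previous prefix, so c(NULL, vec[1]) is just vec[1]
    previous <- if (i > 1) output[[i - 1]] else NULL
    output[[i]] <- c(previous, vec[i])
  }
  output
}

# Reduce() with accumulate = TRUE keeps every intermediate result of c()
prefixes_reduce <- Reduce(f = c, x = vec, accumulate = TRUE)

# compare the two results
identical(prefixes_loop(vec), prefixes_reduce)
```

Using `seq_along(vec)` in the loop also means a zero-length input simply yields an empty list instead of an error.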

Answer 2

Score: 4

We can use `lapply` and `head` (or `[`) for this:

    lapply(seq_along(vec), head, x = vec)
    # [[1]]
    # [1] 1
    # [[2]]
    # [1] 1 3
    # [[3]]
    # [1] 1 3 2
    # [[4]]
    # [1]  1  3  2 37

  • `seq_along(vec)` is analogous to `1:length(vec)` (which also works here, but `seq_along` is safer in corner cases such as a zero-length vector);
  • when `lapply` is given a function, it normally calls it once per element, with that element as the first argument (here 1 through 4, consecutively); since we pass `x = vec` by name (and `x` is the first argument of `head`), `lapply` supplies the number as the next argument of `head`, which happens to be `n`.

We could also have done `lapply(seq_along(vec), function(z) vec[1:z])`.
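
The corner case mentioned in the first bullet shows up with a zero-length input. A small sketch, assuming an empty numeric vector named `empty` (a name introduced here only for illustration), followed by a check that the named-argument form of `head` matches an explicit anonymous function:

```r
empty <- numeric(0)

seq_along(empty)   # zero-length integer vector, so lapply() does no iterations
1:length(empty)    # this is 1:0, i.e. the two values 1 and 0

safe   <- lapply(seq_along(empty), function(z) empty[1:z])
unsafe <- lapply(1:length(empty), function(z) empty[1:z])

length(safe)    # 0
length(unsafe)  # 2, spurious elements built from out-of-range indices

# With x = vec fixed by name, each index from lapply() is passed to head() as n
vec <- c(1, 3, 2, 37)
same <- identical(
  lapply(seq_along(vec), head, x = vec),
  lapply(seq_along(vec), function(n) head(x = vec, n = n))
)
same
```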


Edit: the latter (the `vec[1:z]` implementation) is significantly faster than using `head`, roughly four times by median time in the benchmark below; I should have known that.

    bench::mark(
      a1 = lapply(seq_along(vec), head, x = vec),
      a2 = lapply(seq_along(vec), function(z) vec[1:z]),
      a3 = lapply(1:length(vec), function(z) vec[1:z]),
      b  = Reduce(f = c, x = vec, accumulate = TRUE),
      iterations = 100000)
    # # A tibble: 4 × 13
    #   expression      min   median `itr/sec` mem_alloc `gc/sec` n_itr  n_gc total_time result     memory     time      
    #   <bch:expr> <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl> <int> <dbl>   <bch:tm> <list>     <list>     <list>    
    # 1 a1          11.51µs  16.14µs    53800.        0B     3.23 99994     6      1.86s <list [4]> <Rprofmem> <bench_tm>
    # 2 a2           2.95µs   4.06µs   215605.        0B     4.31 99998     2    463.8ms <list [4]> <Rprofmem> <bench_tm>
    # 3 a3           3.04µs   4.01µs   221482.        0B     2.21 99999     1    451.5ms <list [4]> <Rprofmem> <bench_tm>
    # 4 b            3.42µs    4.3µs   209810.        0B     4.20 99998     2   476.61ms <list [4]> <Rprofmem> <bench_tm>
    # # ℹ 1 more variable: gc <list>

Answer 3

Score: 1

I cannot think of better solutions than Reduce (by @joran) or lapply (by @r2evans), which are already sufficiently efficient and concise.


Here is another base R option, just for fun:

    > split(vec[(k <- sequence(seq_along(vec)))], cumsum(k == 1))
    $`1`
    [1] 1

    $`2`
    [1] 1 3

    $`3`
    [1] 1 3 2

    $`4`
    [1]  1  3  2 37
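
To see why that one-liner works, it can help to look at the intermediate vectors. A sketch follows; `k` matches the original one-liner, while `grp` and `res` are names introduced here only for illustration.

```r
vec <- c(1, 3, 2, 37)

# sequence(1:4) concatenates 1:1, 1:2, 1:3 and 1:4
k <- sequence(seq_along(vec))
k
# [1] 1 1 2 1 2 3 1 2 3 4

# a new group starts wherever k restarts at 1
grp <- cumsum(k == 1)
grp
# [1] 1 2 2 3 3 3 4 4 4 4

# vec[k] lays the prefixes out back to back; split() cuts them apart by group
res <- split(vec[k], grp)

# split() names the pieces "1".."4"; unname() drops the names if an unnamed list is wanted
unname(res)
```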

Answer 4

Score: 0

If the reason for doing this is to iterate over the list later, for example to take sums:

    L <- list(vec[1], vec[1:2], vec[1:3], vec[1:4])
    sapply(L, sum)
    ## [1]  1  4  6 43

then we can avoid creating `L` in the first place by using `rollapplyr` from the zoo package:

    library(zoo)

    rollapplyr(vec, seq_along(vec), sum)  # same result, but no intermediate L
    ## [1]  1  4  6 43

Note

`vec` is taken from the question:

    vec <- c(1, 3, 2, 37)
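
If only running sums are needed, base R's `cumsum()` gives the same numbers without zoo. For other summaries, a plain `sapply()` over prefix lengths is one possible fallback. A minimal sketch, using the same `vec`; the choice of `max` as the second summary is only an example:

```r
vec <- c(1, 3, 2, 37)

# running sum of each prefix
running_sum <- cumsum(vec)
running_sum
# [1]  1  4  6 43

# generic fallback for an arbitrary summary function (here: max of each prefix)
running_max <- sapply(seq_along(vec), function(i) max(vec[seq_len(i)]))
running_max
# [1]  1  3  3 37
```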

Answer 5

Score: 0

Another base R approach, using a matrix, just for completeness:

    mat <- matrix(vec, nrow = length(vec), ncol = length(vec), byrow = TRUE)
    mat[upper.tri(mat)] <- NA
    apply(mat, 1, \(x) as.vector(na.omit(x)))
    # [[1]]
    # [1] 1
    #
    # [[2]]
    # [1] 1 3
    #
    # [[3]]
    # [1] 1 3 2
    #
    # [[4]]
    # [1]  1  3  2 37
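
One caveat with this approach: because `NA` is used as the filler, `na.omit()` would also drop any genuine `NA` values in `vec`. A possible variation, not part of the original answer, indexes each row by position instead, so no filler is needed. The names `n` and `res` are introduced here for illustration.

```r
vec <- c(1, 3, 2, 37)
n <- length(vec)

# every row of the matrix is a full copy of vec
mat <- matrix(vec, nrow = n, ncol = n, byrow = TRUE)

# take the first i entries of row i; no NA filler is involved
res <- lapply(seq_len(n), function(i) mat[i, seq_len(i)])
res[[3]]
# [1] 1 3 2
```

Because `lapply()` always returns a list, this also sidesteps `apply()` simplifying its result when all rows happen to yield vectors of the same length.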

huangapple
  • This article was published on June 5, 2023 at 04:25:22
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76402276.html