sapply won't simplify to vector

Question

I can't get `sapply()` to simplify my `list` to a `vector`, which is what I thought it was supposed to do. Is this expected behavior, and if so, what alternative do I have to end up with a `vector`?

``` r
# sample list
l1 <- list(1:2, 3)

# returns list not vector
sapply(l1, cumsum, simplify = 'vector')
#> [[1]]
#> [1] 1 3
#> 
#> [[2]]
#> [1] 3
```

<sup>Created on 2023-06-28 with reprex v2.0.2</sup>

Answer 1

Score: 0

This is expected and documented behavior. In your example, `cumsum()` returns values of different lengths, which cannot be simplified into a single vector.
Per the documentation, `sapply()` only simplifies if the return values for all elements have the same length.

From the 'Details' section of `?sapply`:

> Simplification in sapply is only attempted if X has length greater than zero and if the return values from all elements of X are all of the same (positive) length.

``` r
# elements of different length
l1 <- list(1:2, 3)

# all elements same length
l2 <- list(1, 2, 3)

# cannot return vector if different lengths
sapply(l1, cumsum)
#> [[1]]
#> [1] 1 3
#> 
#> [[2]]
#> [1] 3

# returns vector
sapply(l2, cumsum)
#> [1] 1 2 3
```

<sup>Created on 2023-07-05 with reprex v2.0.2</sup>
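
To illustrate the equal-length rule quoted above, here is a small sketch (the list `l3` is a hypothetical addition, not part of the original example). When every return value has length 1 the result is a vector, but when the return values share a common length greater than 1, `sapply()` returns a matrix with one column per element rather than a vector.

``` r
# hypothetical list whose elements all have length 2
l3 <- list(1:2, 3:4)

# each cumsum() result has length 2, so the results become matrix columns
m <- sapply(l3, cumsum)
is.matrix(m)
#> [1] TRUE
dim(m)
#> [1] 2 2

# flatten column by column if a plain vector is wanted
as.vector(m)
#> [1] 1 3 3 7
```

So even when simplification succeeds, an explicit `as.vector()` or `unlist()` step may still be needed to end up with a flat vector.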

In this case you can just use `unlist()` to convert the output to a vector.

``` r
sapply(l1, cumsum) |> unlist()
#> [1] 1 3 3
```

<sup>Created on 2023-07-05 with reprex v2.0.2</sup>

Furthermore, `sapply()` is essentially `lapply(<list>, cumsum) |> simplify2array()` under the hood. When simplification is not possible, the `simplify2array()` call only adds a slight overhead, so if the goal is a flat vector, calling `lapply(<list>, cumsum) |> unlist()` directly is more efficient.
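
A quick way to check this, as a sketch: compare the `sapply()` result with the two-step pipeline. `simplify2array()` is an exported base function. Note that `sapply()` calls it with `higher = FALSE` by default, which should make no difference here because the lengths differ and the list comes back unchanged.

``` r
l1 <- list(1:2, 3)

# both return the same unsimplified list
identical(sapply(l1, cumsum), lapply(l1, cumsum) |> simplify2array())
#> [1] TRUE

# skip the simplification attempt and flatten directly
lapply(l1, cumsum) |> unlist()
#> [1] 1 3 3
```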

Finally, the documentation isn't completely explicit about the acceptable character values of the `simplify` argument:

> `simplify`: logical or character string; should the result be simplified to a vector, matrix or higher dimensional array if possible? For `sapply` it must be named and not abbreviated. The default value, `TRUE`, returns a vector or matrix if appropriate, whereas if `simplify = "array"` the result may be an array of "rank" (=`length(dim(.))`) one higher than the result of `FUN(X[[i]])`.

I don't think any character input for `simplify` has any meaning other than `"array"`. I don't think I've ever seen anything used there besides `TRUE`, `FALSE` or `"array"`.
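
As a sketch of how these values differ (the function `f` below is a hypothetical helper that returns a 2 x 2 matrix): in current R versions, any string other than `"array"` appears to behave like `TRUE`, which is consistent with the output in the question. Treat this as an observation about current behavior rather than something the documentation promises.

``` r
# hypothetical helper: each call returns a 2 x 2 matrix
f <- function(i) matrix(i, nrow = 2, ncol = 2)

# default: each matrix is flattened into one column of a 4 x 3 matrix
dim(sapply(1:3, f))
#> [1] 4 3

# "array": the result gains one dimension instead
dim(sapply(1:3, f, simplify = "array"))
#> [1] 2 2 3

# an arbitrary string such as "vector" gives the same result as TRUE
identical(sapply(1:3, f, simplify = "vector"), sapply(1:3, f))
#> [1] TRUE
```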

huangapple
  • Posted on 2023-06-29 10:26:28
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76577714.html