data.table ordering gives unexpected results inconsistent with base R

Question

I have noticed some odd behaviour with data.table. It seems to involve the use of backticks (`). I have prepared a simple example here:

library(data.table)

data.table(v1 = c("2", "1", "10"))[order(v1)]
#>    v1
#> 1:  1
#> 2: 10
#> 3:  2

data.table(v1 = c("`2`", "`1`", "`10`"))[order(v1)]
#>      v1
#> 1: `10`
#> 2:  `1`
#> 3:  `2`

c("2", "1", "10")[order(c("2", "1", "10"))]
#> [1] "1"  "10" "2"

c("`2`", "`1`", "`10`")[order(c("`2`", "`1`", "`10`"))]
#> [1] "`1`"  "`10`" "`2`"

Created on 2023-05-25 with reprex v2.0.2

As you can see, adding backticks changes the order of the rows in data.table, whereas base R does not show this behaviour (it gives the expected result).

Is this a bug or a feature?

Answer 1

Score: 1

This is just a difference between base::order and data.table::order. The manual page ?data.table::order says that they differ:

> Using C-locale makes the behaviour of sorting in data.table more consistent across sessions and locales. The behaviour of base::order depends on assumptions about the locale of the R session.

data.table(v1 = c("`2`", "`1`", "`10`"))[ order(v1) ]
#      v1
# 1: `10`
# 2:  `1`
# 3:  `2`

data.table(v1 = c("`2`", "`1`", "`10`"))[ base::order(v1) ]
#      v1
# 1:  `1`
# 2: `10`
# 3:  `2`
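
To see why the backticks matter, it may help to reproduce C-locale ordering in base R alone. The sketch below is not part of the original answer. It assumes that base::order(method = "radix"), which is documented to collate character vectors in the C locale, is a fair stand-in for what data.table does. In the C locale, strings are compared byte by byte, and the backtick (code 96) sorts after the digit "0" (code 48), so "`10`" ends up before "`1`". The variable names are illustrative only.

```r
x <- c("`2`", "`1`", "`10`")

# Byte values that decide the comparison of "`1`" and "`10`" at position 3
utf8ToInt("`")  # 96
utf8ToInt("0")  # 48

# C-locale (byte-wise) ordering; assumed here to mirror data.table's order()
c_sorted <- x[order(x, method = "radix")]
print(c_sorted)
#> [1] "`10`" "`1`"  "`2`"

# Without backticks, "1" is a prefix of "10", so it also sorts first
# under byte-wise comparison and the two approaches agree
y <- c("2", "1", "10")
plain_sorted <- y[order(y, method = "radix")]
print(plain_sorted)
#> [1] "1"  "10" "2"
```

If locale-aware ordering is what you want inside a data.table, calling base::order(v1) explicitly, as in the second call above, is the route this answer shows.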

huangapple
  • Posted on 2023-05-25 15:51:32
  • When reposting, please keep the link to this article: https://go.coder-hub.com/76330018.html