What is the difference between 'rexp(1000, 1)' and 'replicate(1000, rexp(1, 1))' in R?

Question

I am trying to generate 1000 numbers from an exponential distribution with rate parameter 1.

After setting the seed value to 1, I tried both rexp(1000, 1) and replicate(1000, rexp(1, 1)), but the medians of the resulting two vectors are different.

I expected the vectors generated by the two expressions to be the same, because they were both sampled from the same exponential distribution under the same seed value.

What is the difference between rexp(1000, 1) and replicate(1000, rexp(1, 1))? Which should I use in practice?

Here is the code that I tried:

> options(digits = 2)
> set.seed(1)
>
> a <- rexp(1000, 1)
> b <- replicate(1000, rexp(1, 1))
>
> median(a)
[1] 0.73
> median(b)
[1] 0.68

Answer 1

Score: 5

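The problem is that the state of the random number generator changes every time it is used. set.seed(1) only fixes the starting state; by the time you generate b, the generator has already advanced past the 1000 draws used for a, so b receives the next 1000 values in the stream rather than the same ones. If you want b to be the same as a, you have to reset the seed before you create b: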

set.seed(1)

a <- rexp(1000, 1)

set.seed(1)

b <- replicate(1000, rexp(1, 1))

median(a)
#> [1] 0.7346113

median(b)
#> [1] 0.7346113
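
A minimal sketch of why the original medians differed, assuming R's default random number generator settings. The names stream and b_reset are illustrative and not part of the original answer. The idea is to draw 2000 values in one go and compare them with what a and b receive when the seed is set only once, and then to check that resetting the seed makes the two approaches agree element for element, not just in their median.

```r
# Reference stream: the first 2000 exponential draws after seeding with 1.
set.seed(1)
stream <- rexp(2000, 1)

# Same setup as in the question: the seed is set once, then a and b are drawn.
set.seed(1)
a <- rexp(1000, 1)                # uses the first 1000 draws of the stream
b <- replicate(1000, rexp(1, 1))  # continues where a left off

# Resetting the seed before the replicate() call, as suggested above.
set.seed(1)
b_reset <- replicate(1000, rexp(1, 1))

identical(a, stream[1:1000])      # a is the first half of the stream
identical(b, stream[1001:2000])   # b is the second half, hence a different median
identical(a, b_reset)             # with a reset seed the two approaches agree
```

If the explanation above holds, all three comparisons should return TRUE.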

As for which you should use, it is definitely rexp(1000, 1): it makes a single call into the underlying C code, whereas replicate(1000, rexp(1, 1)) makes 1000 separate calls from R. As shown above, both produce the same values when started from the same seed, but a simple benchmark shows rexp(1000, 1) to be about 50 times faster (a median of roughly 34 versus 1570 microseconds).

microbenchmark::microbenchmark(a = rexp(1000, 1), 
                               b = replicate(1000, rexp(1, 1)))
#> Unit: microseconds
#>  expr      min        lq       mean   median       uq       max neval cld
#>     a   32.501   33.5005   34.54794   34.101   34.701    42.301   100  a 
#>     b 1503.402 1539.0010 2043.20113 1569.451 1646.901 10051.202   100   b
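
If the microbenchmark package is not installed, a rough comparison can be sketched with base R's system.time instead. This is only an approximation of the benchmark above: the repetition count n_reps is a hypothetical value chosen to make the timings measurable, and the absolute numbers will vary by machine.

```r
# Hypothetical repetition count; increase it if both timings come out as 0.
n_reps <- 200

# Time n_reps vectorised calls, each drawing 1000 values at once.
t_vec <- system.time(
  for (i in seq_len(n_reps)) rexp(1000, 1)
)

# Time n_reps replicate() calls, each making 1000 separate rexp(1, 1) calls.
t_rep <- system.time(
  for (i in seq_len(n_reps)) replicate(1000, rexp(1, 1))
)

t_vec[["elapsed"]]
t_rep[["elapsed"]]
```

The elapsed time for the replicate() version would be expected to be noticeably larger, in line with the microbenchmark output, although system.time is much coarser.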

Created on 2023-02-27 with reprex v2.0.2

huangapple
  • Posted on 2023-02-27 19:31:43
  • When reposting, please keep the link to this article: https://go.coder-hub.com/75579914.html