R: Verifying The Results Of Coin Flips

Question

I am working with the R programming language.

Suppose I have the following problem:

  • There is a coin for which, if a flip lands heads, the probability that the next flip is heads is 0.6 (and if it lands tails, the probability that the next flip is tails is also 0.6)
  • There are 100 students in a class
  • Each student flips this coin a random number of times
  • The last flip of student_n does not influence the first flip of student_n+1 (i.e. when the next student flips the coin, the first flip has 0.5 probability of heads or tails, but the next flip for this student depends on the previous flip)

I am trying to write R code to represent this problem.

First I defined the variables:

library(dplyr)
library(stringr)

# generate data
set.seed(123)
ids <- 1:100
student_id <- sample(ids, 100000, replace = TRUE)
coin_result <- character(1000)
coin_result[1] <- sample(c("H", "T"), 1)

Next, I tried to write the flipping process:

for (i in 2:length(coin_result)) {
  if (student_id[i] != student_id[i-1]) {
    coin_result[i] <- sample(c("H", "T"), 1)
  } else if (coin_result[i-1] == "H") {
    coin_result[i] <- sample(c("H", "T"), 1, prob = c(0.6, 0.4))
  } else {
    coin_result[i] <- sample(c("H", "T"), 1, prob = c(0.4, 0.6))
  }
}

# tidy up
my_data <- data.frame(student_id, coin_result)
my_data <- my_data[order(my_data$student_id),]

Finally, I tried to verify the results:

my_data %>%
  group_by(student_id) %>%
  summarize(Sequence = str_c(coin_result, lead(coin_result)), .groups = 'drop') %>%
  filter(!is.na(Sequence)) %>%
  count(Sequence)

Even though the code ran, I don't think my code is correct - when I look at the results:

# A tibble: 4 x 2
  Sequence     n
  <chr>    <int>
1 HH       23810
2 HT       25043
3 TH       25042
4 TT       26005

I think that if my code were correct, HH should have been significantly greater than HT, and TT should have been significantly greater than TH.

Can someone please tell me if I have done this correctly and how to correct it?

Thanks!

Answer 1

Score: 1

I think you need to sort the student_id vector before the loop, so that your comparison of student_id[i] != student_id[i-1] would be valid. Otherwise, it's not catching consecutive flips from the same student.
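
As a rough illustration of why the sort matters, the sketch below measures how often an ID equals the one just before it, with and without sorting. The helper name same_as_previous is made up for this example and is not part of the original code.

```r
set.seed(123)
ids <- 1:100

# Unsorted draw, as in the question: neighbouring ids are independent draws
unsorted_id <- sample(ids, 100000, replace = TRUE)
# Sorted draw, as suggested here: each student's rows form one contiguous block
sorted_id <- sort(unsorted_id)

# Hypothetical helper: share of positions whose id equals the previous id
same_as_previous <- function(x) mean(x[-1] == x[-length(x)])

same_as_previous(unsorted_id)  # roughly 1 in 100
same_as_previous(sorted_id)    # very close to 1
```

With unsorted IDs, almost every iteration of the loop takes the first branch (a fair 50/50 flip), so the 0.6 persistence is hardly ever applied.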

The result then seems to make sense: HH and TT together account for 60.4% of the counted pairs (60,343 of 99,900), which is close to the 0.6 probability of repeating the previous flip.
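
The percentage can be checked directly from the pair counts reported in this answer. The snippet below is only an arithmetic check on those four numbers. The variable names are chosen for illustration.

```r
# Pair counts copied from the output reported in this answer
counts <- c(HH = 29763, HT = 19782, TH = 19775, TT = 30580)

# Share of pairs in which the second flip repeats the first
same_share <- unname((counts["HH"] + counts["TT"]) / sum(counts))
round(same_share, 3)  # about 0.604

# Conditional shares: P(next = H | current = H) and P(next = T | current = T)
p_hh <- unname(counts["HH"] / (counts["HH"] + counts["HT"]))
p_tt <- unname(counts["TT"] / (counts["TT"] + counts["TH"]))
round(c(p_hh = p_hh, p_tt = p_tt), 3)  # both close to 0.6
```

The two conditional shares are the more direct check, since the problem statement specifies the probability of the next flip given the current one.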

library(tidyverse)

set.seed(123)
ids <- 1:100
# only the following line was changed, all other lines are same as your code
student_id <- sort(sample(ids, 100000, replace = TRUE))
coin_result <- character(1000)
coin_result[1] <- sample(c("H", "T"), 1)

for (i in 2:length(coin_result)) {
  if (student_id[i] != student_id[i-1]) {
    coin_result[i] <- sample(c("H", "T"), 1)
  } else if (coin_result[i-1] == "H") {
    coin_result[i] <- sample(c("H", "T"), 1, prob = c(0.6, 0.4))
  } else {
    coin_result[i] <- sample(c("H", "T"), 1, prob = c(0.4, 0.6))
  }
}

# tidy up
my_data <- data.frame(student_id, coin_result)
my_data <- my_data[order(my_data$student_id),]

my_data %>%
  group_by(student_id) %>%
  summarize(Sequence = str_c(coin_result, lead(coin_result)), .groups = 'drop') %>%
  filter(!is.na(Sequence)) %>%
  count(Sequence)

# A tibble: 4 × 2
  Sequence     n
  <chr>    <int>
1 HH       29763
2 HT       19782
3 TH       19775
4 TT       30580
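
One thing that may be worth double-checking: as posted, coin_result is allocated with 1000 elements while student_id has 100000, so the loop generates only 1000 flips and data.frame() recycles them to fill the longer column. A possible alternative, sketched below in base R, simulates each student separately so the lengths always match and no pair can span two students. The helper simulate_student, its p_same argument, and the 500 to 1500 range of flips per student are assumptions made for this example, not part of the original code.

```r
set.seed(123)

# Hypothetical helper: one student's flips as a two-state Markov chain.
# p_same is the probability that a flip repeats the previous one.
simulate_student <- function(n_flips, p_same = 0.6) {
  flips <- character(n_flips)
  flips[1] <- sample(c("H", "T"), 1)  # the first flip is fair
  for (i in seq_len(n_flips)[-1]) {
    repeat_previous <- runif(1) < p_same
    flips[i] <- if (repeat_previous) flips[i - 1] else setdiff(c("H", "T"), flips[i - 1])
  }
  flips
}

# Assumed flip counts: each of the 100 students flips between 500 and 1500 times
n_flips <- sample(500:1500, 100, replace = TRUE)
flips_by_student <- lapply(n_flips, simulate_student)

# Build consecutive pairs within each student only, never across students
pairs <- unlist(lapply(flips_by_student, function(f) paste0(f[-length(f)], f[-1])))
table(pairs)

# Row-wise transition shares: rows are the current flip, columns the next flip
first <- substr(pairs, 1, 1)
second <- substr(pairs, 2, 2)
transition_share <- prop.table(table(first, second), margin = 1)
round(transition_share, 3)
```

If the simulation behaves as intended, the H-to-H and T-to-T entries of transition_share should both come out near 0.6.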

Posted by huangapple on May 7, 2023 at 11:14:12
Link: https://go.coder-hub.com/76192042.html