Flipping Coins Until Some Event Happens

Question

Imagine I have 3 coins.

  • Coin 1: Probability of Heads = 0.5, Reward Heads = +1, Reward Tails = -1
  • Coin 2: Probability of Heads = 0.8, Reward Heads = +0.2, Reward Tails = -0.1
  • Coin 3: Probability of Heads = 0.3, Reward Heads = +3, Reward Tails = -5

Now, imagine a game where two players take turns flipping a randomly chosen coin while tallying their scores. Player 1 always starts, both players start with reward = 0, and the first player to reach reward = +10 wins.

My Question: I want to simulate 5 games that contain the turn-by-turn details of both players until a winner is reached.

First, I defined the coin information:

# Define coin properties
coin1 <- list(prob = 0.5, reward_heads = 1, reward_tails = -1)
coin2 <- list(prob = 0.8, reward_heads = 0.2, reward_tails = -0.1)
coin3 <- list(prob = 0.3, reward_heads = 3, reward_tails = -5)

Next, I defined a function to flip a coin:

# Define function to flip a coin
flip_coin <- function(coin) {
  if (runif(1) < coin$prob) {
    return(coin$reward_heads)
  } else {
    return(coin$reward_tails)
  }
}

Then, I defined the initial data frame to store the game results:

game_log <- data.frame(turn_no = numeric(),
                       current_player_turn = character(),
                       coin_chosen = character(),
                       player_1_current_score = numeric(),
                       player_2_current_score = numeric(),
                       stringsAsFactors = FALSE)
player_names <- c("player 1", "player 2")
current_player <- player_names[1]
# Set up initial scores and game state
player_scores <- c(0, 0)
game_state <- list(current_player = 1, game_log = game_log)

Finally, I tried to write a function to simulate a single game. This involves a while loop that runs until either player's cumulative score reaches 10; on each turn it selects a random coin, flips it, updates the current player's score, and switches to the other player:

play_game <- function() {
  game_no = n
  # Play the game until a player reaches a score of +10
  while (max(player_scores) < 10) {
    # Choose a coin at random
    coin_choice <- sample(c("coin1", "coin2", "coin3"), 1)
    
    # Flip the chosen coin
    coin_result <- flip_coin(get(coin_choice))
    
    # Update the current player's score
    player_scores[current_player] <- player_scores[current_player] + coin_result
    
    # Log the current turn's information
    turn_info <- data.frame(turn_no = nrow(game_log) + 1,
                            current_player_turn = current_player,
                            coin_chosen = coin_choice, game_no = n,
                            player_1_current_score = player_scores[1],
                            player_2_current_score = player_scores[2])
    game_log <- rbind(game_log, turn_info)
    
    # Switch to the other player's turn
    current_player <- ifelse(current_player == 1, 2, 1)
  }
  
  # Return the game log
  return(game_log)
}

But I get an error:

Error in while (max(player_scores) < 10) { : 
  missing value where TRUE/FALSE needed

Had this worked, I would have then replicated 5 games like this:

n_iterations <- 5
result <- lapply(1:n_iterations, play_game, n = n_rows)
result <- do.call(rbind, result)

Can someone please show me how to do this?

Thanks!

Answer 1

Score: 2

The main problem I see is:

# You are trying to index via strings of player names instead of integers:
player_names <- c("player 1", "player 2")
current_player <- player_names[1]
...
player_scores[current_player] <- player_scores[current_player] + coin_result

This leads to player_scores looking like this:

              player 1 
   0        0       NA 

I'm sure this is not intended: once the vector contains an NA, max() returns NA (unless na.rm = TRUE is set), so the comparison max(player_scores) < 10 is NA as well and while() stops with the error above.

You can fix this by using the which() function:
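
The indexing behaviour described above can be reproduced in isolation. This is only a minimal sketch: `missing_score` and `scores_ok` are names made up for the illustration, and the increment of 1 stands in for a coin result.

```r
# Start from the same state as in the question
player_scores <- c(0, 0)
current_player <- "player 1"   # a character value, not a position

# Reading by a name that does not exist yields NA
missing_score <- player_scores[current_player]

# Assigning by that name appends a third, named element holding NA + 1 = NA
player_scores[current_player] <- player_scores[current_player] + 1

length(player_scores)          # 3 instead of 2
max(player_scores)             # NA, so `max(player_scores) < 10` is NA as well

# Indexing by position keeps the vector at length 2
scores_ok <- c(0, 0)
scores_ok[1] <- scores_ok[1] + 1
max(scores_ok)                 # 1
```

Because the vector has no names, the character index never matches an existing element, which is why the scores are silently extended rather than updated.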

n <- 1 # 1:10 tested
play_game <- function() {
  game_no = n
  
  current_player <- which(player_names == current_player) # integer instead of string
  # or simply initialise current player before the start of the while loop with: current_player <- 1
  
  player_scores <- c(0, 0)
  
  # Play the game until a player reaches a score of +10
  while (max(player_scores) < 10) {...}
}

Careful though: at least in my tests, the scores turn negative really fast, because coin3 loses far more on tails than it wins on heads. A player practically never reaches +10, so the loop effectively never ends.
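
One way to see why the scores drift downwards is to compute the expected reward of each coin. The sketch below is an illustration rather than part of the original answer; the `coins` list, `expected_reward()` and `per_coin` are hypothetical names, and it assumes each coin is picked with probability 1/3, as `sample()` does in the code.

```r
# Coin definitions copied from the question, collected in one named list
coins <- list(
  coin1 = list(prob = 0.5, reward_heads = 1,   reward_tails = -1),
  coin2 = list(prob = 0.8, reward_heads = 0.2, reward_tails = -0.1),
  coin3 = list(prob = 0.3, reward_heads = 3,   reward_tails = -5)
)

# Expected reward of a single flip: p * heads + (1 - p) * tails
expected_reward <- function(coin) {
  coin$prob * coin$reward_heads + (1 - coin$prob) * coin$reward_tails
}

per_coin <- sapply(coins, expected_reward)
per_coin          # coin1 = 0, coin2 = 0.14, coin3 = -2.6
mean(per_coin)    # about -0.82 per turn when the coin is chosen uniformly
```

With an average loss of roughly 0.82 per turn, a score of +10 is very unlikely to be reached, which matches the behaviour seen in the tests.
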
I changed some things to play till -10:

# Define coin properties
coin1 <- list(prob = 0.5, reward_heads = 1, reward_tails = -1)
coin2 <- list(prob = 0.8, reward_heads = 0.2, reward_tails = -0.1)
coin3 <- list(prob = 0.3, reward_heads = 3, reward_tails = -5)

# Define function to flip a coin
flip_coin <- function(coin) {
  if (runif(1) < coin$prob) {
    return(coin$reward_heads)
  } else {
    return(coin$reward_tails)
  }
}

play_game <- function(n = NULL) { # added "n" as input
  
  # init
  player_names <- c("player 1", "player 2")
  
  game_log <- data.frame(turn_no = numeric(),
                         current_player_turn = character(),
                         coin_chosen = character(),
                         player_1_current_score = numeric(),
                         player_2_current_score = numeric(),
                         stringsAsFactors = FALSE)
  
  # Set up initial scores and game state
  game_state <- list(current_player = 1, game_log = game_log)
  
  game_no = n
  
  current_player <- 1 # which(player_names==current_player)

  player_scores <- c(0, 0)
  
  # Play the game until a player reaches a score of -10 // careful here!!
  while (min(player_scores) > -10) {    # while (max(player_scores[1]) < 10 | max(player_scores[2]) < 10) {
    
    # Choose a coin at random
    coin_choice <- sample(c("coin1", "coin2", "coin3"), 1)
    
    # Flip the chosen coin
    coin_result <- flip_coin(get(coin_choice))
    
    # Update the current player's score
    player_scores[current_player] <- player_scores[current_player] + coin_result
    
    # Log the current turn's information
    turn_info <- data.frame(turn_no = nrow(game_log) + 1,
                            current_player_turn = player_names[current_player],
                            coin_chosen = coin_choice, game_no = n,
                            player_1_current_score = player_scores[1],
                            player_2_current_score = player_scores[2])
    
    game_log <- rbind(game_log, turn_info)
    
    # Switch to the other player's turn
    current_player <- ifelse(current_player == 1, 2, 1)
  }
  
  # Return the game log
  return(game_log)
}

lapply(1:5, \(x) play_game(x))

First result:

[[1]]
   turn_no current_player_turn coin_chosen game_no player_1_current_score player_2_current_score
1        1            player 1       coin1       1                    1.0                    0.0
2        2            player 2       coin1       1                    1.0                    1.0
3        3            player 1       coin2       1                    0.9                    1.0
4        4            player 2       coin1       1                    0.9                    2.0
5        5            player 1       coin2       1                    1.1                    2.0
6        6            player 2       coin2       1                    1.1                    2.2
7        7            player 1       coin2       1                    1.3                    2.2
8        8            player 2       coin2       1                    1.3                    2.4
9        9            player 1       coin1       1                    2.3                    2.4
10      10            player 2       coin3       1                    2.3                   -2.6
11      11            player 1       coin3       1                   -2.7                   -2.6
12      12            player 2       coin2       1                   -2.7                   -2.4
13      13            player 1       coin3       1                   -7.7                   -2.4
14      14            player 2       coin2       1                   -7.7                   -2.2
15      15            player 1       coin2       1                   -7.5                   -2.2
16      16            player 2       coin3       1                   -7.5                   -7.2
17      17            player 1       coin2       1                   -7.3                   -7.2
18      18            player 2       coin1       1                   -7.3                   -6.2
19      19            player 1       coin3       1                   -4.3                   -6.2
20      20            player 2       coin3       1                   -4.3                  -11.2
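
If the original +10 target should be kept, one possible safeguard is to cap the number of turns so that a game always terminates, and then stack the five logs into a single data frame as the question intended. The following is a sketch under assumptions, not the answer's code: `play_game_capped()`, `target`, `max_turns` and the `coins` list are illustrative names, and the cap of 200 turns is arbitrary.

```r
# Coin definitions from the question, kept in a named list to avoid get()
coins <- list(
  coin1 = list(prob = 0.5, reward_heads = 1,   reward_tails = -1),
  coin2 = list(prob = 0.8, reward_heads = 0.2, reward_tails = -0.1),
  coin3 = list(prob = 0.3, reward_heads = 3,   reward_tails = -5)
)

flip_coin <- function(coin) {
  if (runif(1) < coin$prob) coin$reward_heads else coin$reward_tails
}

# Hypothetical variant: stop at `target` points or after `max_turns` turns
play_game_capped <- function(game_no, target = 10, max_turns = 200) {
  player_names <- c("player 1", "player 2")
  player_scores <- c(0, 0)
  current_player <- 1                      # integer position, player 1 starts
  turns <- vector("list", max_turns)       # one log row per turn
  turn_no <- 0

  while (max(player_scores) < target && turn_no < max_turns) {
    turn_no <- turn_no + 1
    coin_choice <- sample(names(coins), 1)
    player_scores[current_player] <-
      player_scores[current_player] + flip_coin(coins[[coin_choice]])

    turns[[turn_no]] <- data.frame(
      game_no = game_no,
      turn_no = turn_no,
      current_player_turn = player_names[current_player],
      coin_chosen = coin_choice,
      player_1_current_score = player_scores[1],
      player_2_current_score = player_scores[2]
    )
    current_player <- 3 - current_player   # switch between 1 and 2
  }

  # Bind only the turns that were actually played
  do.call(rbind, turns[seq_len(turn_no)])
}

set.seed(1)                                # only to make the run repeatable
result <- do.call(rbind, lapply(1:5, play_game_capped))
head(result)
```

Given the negative drift of the coins, most games in this sketch are expected to end at the turn cap rather than with a winner, so the last row of each game should be checked before declaring one.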

huangapple
  • Published on March 3, 2023, 18:22:23
  • If you repost this article, please keep the link to it: https://go.coder-hub.com/75625837.html