Why does my tic tac toe minimax algorithm utilizing a 1d representation of a tic tac toe board and allowing for play of both sides not work?

huangapple go评论96阅读模式
英文:

Why does my tic tac toe minimax algorithm utilizing a 1d representation of a tic tac toe board and allowing for play of both sides not work?

问题

我编写了一个使用极小化极大算法来玩井字游戏的Python程序。棋盘用一个一维数组表示。玩家可以选择扮演任意一方。程序会递归生成所有合法的棋盘,并在每一轮交替最小化和最大化来检查是否有赢家。但出于某种原因,该算法从未生成过获胜的棋盘。

起初我以为与算法没有正确地最小化和最大化右侧有关,但我检查了后发现代码无明显问题。

  1. import random
  2. # 其他代码...
  3. # 以下代码不要翻译
  4. # Main Code
  5. test()
  6. letter = 'X'
  7. board = [""] * 9
  8. player = random.choice([0, 1])
  9. print("You are playing as", player)
  10. while True:
  11. printBoard(board)
  12. winner = getWinner(board)
  13. if winner is not None:
  14. if winner == "Tie":
  15. print("Tie")
  16. else:
  17. print("Winner:", winner)
  18. break
  19. if player == 0:
  20. if letter == "O":
  21. move = findBestMove(board, letter, player)
  22. print("Computer played move", move)
  23. board[move] = letter
  24. else:
  25. move = int(input("Enter move (0-8): "))
  26. while board[move] != "":
  27. move = int(input("Invalid move. Enter move (0-8): "))
  28. board[move] = letter
  29. elif player == 1:
  30. if letter == "X":
  31. move = findBestMove(board, letter, player)
  32. print("Computer played move", move)
  33. board[move] = letter
  34. else:
  35. move = int(input("Enter move (0-8): "))
  36. while board[move] != "":
  37. move = int(input("Invalid move. Enter move (0-8): "))
  38. board[move] = letter
  39. letter = getOtherPlayer(letter)
英文:

I wrote a python program utilizing the minimax algorithm to play tic tac toe. The board is represented by a 1d array. The player is able to play as either side. The program recursively generates all of the legal boards and checks for a winner alternating between minimizing and maximizing every turn. For some reason the algorithm never generates a winning board.

At first I assumed it was related to the algorithm not minimizing and maximizing for the right side but I checked and I don't see any reason the code wouldn't work

  1. import random
  2. def getOtherPlayer(letter):
  3. if letter == "X":
  4. return "O"
  5. else:
  6. return "X"
  7. def getWinner(board):
  8. for i in range(3):
  9. # Check rows
  10. if board[i*3] == board[i*3+1] == board[i*3+2] != "":
  11. return board[i*3]
  12. # Check columns
  13. if board[i] == board[i+3] == board[i+6] != "":
  14. return board[i]
  15. # Check diagonals
  16. if board[0] == board[4] == board[8] != "":
  17. return board[0]
  18. if board[2] == board[4] == board[6] != "":
  19. return board[2]
  20. # Check for tie
  21. if "" not in board:
  22. return "Tie"
  23. # No winner yet
  24. return None
  25. def test():
  26. board = ['X', '', 'O', 'X', 'O', 'X', 'O', 'X', 'O']
  27. print(getScore(board, 'X', False))
  28. def getAvailableMoves(board):
  29. return [i for i in range(len(board)) if board[i] == ""]
  30. def getScore(board, letter, isMaximizing):
  31. winner = getWinner(board)
  32. if winner == letter and isMaximizing:
  33. print('win')
  34. return 1 # current player wins
  35. elif winner == letter and isMaximizing is False:
  36. print('lose')
  37. return -1
  38. else:
  39. print(board)
  40. print(letter)
  41. print(isMaximizing)
  42. print('tie')
  43. return 0 # tie (no winner)
  44. def minimax(board, letter, player):
  45. if player == 0:
  46. if letter == "X":
  47. bestScore = float('inf')
  48. isMaximizing = False
  49. else:
  50. bestScore = float('-inf')
  51. isMaximizing = True
  52. if player == 1:
  53. if letter == "O":
  54. bestScore = float('inf')
  55. isMaximizing = False
  56. else:
  57. bestScore = float('-inf')
  58. isMaximizing = True
  59. winner = getWinner(board)
  60. if winner is not None:
  61. print("terminal")
  62. return getScore(board, letter, isMaximizing)
  63. availableMoves = getAvailableMoves(board)
  64. if isMaximizing:
  65. for move in availableMoves:
  66. board[move] = letter
  67. score = minimax(board, getOtherPlayer(letter), player)
  68. board[move] = ""
  69. bestScore = max(score, bestScore)
  70. else:
  71. for move in availableMoves:
  72. board[move] = letter
  73. print(board)
  74. score = minimax(board, getOtherPlayer(letter), player)
  75. board[move] = ""
  76. bestScore = min(score, bestScore)
  77. return bestScore
  78. def findBestMove(board, startingLetter, player):
  79. availableMoves = getAvailableMoves(board)
  80. bestMove = None
  81. bestScore = -9999
  82. for move in availableMoves:
  83. board[move] = startingLetter
  84. score = minimax(board, getOtherPlayer(startingLetter), player)
  85. board[move] = ""
  86. if score > bestScore:
  87. bestScore = score
  88. bestMove = move
  89. return bestMove
  90. def printBoard(board):
  91. print()
  92. for i in range(3):
  93. print(board[i * 3:i * 3 + 3])
  94. # Main Code
  95. test()
  96. letter = 'X'
  97. board = [""] * 9
  98. player = random.choice([0, 1])
  99. print("You are playing as", player)
  100. while True:
  101. printBoard(board)
  102. winner = getWinner(board)
  103. if winner is not None:
  104. if winner == "Tie":
  105. print("Tie")
  106. else:
  107. print("Winner:", winner)
  108. break
  109. if player == 0:
  110. if letter == "O":
  111. move = findBestMove(board, letter, player)
  112. print("Computer played move", move)
  113. board[move] = letter
  114. else:
  115. move = int(input("Enter move (0-8): "))
  116. while board[move] != "":
  117. move = int(input("Invalid move. Enter move (0-8): "))
  118. board[move] = letter
  119. elif player == 1:
  120. if letter == "X":
  121. move = findBestMove(board, letter, player)
  122. print("Computer played move", move)
  123. board[move] = letter
  124. else:
  125. move = int(input("Enter move (0-8): "))
  126. while board[move] != "":
  127. move = int(input("Invalid move. Enter move (0-8): "))
  128. board[move] = letter
  129. letter = getOtherPlayer(letter)

答案1

得分: 3

问题出在你的 getScore 函数中。

只有在赢家是 letter 时才返回非零值,但是当赢家是另一名玩家时却不返回。然而这将是你会遇到的情况:如果有一方获胜,那将是由最后移动的玩家来完成的,而在你调用 getScore 时,letter 实际上是下一个要出手的玩家。因此,即使游戏已经获胜,getScore 也会返回0。

以下是对该函数的建议修正:

  1. def getScore(board, letter, isMaximizing):
  2. winner = getWinner(board)
  3. # 当游戏尚未结束,或者是平局时,返回0
  4. if not winner or winner == '平局':
  5. return 0
  6. # 当我们到达这里时,我们知道游戏已经结束。
  7. # 当 `letter` 赢家并且正在最大化时,或者
  8. # 当 `letter` 输家并且正在最小化时,返回1
  9. # 当 `letter` 输家并且正在最大化时,或者
  10. # 当 `letter` 赢家并且正在最小化时,返回-1
  11. # 简而言之:如果表达式 `winner == letter` 的布尔值与
  12. # `isMaximizing` 的布尔值相同,则返回1,否则返回-1
  13. return 1 if (winner == letter) == isMaximizing else -1

一个小提醒:在分析你的代码时,我发现你使用了一个变量 player,确定了人类玩家是否使用 "X" 和 "O",但该变量是一个数字 (0 或 1)。你可以通过将 "X" 和 "O" 也用于 player 值来避免一些代码重复。

另外,为什么不始终将 "X" 设定为最大化的玩家,无论是人类还是非人类玩家呢?当然,在 findBestMove 函数中需要更加小心一些,但其他的都会变得更简单。

英文:

The problem is in your getScore function.

It only returns a non-zero value when the winner is letter, but not when the winner is the other player. Yet that is the situation you will have: if there is a win, it will have been played by the player that moved last, while in your calls to getScore, letter is actually the player who is next to play. As a consequence, getScore will return 0 even when the game has been won.

Here is a suggested correction of that function:

  1. def getScore(board, letter, isMaximizing):
  2. winner = getWinner(board)
  3. # When the game is not over yet, or it is a tie, then return 0
  4. if not winner or winner == 'Tie':
  5. return 0
  6. # When we get here, we know the game has been won.
  7. # Return 1 when `letter` is the winner and is maximizing, or
  8. # when `letter` is the loser and is minimizing
  9. # Return -1 when `letter` is the loser and is maximizing, or
  10. # when `letter` is the winner and is minimizing
  11. # In short: if the expression `winner == letter` has the same boolean value as
  12. # `isMaximizing` return 1, else return -1
  13. return 1 if (winner == letter) == isMaximizing else -1

A small remark: When analysing your code I found it quite confusing that you used a variable player which determines whether the human player plays with "X" and "O", but the variable is a number (0 or 1). You can avoid some code repetition by using "X" and "O" also for the player value.

Also, why not make "X" always the maximizing player, no matter if it is the human or non-human player? Sure, you'd need a bit more care in the findBestMove function, but all the rest would become simpler.

huangapple
  • 本文由 发表于 2023年2月27日 09:15:25
  • 转载请务必保留本文链接:https://go.coder-hub.com/75576062.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定