July 7, 2023, 04:39:10

My minimax algorithm loses to me, but seems flawless

Question

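I'm trying to code a tic-tac-toe bot in Python using the minimax algorithm. My code looks like it should work, but it misevaluates positions, searches too many or too few nodes, and loses to me every game.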

mainboard = ["-", "-", "-", "-", "-", "-", "-", "-", "-"]
nodes = 0
def detectwin(b):
    signs = ["O", "X"]
    for s in signs:
        for i in range(3):
            j = 3 * i
            if ((b[0 + j] == s and b[1 + j] == s and b[2 + j] == s) or
                (b[0 + i] == s and b[1 + i] == s and b[2 + i] == s)):
                if s == "O": return 1
                if s == "X": return -1
        if ((b[0] == s and b[4] == s and b[8] == s) or
            (b[2] == s and b[4] == s and b[6] == s)):
            if s == "O": return 1
            if s == "X": return -1
    return 0
def evaluate(board):
    return detectwin(board)
def fullboard(board):
    return all(cell != "-" for cell in board)
def makemove(board, move, maximizingPlayer):
    if maximizingPlayer:
        board[move] = "O"
        return board
    else:
        board[move] = "X"
        return board
def undomove(board, move):
    board[move] = "-"
    return board
def minimax(board, depth, maximizingPlayer):
    global nodes
    if depth == 0 or fullboard(board) or detectwin(board) != 0:
        nodes += 1
        return evaluate(board)
    if maximizingPlayer:
        maxEval = -1000
        for i in range(9):
            if board[i] == "-":
                board = makemove(board, i, True)
                newEval = minimax(board, depth - 1, False)
                maxEval = max(maxEval, newEval)
                board = undomove(board, i)
        return maxEval
    else:
        minEval = 1000
        for i in range(9):
            if board[i] == "-":
                board = makemove(board, i, False)
                newEval = minimax(board, depth - 1, True)
                minEval = min(minEval, newEval)
                board = undomove(board, i)
        return minEval
def findbestmove(board, maximizingPlayer):
    global nodes
    if maximizingPlayer:
        bestmove = -1
        maxEval = -1000
        for i in range(9):
            if board[i] == "-":
                board = makemove(board, i, True)
                nodes = 0
                newEval = minimax(board, 9, False)
                print(f"Eval move {i}: {newEval} ({nodes} nodes)")
                if newEval > maxEval:
                    maxEval = newEval
                    bestmove = i
                board = undomove(board, i)
        return bestmove
def printboard(b):
    signs = ["No", "O", "X"]
    win = signs[detectwin(b)] + " wins"
    print(f'{b[0]} {b[1]} {b[2]}\n{b[3]} {b[4]} {b[5]}\n{b[6]} {b[7]} {b[8]}\n{win}\n')
print("Ready!")
while True:
    move = findbestmove(mainboard, True)
    mainboard = makemove(mainboard, move, True)
    printboard(mainboard)
    yourmove = int(input())
    mainboard = makemove(mainboard, yourmove, False)
    printboard(mainboard)

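I've tried evaluating different positions, expecting it to give the correct objective evaluation of each position by searching every possibility. Instead it searches the wrong number of nodes and evaluates positions incorrectly. Why does it do this?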

Answer 1

Score: 7

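In the for loop in your detectwin function, (b[0 + j]==s and b[1 + j]==s and b[2 + j]==s) checks rows: j increases by 3 each iteration, so it moves to the next row of the board. However, (b[0 + i]==s and b[1 + i]==s and b[2 + i]==s) does not check columns. Because i increases by only 1 each iteration, it looks at three consecutive cells of the 1D board: (0, 1, 2), then (1, 2, 3), then (2, 3, 4). The first is the top row again, and the other two wrap from the end of one row into the start of the next, so they are not lines on the board at all.

So the issue is that you check horizontal runs twice and columns not at all, which both misses real column wins and reports wins that do not exist.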
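To make this concrete, here is a small sketch that lists the cell indices each expression inspects for i = 0, 1, 2. The variable names are only for illustration and do not appear in the question's code.

```python
# The board is a flat list of 9 cells laid out as:
#   0 1 2
#   3 4 5
#   6 7 8

# Indices read by the first expression (offset j = 3 * i): the three rows.
row_triples = [(0 + 3 * i, 1 + 3 * i, 2 + 3 * i) for i in range(3)]

# Indices read by the second expression as written in the question.
buggy_triples = [(0 + i, 1 + i, 2 + i) for i in range(3)]

# Indices a column check needs: same column, one row apart (step of 3).
column_triples = [(i, i + 3, i + 6) for i in range(3)]

print("rows:   ", row_triples)
print("buggy:  ", buggy_triples)
print("columns:", column_triples)
```

The "buggy" list contains (1, 2, 3) and (2, 3, 4), which span two rows, and none of its entries matches any entry of the "columns" list.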

The current loop that is meant to check columns:

for i in range(3):
    j = 3 * i
    if ((b[0 + j]==s and b[1 + j]==s and b[2 + j]==s) or
        (b[0 + i]==s and b[1 + i]==s and b[2 + i]==s)):
        if s == "O": return 1
        if s == "X": return -1

This does not check any column. Fixed:

for i in range(3):
    j = 3 * i
    if ((b[0 + j]==s and b[1 + j]==s and b[2 + j]==s) or
        (b[i]==s and b[i + 3]==s and b[i + 6]==s)):
        if s == "O": return 1
        if s == "X": return -1

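Replacing (b[0 + i]==s and b[1 + i]==s and b[2 + i]==s) with (b[i]==s and b[i + 3]==s and b[i + 6]==s) makes the check go through the columns of the board, since cells i, i + 3 and i + 6 all sit in column i. Hope this helps.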
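As an alternative way to avoid this kind of index mistake, the eight winning lines can be written out explicitly. The following is only a sketch, not the asker's code: WIN_LINES, detect_win and the simplified minimax (no depth limit, no node counter) are names and simplifications assumed here for illustration. It keeps the question's convention that O scores 1 and X scores -1.

```python
# All eight winning lines on a 3x3 board stored as a flat list of 9 cells.
WIN_LINES = [
    (0, 1, 2), (3, 4, 5), (6, 7, 8),  # rows
    (0, 3, 6), (1, 4, 7), (2, 5, 8),  # columns
    (0, 4, 8), (2, 4, 6),             # diagonals
]

def detect_win(b):
    """Return 1 if O has three in a line, -1 if X does, 0 otherwise."""
    for x, y, z in WIN_LINES:
        if b[x] != "-" and b[x] == b[y] == b[z]:
            return 1 if b[x] == "O" else -1
    return 0

def minimax(b, o_to_move):
    """Exhaustive search: O maximizes and X minimizes, as in the question."""
    score = detect_win(b)
    if score != 0 or "-" not in b:
        return score
    results = []
    for i in range(9):
        if b[i] == "-":
            b[i] = "O" if o_to_move else "X"
            results.append(minimax(b, not o_to_move))
            b[i] = "-"  # undo the move
    return max(results) if o_to_move else min(results)

# A filled left column is a win for X.
column_win = ["X", "O", "-",
              "X", "O", "-",
              "X", "-", "-"]
print(detect_win(column_win))

# Cells 1, 2, 3 wrap across two rows and are not a line.
wrapped = ["-", "X", "X",
           "X", "-", "-",
           "-", "-", "-"]
print(detect_win(wrapped))

# With perfect play from an empty board, tic-tac-toe is a draw.
print(minimax(["-"] * 9, True))
```

This prints -1, 0 and 0. The last value is a useful sanity check for any tic-tac-toe minimax: if the empty board does not evaluate to a draw, the win detection or the search has a bug. The full search without pruning visits several hundred thousand positions, so the last line may take a second or two.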
