Tic-Tac-Toe with JavaScript: AI Player with Minimax Algorithm

Source: Ali Alaa - Front-end Web Developer, https://alialaa.com/blog/tic-tac-toe-js-minimax

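In the previous part, we created a JavaScript class for the tic-tac-toe board. In this part, we will learn how to implement the minimax algorithm in JavaScript to find the best move for the computer player. Given a few assumptions and a depth (the number of turns to look ahead), we will calculate a heuristic value for every possible move.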

Check out the demo or visit the project's GitHub page.

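This is the second part of a 3-part series. You can find the other parts listed below:

Part 1: Building the Tic-Tac-Toe Board
Part 2: AI Player with Minimax Algorithm
Part 3: Building the User Interface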

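To create an AI player, we need to mimic how a human thinks while playing tic-tac-toe. In real life, a human considers the possible consequences of each move. This is where the minimax algorithm comes in handy. Minimax is a decision rule used to determine the best possible move in games where all possible moves can be foreseen, such as tic-tac-toe or chess.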

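Minimax Algorithm in Tic-Tac-Toe

To apply the minimax algorithm to a two-player game, we will assume that X is the maximizing player and O is the minimizing player. The maximizing player tries to maximize the score, in other words, picks the move with the highest value. The minimizing player tries to minimize the maximizing player's score, so it picks the move with the lowest value.

To calculate these values we need to make some assumptions. We will call the values heuristic values. In tic-tac-toe, a finished board has 3 possible outcomes:

The board state is a draw: we give this board a value of 0;
X wins in the board state: we give this board a value of 100;
O wins in the board state: we give this board a value of -100.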

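Minimax Example with a Game Tree

To illustrate the minimax algorithm in more detail, let's walk through a visual example. In the figure below, it is X's turn in the current state, and X has three possible moves:

Level 1: X has three possible moves and tries to find the node with the maximum value.
Level 2: The first move leads to an immediate win for X, so it is given a score of 100.
Level 2: The second and third moves each hand the turn to O, who then has two possible moves.
Level 3: O tries to minimize the score, so it picks the node with the minimum value.
Level 3: O's first move leads to a win for O and its second move leads to a draw, so we assume O picks the first move and the parent node gets a value of -100. The same reasoning applies to O's third and fourth moves.
Back at level 1, X now has to choose between 100, -100, and 0. Since X is the maximizer, it will pick 100, which leads to a win.

As you may have noticed, we traverse the tree of possibilities recursively, compute the score of each terminal state, and then work our way back up to decide which move to make.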
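
To make the walkthrough concrete, here is a minimal sketch of minimax over a hand-written tree. The tree shape and the name minimaxTree are assumptions made for illustration: a node is either a number (a terminal score) or an array of child nodes, and the leaves of the third branch are invented so that the three top-level branches evaluate to 100, -100, and 0 as in the walkthrough.

```javascript
// A node is either a number (terminal score) or an array of child nodes.
// This is an illustrative data shape, not the Board class from Part 1.
function minimaxTree(node, maximizing) {
    // Terminal state: its score is the heuristic value
    if (typeof node === "number") return node;
    // Otherwise evaluate every child with the turn handed to the other player
    const childValues = node.map(child => minimaxTree(child, !maximizing));
    return maximizing ? Math.max(...childValues) : Math.min(...childValues);
}

// Level 1 is X's turn. The first move wins at once (100). The other two
// moves hand the turn to O, who picks the smallest value among its options.
const exampleTree = [100, [-100, 0], [0, 100]];

console.log(minimaxTree(exampleTree, true)); // 100
```

The minimizing levels collapse to -100 and 0, so the maximizer at the top compares 100, -100, and 0 and picks 100.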

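Adding Depth to the Calculation

Now imagine a situation where X can win in two possible ways, but one of them takes more moves than the other. With the current implementation, both moves return a score of 100, and we would pick between them at random. It would be better to go straight for the shortest win.

To solve this, we subtract the depth (the current level) from the board's score when X, the maximizing player, wins, and add the depth to the score when O, the minimizing player, wins. This way a shorter path gets a higher score for the maximizing player, because a smaller depth is subtracted from it, and a lower score for the minimizing player.

This also makes a player who cannot avoid losing prefer the longest path to the loss over a shorter one. Here is a visual example after adding the depth:

In the example above, X will clearly choose 99 over 97, since it is the quicker way to win.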
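
The depth adjustment can be sketched as a small standalone function. The name scoreWithDepth and the "draw" label are assumptions for this example; the arithmetic is the same one the Player class uses further down.

```javascript
// Hypothetical helper: heuristic value of a finished board, adjusted by the
// depth at which the game ended. `winner` is assumed to be "x", "o" or "draw".
function scoreWithDepth(winner, depth) {
    if (winner === "x") return 100 - depth;  // earlier X wins score higher
    if (winner === "o") return -100 + depth; // earlier O wins score lower
    return 0;                                // draws are not adjusted
}

console.log(scoreWithDepth("x", 1)); // 99
console.log(scoreWithDepth("x", 3)); // 97
console.log(scoreWithDepth("o", 2)); // -98
```

A win for X one move away scores 99 and a win three moves away scores 97, so the maximizer prefers the quicker one.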

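JavaScript Implementation

Now it is time to turn this theory into code. In our classes folder, let's create a new file called player.js. The Player class is constructed with a maximum depth parameter, which limits how deep the computer searches the tree. The lower the depth we choose, the easier the game becomes. In addition, we define a new Map. Each key of this map holds a heuristic value, and its value holds the moves that lead to that heuristic value: a single index, or a comma-separated string when there are several. This way, for the maximizing player, we can pick the highest key; the move is the value stored under it, or a random one of them if there are several.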
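
As a rough illustration of that map, the sketch below fills one from made-up results and picks a move from it. The evaluatedMoves data and the pickMove helper are hypothetical and exist only for this example; in the class, the values come from the recursive search.

```javascript
// Hypothetical top-level results as [cell index, heuristic value] pairs
const evaluatedMoves = [[2, 0], [4, 99], [5, 0], [7, -98]];

const nodesMap = new Map();
for (const [index, value] of evaluatedMoves) {
    // Append to the comma separated string if the value was seen before
    const moves = nodesMap.has(value) ? `${nodesMap.get(value)},${index}` : index;
    nodesMap.set(value, moves);
}
// nodesMap now holds: 0 => "2,5", 99 => 4, -98 => 7

// Return the move stored under bestValue, choosing randomly among ties
function pickMove(map, bestValue) {
    const entry = map.get(bestValue);
    if (typeof entry === "string") {
        const candidates = entry.split(",");
        return candidates[Math.floor(Math.random() * candidates.length)];
    }
    return entry;
}

// The maximizing player takes the highest key
console.log(pickMove(nodesMap, Math.max(...nodesMap.keys()))); // 4
```

Note that a single move is stored as a number, while tied moves come back from split() as strings, so calling code may want to convert the result with Number() before using it as an array index.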

player.js

import Board from "./board.js";

export default class Player {
    constructor(maxDepth = -1) {
        this.maxDepth = maxDepth;
        this.nodesMap = new Map();
    }
}

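Now let's add a method called getBestMove() to this class. As mentioned earlier, this will be a recursive function. It receives a board instance, a boolean that tells whether the player is maximizing or minimizing, a callback function to run after the calculation (this will be used when we build the UI), and the depth of the current node.

Inside our function, every call has a different depth depending on the level we are currently at. To do something only in the main function call (the topmost one, not a recursive call), we check whether the depth equals zero.

If we are calculating the value of the topmost node, the first thing we do inside the function is clear the nodesMap map of any previous calculations.

Then we add the base case of the recursion. Every recursive function must have a base case at which the recursion stops; otherwise we may end up with infinite recursion. In our case, the recursion stops when a terminal state is reached or the depth reaches the maximum depth. With the default maxDepth of -1 the depth never matches, so the search always runs to a terminal state. When the recursion stops, we return the heuristic value of the state: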

player.js

getBestMove(board, maximizing = true, callback = () => {}, depth = 0) {
    //clear nodesMap if the function is called for a new move
    if(depth == 0) this.nodesMap.clear();

    //If the board state is a terminal one, return the heuristic value
    if(board.isTerminal() || depth === this.maxDepth ) {
        if(board.isTerminal().winner === 'x') {
            return 100 - depth;
        } else if (board.isTerminal().winner === 'o') {
            return -100 + depth;
        }
        return 0;
    }
}

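Now we check whether it is the maximizing player's turn, then loop through all empty cells using the getAvailableMoves() method we created in the previous part. Inside the loop, we create a new board, insert the symbol into the current empty cell, and call getBestMove() recursively, this time with the new board, the minimizing player's turn, and the depth incremented by one. After that, we compare the function's output with the current best value and update it if needed. Still inside the loop, we check whether we are at the top level, and if so we store the value in nodesMap.

Outside the loop, we check whether we are at the top level and return the index of the cell that corresponds to the best value, or a random one of them if several indices share the best value.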

player.js

if (maximizing) {
    //Initialize best to the lowest possible value
    let best = -100;
    //Loop through all empty cells
    board.getAvailableMoves().forEach(index => {
        //Initialize a new board with a copy of our current state
        const child = new Board([...board.state]);
        //Create a child node by inserting the maximizing symbol x into the current empty cell
        child.insert("x", index);
        //Recursively calling getBestMove this time with the new board and minimizing turn and incrementing the depth
        const nodeValue = this.getBestMove(child, false, callback, depth + 1);
        //Updating best value
        best = Math.max(best, nodeValue);

        //If it's the main function call, not a recursive one, map each heuristic value with its move indices
        if (depth == 0) {
            //Comma separated indices if multiple moves have the same heuristic value
            const moves = this.nodesMap.has(nodeValue)
                ? `${this.nodesMap.get(nodeValue)},${index}`
                : index;
            this.nodesMap.set(nodeValue, moves);
        }
    });
    //If it's the main call, return the index of the best move or a random index if multiple indices have the same value
    if (depth == 0) {
        //Declare ret outside the if/else so it is in scope for the callback and the return
        let ret;
        if (typeof this.nodesMap.get(best) == "string") {
            const arr = this.nodesMap.get(best).split(",");
            const rand = Math.floor(Math.random() * arr.length);
            ret = arr[rand];
        } else {
            ret = this.nodesMap.get(best);
        }
        //run a callback after calculation and return the index
        callback(ret);
        return ret;
    }
    //If not main call (recursive) return the heuristic value for next calculation
    return best;
}

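Similarly, we check whether the player is minimizing. The code is very similar, with minor changes such as inserting o instead of x and using Math.min instead of Math.max. Here is our final class: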

player.js

import Board from "./board.js";

export default class Player {
    constructor(maxDepth = -1) {
        this.maxDepth = maxDepth;
        this.nodesMap = new Map();
    }
    getBestMove(board, maximizing = true, callback = () => {}, depth = 0) {
        //clear nodesMap if the function is called for a new move
        if (depth == 0) this.nodesMap.clear();

        //If the board state is a terminal one, return the heuristic value
        if (board.isTerminal() || depth === this.maxDepth) {
            if (board.isTerminal().winner === "x") {
                return 100 - depth;
            } else if (board.isTerminal().winner === "o") {
                return -100 + depth;
            }
            return 0;
        }
        if (maximizing) {
            //Initialize best to the lowest possible value
            let best = -100;
            //Loop through all empty cells
            board.getAvailableMoves().forEach(index => {
                //Initialize a new board with a copy of our current state
                const child = new Board([...board.state]);
                //Create a child node by inserting the maximizing symbol x into the current empty cell
                child.insert("x", index);
                //Recursively calling getBestMove this time with the new board and minimizing turn and incrementing the depth
                const nodeValue = this.getBestMove(child, false, callback, depth + 1);
                //Updating best value
                best = Math.max(best, nodeValue);

                //If it's the main function call, not a recursive one, map each heuristic value with its move indices
                if (depth == 0) {
                    //Comma separated indices if multiple moves have the same heuristic value
                    const moves = this.nodesMap.has(nodeValue)
                        ? `${this.nodesMap.get(nodeValue)},${index}`
                        : index;
                    this.nodesMap.set(nodeValue, moves);
                }
            });
            //If it's the main call, return the index of the best move or a random index if multiple indices have the same value
            if (depth == 0) {
                let returnValue;
                if (typeof this.nodesMap.get(best) == "string") {
                    const arr = this.nodesMap.get(best).split(",");
                    const rand = Math.floor(Math.random() * arr.length);
                    returnValue = arr[rand];
                } else {
                    returnValue = this.nodesMap.get(best);
                }
                //run a callback after calculation and return the index
                callback(returnValue);
                return returnValue;
            }
            //If not main call (recursive) return the heuristic value for next calculation
            return best;
        }

        if (!maximizing) {
            //Initialize best to the highest possible value
            let best = 100;
            //Loop through all empty cells
            board.getAvailableMoves().forEach(index => {
                //Initialize a new board with a copy of our current state
                const child = new Board([...board.state]);

                //Create a child node by inserting the minimizing symbol o into the current empty cell
                child.insert("o", index);

                //Recursively calling getBestMove this time with the new board and maximizing turn and incrementing the depth
                let nodeValue = this.getBestMove(child, true, callback, depth + 1);
                //Updating best value
                best = Math.min(best, nodeValue);

                //If it's the main function call, not a recursive one, map each heuristic value with its move indices
                if (depth == 0) {
                    //Comma separated indices if multiple moves have the same heuristic value
                    const moves = this.nodesMap.has(nodeValue)
                        ? this.nodesMap.get(nodeValue) + "," + index
                        : index;
                    this.nodesMap.set(nodeValue, moves);
                }
            });
            //If it's the main call, return the index of the best move or a random index if multiple indices have the same value
            if (depth == 0) {
                let returnValue;
                if (typeof this.nodesMap.get(best) == "string") {
                    const arr = this.nodesMap.get(best).split(",");
                    const rand = Math.floor(Math.random() * arr.length);
                    returnValue = arr[rand];
                } else {
                    returnValue = this.nodesMap.get(best);
                }
                //run a callback after calculation and return the index
                callback(returnValue);
                return returnValue;
            }
            //If not main call (recursive) return the heuristic value for next calculation
            return best;
        }
    }
}

Now let's test this function and also see what nodesMap looks like. In script.js, enter:

script.js

import Board from "./classes/board.js";
import Player from "./classes/player.js";

const board = new Board(["x", "o", "", "", "", "", "o", "", "x"]);
board.printFormattedBoard();
const p = new Player();
console.log(p.getBestMove(board));
console.log(p.nodesMap);

As you can see, cell 4 is chosen as the best move for X because it leads to an immediate win. Let's look at the same board, but this time with O to move:

script.js

import Board from "./classes/board.js";
import Player from "./classes/player.js";

const board = new Board(["x", "o", "", "", "", "", "o", "", "x"]);
board.printFormattedBoard();
const p = new Player();
console.log(p.getBestMove(board, false)); //false for minimizing turn
console.log(p.nodesMap);

Clearly, 4 is the best move for O as well, since it blocks X's diagonal and prevents the loss.

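Finally, it is worth mentioning that the performance of this algorithm can be improved with alpha-beta pruning. You can look into that if you are interested.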
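
For readers who want a starting point, here is a minimal sketch of minimax with alpha-beta pruning. It is not part of the project's Player class: it works on a plain 9-element state array in the same layout as the script.js examples, and the names LINES, outcome, and alphaBeta are assumptions made for this example. It returns only the heuristic value of a position, not the move.

```javascript
// All eight winning lines of a 3x3 board
const LINES = [
    [0, 1, 2], [3, 4, 5], [6, 7, 8], // rows
    [0, 3, 6], [1, 4, 7], [2, 5, 8], // columns
    [0, 4, 8], [2, 4, 6],            // diagonals
];

// Returns "x", "o", "draw", or null while the game is still running
function outcome(state) {
    for (const [a, b, c] of LINES) {
        if (state[a] && state[a] === state[b] && state[a] === state[c]) return state[a];
    }
    return state.includes("") ? null : "draw";
}

// alpha: best value the maximizer can already guarantee on this path
// beta: best value the minimizer can already guarantee on this path
function alphaBeta(state, maximizing, depth = 0, alpha = -Infinity, beta = Infinity) {
    const result = outcome(state);
    if (result === "x") return 100 - depth;
    if (result === "o") return -100 + depth;
    if (result === "draw") return 0;

    let best = maximizing ? -Infinity : Infinity;
    for (let index = 0; index < state.length; index++) {
        if (state[index] !== "") continue;
        const child = [...state];
        child[index] = maximizing ? "x" : "o";
        const value = alphaBeta(child, !maximizing, depth + 1, alpha, beta);
        if (maximizing) {
            best = Math.max(best, value);
            alpha = Math.max(alpha, best);
        } else {
            best = Math.min(best, value);
            beta = Math.min(beta, best);
        }
        // The other player already has a better option elsewhere: stop searching
        if (beta <= alpha) break;
    }
    return best;
}

const sample = ["x", "o", "", "", "", "", "o", "", "x"];
console.log(alphaBeta(sample, true));  // 99
console.log(alphaBeta(sample, false)); // -97
```

Pruning skips branches that cannot change the result, so the value at the root is the same as with plain minimax while fewer boards are visited.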

In the next and final part, we will build the UI and the interactions for our board.