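I. Problem Description
Use minimax search with alpha-beta pruning to predict the next move in a Gomoku position. The initial position is shown in the figure. The AI plays white, the player plays black, and it is the AI's turn to move.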
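II. Algorithm Description
(I) Minimax algorithm:
Minimax search solves a game by searching the game tree to a limited depth. The program plays the AI side, represented by MAX nodes, and its goal is to beat the player. The basic principle is:
(1) When it is MIN's turn to move, MAX assumes the worst case: the node takes the minimum of its children's evaluation values.
(2) When it is MAX's turn to move, MAX assumes the best case: the node takes the maximum of its children's evaluation values.
(3) Once the search reaches the leaf nodes, their values are backed up toward the root. Applying rules (1) and (2) alternately, level by level, models the two sides' opposing strategies and yields the evaluation value at the root node.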
function minimax(node, depth)  // given the current state and the search depth
    if node is a terminal node or depth = 0
        return the evaluate value of the node  // score the position with the evaluation function
    if player's turn  // the player moves: a MIN node, pick the move with the lowest score
        let val := +∞
        foreach child of node
            val := min(val, minimax(child, depth-1))
    else AI's turn  // the AI moves: a MAX node, pick the move with the highest score
        let val := -∞
        foreach child of node
            val := max(val, minimax(child, depth-1))
    return val
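The pseudocode above can be tried on a tiny hand-built tree. The following is a minimal sketch, not the article's implementation: the names `MinimaxDemo`, `TreeNode`, `leaf`, `inner`, and `sampleTree` are made up for illustration, and leaf values stand in for the evaluation function.

```java
import java.util.List;

/** Minimal minimax over a hand-built game tree (all names are illustrative). */
final class MinimaxDemo {

    /** A leaf carries a static evaluation; an inner node carries children. */
    static final class TreeNode {
        final int value;
        final List<TreeNode> children;

        private TreeNode(int value, List<TreeNode> children) {
            this.value = value;
            this.children = children;
        }

        static TreeNode leaf(int value) {
            return new TreeNode(value, List.of());
        }

        static TreeNode inner(TreeNode... children) {
            return new TreeNode(0, List.of(children));
        }
    }

    /** aiToMove == true marks a MAX node, false marks a MIN node. */
    static int minimax(TreeNode node, int depth, boolean aiToMove) {
        if (depth == 0 || node.children.isEmpty()) {
            return node.value; // stands in for the evaluation function
        }
        int best = aiToMove ? Integer.MIN_VALUE : Integer.MAX_VALUE;
        for (TreeNode child : node.children) {
            int val = minimax(child, depth - 1, !aiToMove);
            best = aiToMove ? Math.max(best, val) : Math.min(best, val);
        }
        return best;
    }

    /** Root (MAX) with two MIN children whose leaves are {3, 5} and {2, 9}. */
    static TreeNode sampleTree() {
        return TreeNode.inner(
                TreeNode.inner(TreeNode.leaf(3), TreeNode.leaf(5)),
                TreeNode.inner(TreeNode.leaf(2), TreeNode.leaf(9)));
    }

    public static void main(String[] args) {
        // min(3, 5) = 3, min(2, 9) = 2, max(3, 2) = 3
        System.out.println(minimax(sampleTree(), 2, true)); // prints 3
    }
}
```

Each MIN child backs up its smallest leaf, and the root then picks the larger of those two values.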
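(II) Alpha-beta algorithm:
Plain minimax is very inefficient because it visits every node of the game tree down to the search depth. Alpha-beta pruning improves search efficiency by skipping branches that cannot change the result. Here alpha is the best value MAX is already guaranteed, and beta is the best value MIN is already guaranteed. The basic principle is:
(1) Alpha cutoff: if the beta value of a node on a MIN level (MIN to move) is less than or equal to the alpha value of its predecessor (a MAX node), the remaining children of that MIN node can be discarded. MAX already has an alternative that is at least as good, so it will never choose this MIN node.
(2) Beta cutoff: if the alpha value of a node on a MAX level (MAX to move) is greater than or equal to the beta value of its predecessor (a MIN node), the remaining children of that MAX node can be discarded. MIN already has an alternative that is at least as good for it, so it will never choose this MAX node.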
function alphaBeta(node, alpha, beta, depth)
    if node is a terminal node or depth = 0
        return the evaluate value of node  // score the position with the evaluation function
    else
        if AI's turn  // MAX node: raise alpha
            foreach child of node
                val := alphaBeta(child, alpha, beta, depth-1)
                if (val > alpha) alpha := val
                if (alpha >= beta) break
            return alpha
        else player's turn  // MIN node: lower beta
            foreach child of node
                val := alphaBeta(child, alpha, beta, depth-1)
                if (val < beta) beta := val
                if (alpha >= beta) break
            return beta
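To see a cutoff happen, the same idea can be run on a small explicit tree while counting how many leaves are evaluated. This is a sketch under assumptions: `AlphaBetaDemo`, `TreeNode`, `sampleTree`, and the `evaluated` counter are hypothetical names introduced here, and leaf values stand in for the evaluation function.

```java
import java.util.List;

/** Alpha-beta on a hand-built tree, counting evaluated leaves (illustrative names). */
final class AlphaBetaDemo {

    static final class TreeNode {
        final int value;
        final List<TreeNode> children;

        private TreeNode(int value, List<TreeNode> children) {
            this.value = value;
            this.children = children;
        }

        static TreeNode leaf(int value) {
            return new TreeNode(value, List.of());
        }

        static TreeNode inner(TreeNode... children) {
            return new TreeNode(0, List.of(children));
        }
    }

    /** Number of leaves scored so far; reset it before each search. */
    static int evaluated = 0;

    static int alphaBeta(TreeNode node, int alpha, int beta, int depth, boolean aiToMove) {
        if (depth == 0 || node.children.isEmpty()) {
            evaluated++;
            return node.value;
        }
        if (aiToMove) { // MAX node: raise alpha
            for (TreeNode child : node.children) {
                alpha = Math.max(alpha, alphaBeta(child, alpha, beta, depth - 1, false));
                if (alpha >= beta) {
                    break; // beta cutoff
                }
            }
            return alpha;
        } else { // MIN node: lower beta
            for (TreeNode child : node.children) {
                beta = Math.min(beta, alphaBeta(child, alpha, beta, depth - 1, true));
                if (alpha >= beta) {
                    break; // alpha cutoff
                }
            }
            return beta;
        }
    }

    /** Root (MAX) with two MIN children whose leaves are {3, 5} and {2, 9}. */
    static TreeNode sampleTree() {
        return TreeNode.inner(
                TreeNode.inner(TreeNode.leaf(3), TreeNode.leaf(5)),
                TreeNode.inner(TreeNode.leaf(2), TreeNode.leaf(9)));
    }

    public static void main(String[] args) {
        evaluated = 0;
        int best = alphaBeta(sampleTree(), Integer.MIN_VALUE, Integer.MAX_VALUE, 2, true);
        System.out.println(best);      // prints 3
        System.out.println(evaluated); // prints 3: the leaf 9 is never scored
    }
}
```

After the first MIN child returns 3, the root's alpha is 3. The second MIN child sees the leaf 2, its beta drops to 2, and since alpha >= beta the leaf 9 is skipped.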
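III. Evaluation Function
The evaluation function scores the state at each leaf node of the game tree. It has to take the basic shapes and characteristics of Gomoku into account in order to produce an evaluation value for the leaf position.
Basic Gomoku shapes (1 is an AI stone, 2 is a player stone, 0 is an empty point):
1. Five in a row (连五): five stones of the same color in a line, e.g. 11111, 22222
2. Open four (活四): two points can complete a five, e.g. 011110, 022220
3. Rush four (冲四): one point can complete a five, e.g. 011112, 122220
4. Open three (活三): three stones that can become an open four, e.g. 001110, 002220
5. Sleeping three (眠三): three stones that can only become a rush four, e.g. 001112, 002221
6. Open two (活二): two stones that can become an open three, e.g. 000110, 000220
7. Sleeping two (眠二): two stones that can become a sleeping three, e.g. 000112, 000221
In the program, take a coordinate as the center and concatenate the states along the four lines through that point (horizontal, vertical, and the two diagonals) into strings. Whether a string contains one of the shapes above serves as the test.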
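The string-based test described above could look roughly like this. It is a sketch on a small 6x6 board, assuming a square `int[][]` board with the same 0/1/2 encoding; `LineDemo`, `lineThrough`, `linesThrough`, `containsAny`, and `sampleBoard` are hypothetical names, not the article's code.

```java
import java.util.ArrayList;
import java.util.List;

/** Builds the four line strings through a point and searches them for shapes. */
final class LineDemo {

    /** Row, column, main diagonal, anti-diagonal as (dx, dy) steps. */
    static final int[][] DIRECTIONS = {{0, 1}, {1, 0}, {1, 1}, {1, -1}};

    static boolean inside(int n, int x, int y) {
        return x >= 0 && x < n && y >= 0 && y < n;
    }

    /** Concatenates every cell of the full line through (x, y) in direction (dx, dy). */
    static String lineThrough(int[][] board, int x, int y, int dx, int dy) {
        int n = board.length; // assumes a square board
        // Walk backwards to the first cell of the line that is still on the board.
        while (inside(n, x - dx, y - dy)) {
            x -= dx;
            y -= dy;
        }
        StringBuilder sb = new StringBuilder();
        while (inside(n, x, y)) {
            sb.append(board[x][y]);
            x += dx;
            y += dy;
        }
        return sb.toString();
    }

    static List<String> linesThrough(int[][] board, int x, int y) {
        List<String> lines = new ArrayList<>();
        for (int[] d : DIRECTIONS) {
            lines.add(lineThrough(board, x, y, d[0], d[1]));
        }
        return lines;
    }

    static boolean containsAny(List<String> lines, String... shapes) {
        for (String line : lines) {
            for (String shape : shapes) {
                if (line.contains(shape)) {
                    return true;
                }
            }
        }
        return false;
    }

    /** 6x6 board with four AI stones in row 2, columns 1 to 4. */
    static int[][] sampleBoard() {
        int[][] board = new int[6][6];
        for (int col = 1; col <= 4; col++) {
            board[2][col] = 1;
        }
        return board;
    }

    public static void main(String[] args) {
        List<String> lines = linesThrough(sampleBoard(), 2, 2);
        System.out.println(lines);                        // [011110, 001000, 001000, 00100]
        System.out.println(containsAny(lines, "011110")); // true: an open four for the AI
    }
}
```

Stepping with a direction vector avoids writing separate start-point arithmetic for each diagonal, which is where off-by-one mistakes tend to creep in.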
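Because the algorithm is written from the AI's point of view, the evaluation function assigns negative values to the player's side and positive values to the AI's side. For each stone on the board, the basic shapes formed along the four directions (horizontal, vertical, and the two diagonals) are identified, and each shape receives its own weight. Five in a row means one side has won, so it gets the largest value when the AI wins and the smallest value when the player wins.
The weights are assigned by the importance of each shape as follows (positive for the AI, negative for the player):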
Shape | Weight
Five in a row | 100000000
Open four | 10000000
Rush four | 1000000
Open three | 100000
Sleeping three | 10000
Open two | 1000
Sleeping two | 100
Single stone | 10
None | 1
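One way to keep these weights in a single place is an enum whose constants carry the weight, with the sign chosen by side. This is only a sketch; `ShapeWeights`, `Shape`, and `signedWeight` are hypothetical names, and the values are copied from the weights listed above.

```java
/** Shape weights as an enum; the sign depends on which side owns the shape. */
final class ShapeWeights {

    enum Shape {
        FIVE(100_000_000),
        OPEN_FOUR(10_000_000),
        RUSH_FOUR(1_000_000),
        OPEN_THREE(100_000),
        SLEEPING_THREE(10_000),
        OPEN_TWO(1_000),
        SLEEPING_TWO(100),
        SINGLE(10),
        NONE(1);

        final int weight;

        Shape(int weight) {
            this.weight = weight;
        }
    }

    /** Positive for the AI, negative for the player. */
    static int signedWeight(Shape shape, boolean ownedByAi) {
        return ownedByAi ? shape.weight : -shape.weight;
    }

    public static void main(String[] args) {
        System.out.println(signedWeight(Shape.OPEN_FOUR, true));   // prints 10000000
        System.out.println(signedWeight(Shape.OPEN_THREE, false)); // prints -100000
    }
}
```

Each weight is ten times the next one down, so a single stronger shape outweighs several weaker ones.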
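(I) Evaluation function v1
During evaluation, compute the shapes formed along the four directions at every position where the AI has a stone, and use the resulting total as the leaf node's evaluation value.
Result: this approach performs poorly. Judging only the AI's stones is too one-sided, and it leads to positions where the AI rushes to attack and neglects defense.
(II) Evaluation function v2
During evaluation, add up the evaluation values of all stones on the board. The result is effectively the value of the shapes formed by the AI's stones minus the value of the shapes formed by the player's stones. The purpose is to balance attack and defense. The leaf values are then backed up through the tree to choose the next move from the initial position.
Result: the evaluation works fairly well and balances attack and defense.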
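The difference between v1 and v2 shows up even on two line strings. The sketch below is deliberately simplified and is not the article's evaluator: it scores a list of precomputed line strings rather than walking every stone, uses only three shapes, and the names `EvalDemo`, `sideScore`, `evalV1`, and `evalV2` are made up for illustration.

```java
import java.util.List;

/** Compares an AI-only score (v1) with an AI-minus-player score (v2). */
final class EvalDemo {

    // Reduced shape table for illustration: five in a row, open four, open three.
    static final String[] AI_SHAPES = {"11111", "011110", "001110"};
    static final String[] PLAYER_SHAPES = {"22222", "022220", "002220"};
    static final int[] WEIGHTS = {100_000_000, 10_000_000, 100_000};

    /** Adds the weight of every shape found in every line. */
    static int sideScore(List<String> lines, String[] shapes) {
        int total = 0;
        for (String line : lines) {
            for (int k = 0; k < shapes.length; k++) {
                if (line.contains(shapes[k])) {
                    total += WEIGHTS[k];
                }
            }
        }
        return total;
    }

    /** v1: only the AI's shapes count. */
    static int evalV1(List<String> lines) {
        return sideScore(lines, AI_SHAPES);
    }

    /** v2: the AI's shapes minus the player's shapes. */
    static int evalV2(List<String> lines) {
        return sideScore(lines, AI_SHAPES) - sideScore(lines, PLAYER_SHAPES);
    }

    public static void main(String[] args) {
        // The AI has an open three; the player has an open four.
        List<String> lines = List.of("0011100000", "0222200000");
        System.out.println(evalV1(lines)); // prints 100000
        System.out.println(evalV2(lines)); // prints -9900000
    }
}
```

v1 rates this position as good for the AI because it never looks at the player's open four, while v2 turns strongly negative, which is what pushes the search toward defending.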
IV. References
Minimax Search, the Negamax Algorithm, and Alpha-Beta Search
V. Source Code (Java Implementation)
import java.util.ArrayList;
import java.util.List;

// Note: the Node class is not part of this listing. The code relies on its
// fields p (side), x, y and on the constructors Node(int p, int x, int y)
// and Node(Node other).
public class Main {
    public final static int AI = 1;
    public final static int PLAYER = 2;
    public final static int BLANK = 0;

    // Shape types; each value also indexes WEIGHTS
    private final static int WIN = 0;
    private final static int HUO4 = 1;
    private final static int CHONG4 = 2;
    private final static int HUO3 = 3;
    private final static int MIAN3 = 4;
    private final static int HUO2 = 5;
    private final static int MIAN2 = 6;
    private final static int OL1 = 7;
    private final static int NONE = 8;

    // Weight of each shape type (added for the AI, subtracted for the player)
    private final static int[] WEIGHTS =
            {100000000, 10000000, 1000000, 100000, 10000, 1000, 100, 10, 1};

    private final static String[] WIN_AI = {"11111"};
    private final static String[] WIN_PLAYER = {"22222"};
    private final static String[] HUO4_AI = {"011110"};
    private final static String[] HUO4_PLAYER = {"022220"};
    private final static String[] CHONG4_AI = {"011112", "211110", "10111", "11011", "11101"};
    private final static String[] CHONG4_PLAYER = {"022221", "122220", "20222", "22022", "22202"};
    private final static String[] HUO3_AI = {"001110", "011100", "010110", "011010"};
    private final static String[] HUO3_PLAYER = {"002220", "022200", "020220", "022020"};
    private final static String[] MIAN3_AI = {"001112", "010112", "011012", "011102", "211100", "211010", "210110", "201110", "00111", "10011", "10101", "10110", "01011", "10011", "11001", "11010", "01101", "10101", "11001", "11100",};
    private final static String[] MIAN3_PLAYER = {"002221", "020221", "022021", "022201", "122200", "122020", "120220", "102220", "00222", "20022", "20202", "20220", "02022", "20022", "22002", "22020", "02202", "20202", "22002", "22200",};
    private final static String[] HUO2_AI = {"000110", "001010", "001100", "001100", "010100", "011000", "000110", "010010", "010100", "001010", "010010", "011000",};
    private final static String[] HUO2_PLAYER = {"000220", "002020", "002200", "002200", "020200", "022000", "000220", "020020", "020200", "002020", "020020", "022000",};
    private final static String[] MIAN2_AI = {"000112", "001012", "010012", "10001", "2010102", "2011002", "211000", "210100", "210010", "2001102"};
    private final static String[] MIAN2_PLAYER = {"000221", "002021", "020021", "20002", "1020201", "1022001", "122000", "120200", "120020", "1002201"};
    private final static String[] OL1_AI = {"1"};
    private final static String[] OL1_PLAYER = {"2"};
    private final static String[] NONE_ = {""};

    // Board width
    private final static int BOARD_SIZE = 15;
    // Board
    private static int[][] board = new int[BOARD_SIZE][BOARD_SIZE];
    // Score recorded for each candidate move during the search
    private static int[][] score = new int[BOARD_SIZE][BOARD_SIZE];
    private static final int INFINITY = 1000000000;

    public static void main(String[] args) {
        initBoard();
        Node root = new Node(AI, 0, 0);
        int s = alphaBeta(root, -INFINITY, INFINITY, 1);
        // Collect every point whose recorded score equals the best score
        List<Node> nodes = new ArrayList<>();
        for (int i = 0; i < BOARD_SIZE; i++) {
            for (int j = 0; j < BOARD_SIZE; j++) {
                if (score[i][j] == s) {
                    nodes.add(new Node(AI, i, j));
                }
            }
        }
        System.out.println("Move score: " + s);
        showScore();
        // Among the tied candidates, keep the one with the highest static score
        Node node = nodes.get(0);
        board[node.x][node.y] = node.p;
        int v = computeScore();
        board[node.x][node.y] = BLANK;
        int index = 0;
        for (int i = 0; i < nodes.size(); i++) {
            Node n = nodes.get(i);
            board[n.x][n.y] = n.p;
            int v1 = computeScore();
            if (v1 > v) {
                v = v1; // remember the best value seen so far, not only its index
                index = i;
            }
            board[n.x][n.y] = BLANK;
        }
        node = nodes.get(index);
        board[node.x][node.y] = 999; // mark the chosen move in the printout
        System.out.println("Next move: (" + node.x + "," + node.y + ")");
        show();
    }

    public static int alphaBeta(Node node, int alpha, int beta, int depth) {
        if (checkSituation(node.p, getString(node.x, node.y), WIN)) {
            return node.p == AI ? INFINITY : -INFINITY;
        }
        if (depth == 0) {
            return computeScore();
        }
        if (node.p == AI) {
            // MAX node: raise alpha
            search:
            for (int i = 0; i < BOARD_SIZE; i++) {
                for (int j = 0; j < BOARD_SIZE; j++) {
                    if (isValid(i, j)) {
                        board[i][j] = node.p;
                        Node child = new Node(node);
                        child.x = i;
                        child.y = j;
                        int val = alphaBeta(child, alpha, beta, depth - 1);
                        board[i][j] = BLANK;
                        if (val > alpha) {
                            alpha = val;
                            score[i][j] = alpha;
                        }
                        if (alpha >= beta) {
                            break search; // leave both loops, not only the inner one
                        }
                    }
                }
            }
            return alpha;
        } else {
            // MIN node: lower beta
            search:
            for (int i = 0; i < BOARD_SIZE; i++) {
                for (int j = 0; j < BOARD_SIZE; j++) {
                    if (isValid(i, j)) {
                        board[i][j] = node.p;
                        Node child = new Node(node);
                        child.x = i;
                        child.y = j;
                        int val = alphaBeta(child, alpha, beta, depth - 1);
                        board[i][j] = BLANK;
                        if (val < beta) {
                            beta = val;
                            score[i][j] = beta;
                        }
                        if (alpha >= beta) {
                            break search; // leave both loops, not only the inner one
                        }
                    }
                }
            }
            return beta;
        }
    }

    // Evaluation function v2: AI shapes minus player shapes, summed over all stones
    public static int computeScore() {
        int total = 0;
        for (int i = 0; i < BOARD_SIZE; i++) {
            for (int j = 0; j < BOARD_SIZE; j++) {
                if (board[i][j] != BLANK) {
                    List<String> list = getString(i, j);
                    for (int type = WIN; type <= NONE; type++) {
                        if (checkSituation(AI, list, type)) {
                            total += WEIGHTS[type];
                        }
                        if (checkSituation(PLAYER, list, type)) {
                            total -= WEIGHTS[type];
                        }
                    }
                }
            }
        }
        return total;
    }

    // Returns whether any of the line strings contains a shape of the given type for side p
    public static boolean checkSituation(int p, List<String> list, int type) {
        if (p != AI && p != PLAYER) {
            return false;
        }
        boolean ai = (p == AI);
        switch (type) {
            case WIN:
                return checkString(list, ai ? WIN_AI : WIN_PLAYER);
            case HUO4:
                return checkString(list, ai ? HUO4_AI : HUO4_PLAYER);
            case CHONG4:
                return checkString(list, ai ? CHONG4_AI : CHONG4_PLAYER);
            case HUO3:
                return checkString(list, ai ? HUO3_AI : HUO3_PLAYER);
            case MIAN3:
                return checkString(list, ai ? MIAN3_AI : MIAN3_PLAYER);
            case HUO2:
                return checkString(list, ai ? HUO2_AI : HUO2_PLAYER);
            case MIAN2:
                return checkString(list, ai ? MIAN2_AI : MIAN2_PLAYER);
            case OL1:
                return checkString(list, ai ? OL1_AI : OL1_PLAYER);
            case NONE:
                return checkString(list, NONE_);
            default:
                return false; // unknown shape type
        }
    }

    public static boolean checkString(List<String> list, String[] situation) {
        for (String str : list) {
            for (int i = 0; i < situation.length; i++) {
                if (str.contains(situation[i])) {
                    return true;
                }
            }
        }
        return false;
    }

    // Builds the four line strings (column, row, anti-diagonal, main diagonal) through (x, y)
    public static List<String> getString(int x, int y) {
        List<String> strings = new ArrayList<>();
        StringBuilder sb = new StringBuilder();
        // Column y
        for (int i = 0, j = y; i < BOARD_SIZE; i++) {
            sb.append(board[i][j]);
        }
        strings.add(sb.toString());
        sb.setLength(0);
        // Row x
        for (int i = x, j = 0; j < BOARD_SIZE; j++) {
            sb.append(board[i][j]);
        }
        strings.add(sb.toString());
        sb.setLength(0);
        // Anti-diagonal (i + j == x + y)
        if (x + y < BOARD_SIZE) {
            for (int i = 0, j = x + y; i < BOARD_SIZE && j >= 0; i++, j--) {
                sb.append(board[i][j]);
            }
        } else {
            // Start on the right edge; the original hard-coded 7 here, which selects a different diagonal
            for (int i = x + y - (BOARD_SIZE - 1), j = BOARD_SIZE - 1; i < BOARD_SIZE && j >= 0; i++, j--) {
                sb.append(board[i][j]);
            }
        }
        strings.add(sb.toString());
        sb.setLength(0);
        // Main diagonal (j - i == y - x)
        if (x <= y) {
            for (int i = 0, j = y - x; i < BOARD_SIZE && j < BOARD_SIZE; i++, j++) {
                sb.append(board[i][j]);
            }
        } else {
            for (int i = x - y, j = 0; i < BOARD_SIZE && j < BOARD_SIZE; i++, j++) {
                sb.append(board[i][j]);
            }
        }
        strings.add(sb.toString());
        return strings;
    }

    public static void initBoard() {
        board[7][7] = PLAYER;
        board[7][6] = AI;
        board[8][6] = PLAYER;
        board[8][5] = AI;
        board[8][7] = PLAYER;
        board[9][5] = AI;
        board[9][6] = PLAYER;
        board[7][5] = AI;
        board[10][5] = PLAYER;
        board[7][8] = AI;
        board[8][8] = PLAYER;
        board[8][9] = AI;
        board[6][7] = PLAYER;
        board[9][7] = AI;
        board[6][6] = PLAYER;
        board[9][9] = AI;
        board[10][6] = PLAYER;
        board[11][6] = AI;
        board[10][7] = PLAYER;
        board[10][4] = AI;
        board[6][8] = PLAYER;
    }

    // A point is a candidate move only if it is empty and at least one of its
    // neighbors (up to eight, fewer on edges and corners) holds a stone
    public static boolean isValid(int x, int y) {
        if (board[x][y] != BLANK) {
            return false;
        }
        for (int dx = -1; dx <= 1; dx++) {
            for (int dy = -1; dy <= 1; dy++) {
                int nx = x + dx;
                int ny = y + dy;
                if ((dx != 0 || dy != 0)
                        && nx >= 0 && nx < BOARD_SIZE
                        && ny >= 0 && ny < BOARD_SIZE
                        && board[nx][ny] != BLANK) {
                    return true;
                }
            }
        }
        return false;
    }

    public static void show() {
        for (int i = 0; i < BOARD_SIZE; i++) {
            for (int j = 0; j < BOARD_SIZE; j++) {
                System.out.printf("%8d", board[i][j]);
            }
            System.out.println();
        }
    }

    // Prints the score table, showing existing stones in place of scores
    public static void showScore() {
        for (int m = 0; m < BOARD_SIZE; m++) {
            for (int n = 0; n < BOARD_SIZE; n++) {
                if (board[m][n] != 0) {
                    score[m][n] = board[m][n];
                }
            }
        }
        for (int i = 0; i < BOARD_SIZE; i++) {
            for (int j = 0; j < BOARD_SIZE; j++) {
                System.out.printf("%8d", score[i][j]);
            }
            System.out.println();
        }
    }
}
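The listing uses a `Node` class that is not shown. A minimal sketch that would let it compile is given below. This is an assumption reconstructed from how `Node` is used (`p`, `x`, `y`, a three-argument constructor, and a copy constructor), not the author's definition.

```java
/**
 * Hypothetical stand-in for the Node class used by Main.
 * p is the side (1 = AI, 2 = player); x and y are board coordinates.
 */
class Node {
    int p;
    int x;
    int y;

    Node(int p, int x, int y) {
        this.p = p;
        this.x = x;
        this.y = y;
    }

    /** Copy constructor: assumed here to copy every field unchanged. */
    Node(Node other) {
        this(other.p, other.x, other.y);
    }

    public static void main(String[] args) {
        Node root = new Node(1, 0, 0);
        Node child = new Node(root);
        child.x = 7;
        child.y = 8;
        // The copy is independent of the original.
        System.out.println(root.x + "," + root.y + " -> " + child.x + "," + child.y);
    }
}
```

With a plain copy like this, a child keeps its parent's `p`. That is enough for the depth of 1 used in `main`, but a deeper search would need the side to move to alternate between levels, for example by flipping `p` when the child is created.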