{"nbformat":4,"nbformat_minor":0,"metadata":{"kernelspec":{"display_name":"Python 3","language":"python","name":"python3"},"language_info":{"codemirror_mode":{"name":"ipython","version":3},"file_extension":".py","mimetype":"text/x-python","name":"python","nbconvert_exporter":"python","pygments_lexer":"ipython3","version":"3.6.1"},"colab":{"name":"PB016_lab5_G01.ipynb","provenance":[],"collapsed_sections":["Zk7LjHUh1xFg","R9qEXD5Y1xFp"]}},"cells":[{"cell_type":"markdown","metadata":{"id":"1cKYLTEE1xFO"},"source":["# PB016: Artificial intelligence I, lab 5 - Games and game strategies\n","\n","This week's topics are games, game strategies and basic AI algorithms for playing games optimally. We'll focus mainly on:\n","\n","1. __Minimax algorithm__\n","2. __Alpha-beta pruning__"]}
,{"cell_type":"markdown","metadata":{"id":"vL-HOkJH1xFQ"},"source":["---\n","\n","## 1. [Minimax](https://en.wikipedia.org/wiki/Minimax) algorithm\n","\n","__Basic facts__\n","- A concept originally based on [game theory](https://en.wikipedia.org/wiki/Game_theory).\n","- Designed for games of two or more alternating players, each with a set of possible strategies for each individual move in the game.\n","- The goal states of the game are evaluated by an evaluation function, which assigns the corresponding payoff value to each player.\n","- The minimax algorithm recursively minimizes a player's possible loss in the worst-case scenario (i.e. assuming that the opponent chooses the optimal strategy in each turn and thus tries to maximize that loss).\n","- In simple games, a complete evaluation of the game is possible, but in more complex games a combinatorial explosion occurs quickly, so the algorithm typically searches only a few levels of the tree of possible moves at a time.\n","\n","__Example__ - a sample of a general minimax tree:\n","\n","![minimax tree](https://www.fi.muni.cz/~novacek/courses/pb016/labs/img/minimax.png)"]}
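,{"cell_type":"markdown","metadata":{},"source":["__Optional aside__ - before applying minimax to Tic Tac Toe below, here is a minimal, self-contained sketch of the minimax recursion on a tiny hand-coded game tree. The `minimax_value` helper and the tree with its leaf values are made up purely for illustration (they don't correspond to the figure above or to the lab's code below):"]}
,{"cell_type":"code","metadata":{},"source":["# a minimal illustration of the minimax recursion on a small hand-coded game tree\n","# (the tree and its leaf values are arbitrary, chosen just for this illustration)\n","\n","def minimax_value(node, maximizing):\n","    # leaves are plain numbers - their value is the evaluation itself\n","    if isinstance(node, (int, float)):\n","        return node\n","    # inner nodes are lists of children - the maximizing player picks the child\n","    # with the highest value, the minimizing player the one with the lowest value\n","    values = [minimax_value(child, not maximizing) for child in node]\n","    return max(values) if maximizing else min(values)\n","\n","# a tiny two-level game tree: the root is a MAX node, its children are MIN nodes\n","toy_tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]\n","print(minimax_value(toy_tree, True))  # the optimal value of the root: 3"],"execution_count":null,"outputs":[]}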
,{"cell_type":"markdown","metadata":{"id":"NqbAtjo-1xFR"},"source":["### Game for these labs\n","- [Tic Tac Toe](https://en.wikipedia.org/wiki/Tic-tac-toe) on a 3x3 playing board.\n","\n","An __example__ of an unlabeled game tree for Tic Tac Toe:\n","\n","![tictactoe tree](https://www.fi.muni.cz/~novacek/courses/pb016/labs/img/tictactoe.png)"]}
,{"cell_type":"markdown","metadata":{"id":"3nQ5z58-1xFS"},"source":["### Basic game code\n","- a class representing the playing board with functions for:\n","  - initialization of the playing board,\n","  - verification that the current state of the playing board is a terminal (end-of-game) one,\n","  - verification of valid moves,\n","  - drawing the playing board."]}
,{"cell_type":"code","metadata":{"scrolled":true,"id":"5joYwEmv1xFT"},"source":["# to measure the time required to evaluate the game using minimax and alpha-beta pruning\n","import time\n","\n","# class definition for game representation\n","class Game:\n","    def __init__(self):\n","        self.initialize_game()\n","\n","    # function usable both for initialization and for resetting the game\n","    def initialize_game(self):\n","        self.current_state = [['.','.','.'],\n","                              ['.','.','.'],\n","                              ['.','.','.']]\n","        self.result = None\n","        # player X always plays first\n","        self.player_turn = 'X'\n","\n","    # end-of-game test - returns the winner ('X' or 'O'), '.' in case of a draw,\n","    # or None if the game is not over yet\n","    def is_end(self):\n","        # vertical win\n","        for i in range(0, 3):\n","            if (self.current_state[0][i] != '.' and\n","                self.current_state[0][i] == self.current_state[1][i] and\n","                self.current_state[1][i] == self.current_state[2][i]):\n","                return self.current_state[0][i]\n","\n","        # horizontal win\n","        for i in range(0, 3):\n","            if (self.current_state[i] == ['X', 'X', 'X']):\n","                return 'X'\n","            elif (self.current_state[i] == ['O', 'O', 'O']):\n","                return 'O'\n","\n","        # main diagonal win\n","        if (self.current_state[0][0] != '.' and\n","            self.current_state[0][0] == self.current_state[1][1] and\n","            self.current_state[0][0] == self.current_state[2][2]):\n","            return self.current_state[0][0]\n","\n","        # secondary diagonal win\n","        if (self.current_state[0][2] != '.' and\n","            self.current_state[0][2] == self.current_state[1][1] and\n","            self.current_state[0][2] == self.current_state[2][0]):\n","            return self.current_state[0][2]\n","\n","        # testing whether the board is full\n","        for i in range(0, 3):\n","            for j in range(0, 3):\n","                # not full, no end yet\n","                if (self.current_state[i][j] == '.'):\n","                    return None\n","\n","        # draw\n","        return '.'\n","\n","    # testing move's validity\n","    def is_valid(self, px, py):\n","        if px < 0 or px > 2 or py < 0 or py > 2:\n","            return False\n","        elif self.current_state[px][py] != '.':\n","            return False\n","        else:\n","            return True\n","\n","    # drawing the board\n","    def draw_board(self):\n","        for i in range(0, 3):\n","            print(' | '.join(self.current_state[i]))\n","        print()"],"execution_count":null,"outputs":[]}
,{"cell_type":"markdown","metadata":{"id":"qDEXImrd1xFY"},"source":["### Game play functions\n","- A cycle of alternating moves of the `'X'` and `'O'` players until one of them wins, or until they draw."]}
,{"cell_type":"code","metadata":{"id":"teR0fhJg1xFZ"},"source":["def play(game):\n","    while True:\n","        game.draw_board()\n","        game.result = game.is_end()\n","\n","        # printing the relevant message at the end of the game\n","        if game.result != None:\n","            if game.result == 'X':\n","                print('The winner is X!')\n","            elif game.result == 'O':\n","                print('The winner is O!')\n","            elif game.result == '.':\n","                print('Draw!')\n","\n","            game.initialize_game()\n","            return\n","\n","        # human player's move\n","        if game.player_turn == 'X':\n","\n","            while True:\n","\n","                # calculate the optimal recommended move in a minimization step\n","                print('Calculating the recommended optimal move...')\n","                start = time.time()\n","                (m, qx, qy) = mini(game)\n","                end = time.time()\n","                print('Calculation took %.4f ms' % (1000*(end - start),))\n","                print('Recommended move: X = %d, Y = %d' % (qx, qy))\n","\n","                correct_format, px, py = False, -1, -1\n","                while not correct_format:\n","                    try:\n","                        px = int(input('Enter the X coordinate: '))\n","                        py = int(input('Enter the Y coordinate: '))\n","                        correct_format = True\n","                    except ValueError:\n","                        print('Invalid format, please repeat.')\n","                        correct_format = False\n","\n","                if game.is_valid(px, py):\n","                    game.current_state[px][py] = 'X'\n","                    game.player_turn = 'O'\n","                    break\n","                else:\n","                    print('Invalid move, please repeat.')\n","\n","        # AI move\n","        else:\n","            (m, px, py) = maxi(game)\n","            game.current_state[px][py] = 'O'\n","            game.player_turn = 'X'"],"execution_count":null,"outputs":[]}
,{"cell_type":"markdown","metadata":{"id":"WhmPih6S1xFc"},"source":["### __Exercise 1.1: AI move selection__\n","- Implement a function to select the optimal move for the AI (i.e. maximize the goal value of the game from the current position)."]}
,{"cell_type":"code","metadata":{"id":"4skua5Zf1xFd"},"source":["# player 'O' maximizes (in this case it's the AI)\n","def maxi(game):\n","\n","    # possible values for maximum are:\n","    # -1 - defeat\n","    # 0 - draw\n","    # 1 - win\n","\n","    # initial maximum set to -2 (worse than worst case)\n","    maxv = -2\n","\n","    px = None\n","    py = None\n","\n","    # testing whether the game's over\n","    result = game.is_end()\n","\n","    # if the game ends, the function must return the evaluation of the given state (-1 for a loss, 0 for a draw, 1 for a win)\n","    if result == 'X':\n","        return (-1, None, None)\n","    elif result == 'O':\n","        return (1, None, None)\n","    elif result == '.':\n","        return (0, None, None)\n","\n","    # selection of the move coordinates for 'O' by testing the optimality of possible moves (i.e. taking the mini function result into account)\n","\n","    # TODO - COMPLETE YOURSELVES\n","\n","    # we return the value and coordinates of the optimal move\n","    return (maxv, px, py)"],"execution_count":null,"outputs":[]}
,{"cell_type":"markdown","metadata":{"id":"Zk7LjHUh1xFg"},"source":["#### __Possible implementation of the AI move selection__"]}
,{"cell_type":"code","metadata":{"id":"zo7_aKxJ1xFh"},"source":["# player 'O' maximizes (in this case it's the AI)\n","def maxi(game):\n","\n","    # possible values for maximum are:\n","    # -1 - defeat\n","    # 0 - draw\n","    # 1 - win\n","\n","    # initial maximum set to -2 (worse than worst case)\n","    maxv = -2\n","\n","    px = None\n","    py = None\n","\n","    # testing whether the game's over\n","    result = game.is_end()\n","\n","    # if the game ends, the function must return the evaluation of the given state (-1 for a loss, 0 for a draw, 1 for a win)\n","    if result == 'X':\n","        return (-1, None, None)\n","    elif result == 'O':\n","        return (1, None, None)\n","    elif result == '.':\n","        return (0, None, None)\n","\n","    # selection of the move coordinates for 'O' by testing the optimality of possible moves (i.e. taking the mini function result into account)\n","    for i in range(0, 3):\n","        for j in range(0, 3):\n","            # we try a move if the field is empty\n","            if game.current_state[i][j] == '.':\n","                # we try to place 'O' and test whether the given move is better than the current optimal one\n","                game.current_state[i][j] = 'O'\n","                (m, min_i, min_j) = mini(game)\n","                # possibly updating the optimal move and the current maximum\n","                if m > maxv:\n","                    maxv = m\n","                    px = i\n","                    py = j\n","                # resetting the field to an empty one\n","                game.current_state[i][j] = '.'\n","\n","    # we return the value and coordinates of the optimal move\n","    return (maxv, px, py)"],"execution_count":null,"outputs":[]}
,{"cell_type":"markdown","metadata":{"id":"2UKjevOT1xFk"},"source":["### __Exercise 1.2: Simulating the human move selection__\n","- Implement a function for selecting the optimal move for a human (i.e. minimizing the goal value of the game from the current position)."]}
,{"cell_type":"code","metadata":{"id":"qaaxf9Zq1xFl"},"source":["# player 'X' minimizes (in this case it's the human)\n","def mini(game):\n","\n","    # possible values for minimum are:\n","    # -1 - win\n","    # 0 - draw\n","    # 1 - defeat\n","\n","    # initial minimum set to 2 (worse than worst case)\n","    minv = 2\n","\n","    qx = None\n","    qy = None\n","\n","    # testing whether the game's over\n","    result = game.is_end()\n","\n","    # if the game ends, the function must return the evaluation of the given state (-1 for a win, 0 for a draw, 1 for a loss)\n","    if result == 'X':\n","        return (-1, None, None)\n","    elif result == 'O':\n","        return (1, None, None)\n","    elif result == '.':\n","        return (0, None, None)\n","\n","    # selection of the move coordinates for 'X' by testing the optimality of possible moves (i.e. taking the maxi function result into account)\n","\n","    # TODO - COMPLETE YOURSELVES\n","\n","    # we return the value and coordinates of the optimal move\n","    return (minv, qx, qy)"],"execution_count":null,"outputs":[]}
,{"cell_type":"markdown","metadata":{"id":"R9qEXD5Y1xFp"},"source":["#### __Possible implementation of choosing the human's move__"]}
,{"cell_type":"code","metadata":{"id":"dOjTKvpw1xFp"},"source":["# player 'X' minimizes (in this case it's the human)\n","def mini(game):\n","\n","    # possible values for minimum are:\n","    # -1 - win\n","    # 0 - draw\n","    # 1 - defeat\n","\n","    # initial minimum set to 2 (worse than worst case)\n","    minv = 2\n","\n","    qx = None\n","    qy = None\n","\n","    # testing whether the game's over\n","    result = game.is_end()\n","\n","    # if the game ends, the function must return the evaluation of the given state (-1 for a win, 0 for a draw, 1 for a loss)\n","    if result == 'X':\n","        return (-1, None, None)\n","    elif result == 'O':\n","        return (1, None, None)\n","    elif result == '.':\n","        return (0, None, None)\n","\n","    # selection of the move coordinates for 'X' by testing the optimality of possible moves (i.e. taking the maxi function result into account)\n","    for i in range(0, 3):\n","        for j in range(0, 3):\n","            # we try a move if the field is empty\n","            if game.current_state[i][j] == '.':\n","                # we try to place 'X' and test whether the given move is better than the current optimal one\n","                game.current_state[i][j] = 'X'\n","                (m, max_i, max_j) = maxi(game)\n","                # possibly updating the optimal move and the current minimum\n","                if m < minv:\n","                    minv = m\n","                    qx = i\n","                    qy = j\n","                game.current_state[i][j] = '.'\n","\n","    # we return the value and coordinates of the optimal move\n","    return (minv, qx, qy)"],"execution_count":null,"outputs":[]}
,{"cell_type":"markdown","metadata":{"id":"qDLcXBvQ1xFs"},"source":["### __Let's play!__"]}
,{"cell_type":"code","metadata":{"scrolled":true,"id":"LtcvuAkJ1xFt","executionInfo":{"status":"error","timestamp":1604312996863,"user_tz":-60,"elapsed":999285,"user":{"displayName":"Vít Nováček","photoUrl":"","userId":"05686630748873346736"}},"outputId":"389d4ca0-a7cf-4837-8d93-f631e1225ade","colab":{"base_uri":"https://localhost:8080/","height":1000}},"source":["g = Game()\n","play(g)"],"execution_count":null,"outputs":[{"output_type":"stream","text":[". | . | .\n",". | . | .\n",". | . | .\n","\n","Calculating the recommended optimal move...\n","Calculation took 2270.2253 ms\n","Recommended move: X = 0, Y = 0\n","Enter the X coordinate: 0\n","Enter the Y coordinate: 0\n","X | . | .\n",". | . | .\n",". | . | .\n","\n","X | . | .\n",". | O | .\n",". | . | .\n","\n","Calculating the recommended optimal move...\n","Calculation took 30.3857 ms\n","Recommended move: X = 0, Y = 1\n","Enter the X coordinate: 0\n","Enter the Y coordinate: 1\n","X | X | .\n",". | O | .\n",". | . | .\n","\n","X | X | O\n",". | O | .\n",". | . | .\n","\n","Calculating the recommended optimal move...\n","Calculation took 0.8857 ms\n","Recommended move: X = 2, Y = 0\n"],"name":"stdout"}]}
,{"cell_type":"markdown","metadata":{"id":"f6j254Jh1xFx"},"source":["---\n","\n","## 2. [Alpha-beta pruning](https://en.wikipedia.org/wiki/Alpha%E2%80%93beta_pruning)\n","\n","__Basic facts__\n","- An optimization of the minimax algorithm that identifies nodes (i.e. moves) in the game tree that don't need to be searched further.\n","- The alpha-beta pruning algorithm maintains two values, $\\alpha$ and $\\beta$, which represent:\n","  - the minimum score that the maximizing player is already guaranteed ($\\alpha$),\n","  - the maximum score that the minimizing player is already guaranteed ($\\beta$).\n","- At the beginning of the game, $\\alpha = -\\infty$ and $\\beta = \\infty$ (i.e. both players start with their worst possible score).\n","- Whenever the minimizing player's guaranteed maximum becomes less than or equal to the maximizing player's guaranteed minimum (i.e. $\\beta \\leq \\alpha$), the remaining successors of the current node don't need to be explored, because they cannot influence the final minimax decision and would never be reached in optimal play.\n","\n","__Example__ - a pruned minimax tree from the previous example:\n","\n","![alpha-beta tree](https://www.fi.muni.cz/~novacek/courses/pb016/labs/img/alphabeta.png)"]}
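,{"cell_type":"markdown","metadata":{},"source":["__Optional aside__ - a minimal, self-contained comparison of plain minimax and its alpha-beta variant on a small hand-coded tree, counting how many leaves each of them evaluates. The `minimax`/`alphabeta` helpers below and the tree values are made up purely for illustration and are independent of the Tic Tac Toe code in this notebook:"]}
,{"cell_type":"code","metadata":{},"source":["# plain minimax vs. alpha-beta pruning on a small hand-coded tree\n","# (arbitrary illustrative values); we count the evaluated leaves\n","\n","def minimax(node, maximizing, counter):\n","    if isinstance(node, (int, float)):\n","        counter[0] += 1\n","        return node\n","    values = [minimax(child, not maximizing, counter) for child in node]\n","    return max(values) if maximizing else min(values)\n","\n","def alphabeta(node, maximizing, alpha, beta, counter):\n","    if isinstance(node, (int, float)):\n","        counter[0] += 1\n","        return node\n","    if maximizing:\n","        value = float('-inf')\n","        for child in node:\n","            value = max(value, alphabeta(child, False, alpha, beta, counter))\n","            alpha = max(alpha, value)\n","            if beta <= alpha:\n","                break  # cut-off: the minimizer will never let the game get here\n","        return value\n","    else:\n","        value = float('inf')\n","        for child in node:\n","            value = min(value, alphabeta(child, True, alpha, beta, counter))\n","            beta = min(beta, value)\n","            if beta <= alpha:\n","                break  # cut-off: the maximizer will never let the game get here\n","        return value\n","\n","toy_tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]\n","\n","c1, c2 = [0], [0]\n","print('minimax value:', minimax(toy_tree, True, c1), '- leaves evaluated:', c1[0])\n","print('alpha-beta value:', alphabeta(toy_tree, True, float('-inf'), float('inf'), c2), '- leaves evaluated:', c2[0])"],"execution_count":null,"outputs":[]}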
,{"cell_type":"markdown","metadata":{"id":"8m6HPk1R1xFx"},"source":["### Modified game play function\n","- A cycle of alternating moves of the `'X'` and `'O'` players until one of them wins, or until a draw occurs.\n","- The same procedure as before, only using the versions of the `mini` and `maxi` functions extended with alpha-beta pruning.\n","- Instead of the initial infinite values of $\\alpha$ and $\\beta$, we use $-2$ and $2$ (for Tic Tac Toe as defined here, where all game values lie in $\\{-1, 0, 1\\}$, this is effectively the same)."]}
,{"cell_type":"code","metadata":{"id":"vAvdUS_j1xFy"},"source":["def play_alpha_beta(game):\n","    while True:\n","        game.draw_board()\n","        game.result = game.is_end()\n","\n","        if game.result != None:\n","            if game.result == 'X':\n","                print('The winner is X!')\n","            elif game.result == 'O':\n","                print('The winner is O!')\n","            elif game.result == '.':\n","                print('Draw!')\n","\n","            game.initialize_game()\n","            return\n","\n","        if game.player_turn == 'X':\n","\n","            while True:\n","                print('Calculating the recommended optimal move...')\n","                start = time.time()\n","                # updated mini function with pruning\n","                (m, qx, qy) = mini_alpha_beta(game, -2, 2)\n","                end = time.time()\n","                print('Calculation took %.4f ms' % (1000*(end - start),))\n","                print('Recommended move: X = %d, Y = %d' % (qx, qy))\n","\n","                correct_format, px, py = False, -1, -1\n","                while not correct_format:\n","                    try:\n","                        px = int(input('Enter the X coordinate: '))\n","                        py = int(input('Enter the Y coordinate: '))\n","                        correct_format = True\n","                    except ValueError:\n","                        print('Invalid format, please repeat.')\n","                        correct_format = False\n","\n","                if game.is_valid(px, py):\n","                    game.current_state[px][py] = 'X'\n","                    game.player_turn = 'O'\n","                    break\n","                else:\n","                    print('Invalid move, please repeat.')\n","\n","        else:\n","            # updated maxi function with pruning\n","            (m, px, py) = maxi_alpha_beta(game, -2, 2)\n","            game.current_state[px][py] = 'O'\n","            game.player_turn = 'X'"],"execution_count":null,"outputs":[]}
,{"cell_type":"markdown","metadata":{"id":"EYRwU5KV1xF1"},"source":["### __Exercise 2.1: Pruned move selection for the AI__\n","- Implement a function to select the optimal move for the AI (i.e. maximize the goal value of the game from the current position).\n","- This time, however, prune the tree of possible moves to explore."]}
,{"cell_type":"code","metadata":{"id":"uluGdHyv1xF1"},"source":["def maxi_alpha_beta(game, alpha, beta):\n","    maxv = -2\n","    px = None\n","    py = None\n","\n","    result = game.is_end()\n","\n","    if result == 'X':\n","        return (-1, None, None)\n","    elif result == 'O':\n","        return (1, None, None)\n","    elif result == '.':\n","        return (0, None, None)\n","\n","    # TODO - COMPLETE YOURSELF (select optimal move with alpha update and ignore branches with too large maxv)\n","\n","    return (maxv, px, py)"],"execution_count":null,"outputs":[]}
,{"cell_type":"markdown","metadata":{"id":"AebZBtRV1xF4"},"source":["#### __Possible implementation of the AI move selection__"]}
,{"cell_type":"code","metadata":{"id":"UmSJXlfj1xF5"},"source":["def maxi_alpha_beta(game, alpha, beta):\n","    maxv = -2\n","    px = None\n","    py = None\n","\n","    result = game.is_end()\n","\n","    if result == 'X':\n","        return (-1, None, None)\n","    elif result == 'O':\n","        return (1, None, None)\n","    elif result == '.':\n","        return (0, None, None)\n","\n","    for i in range(0, 3):\n","        for j in range(0, 3):\n","            if game.current_state[i][j] == '.':\n","                game.current_state[i][j] = 'O'\n","                (m, min_i, min_j) = mini_alpha_beta(game, alpha, beta)\n","                if m > maxv:\n","                    maxv = m\n","                    px = i\n","                    py = j\n","                game.current_state[i][j] = '.'\n","\n","                # the next two conditions ensure pruning the move tree (the only difference compared to the previous version)\n","                if alpha >= beta:\n","                    return (maxv, px, py)\n","\n","                if maxv > alpha:\n","                    alpha = maxv\n","\n","    return (maxv, px, py)"],"execution_count":null,"outputs":[]}
,{"cell_type":"markdown","metadata":{"id":"N6-ns3ri1xF8"},"source":["### __Exercise 2.2: Simulating the human move selection with pruning__\n","- Implement a function for selecting the optimal move for the human (i.e. minimizing the goal value of the game from the current position).\n","- This time, however, prune the tree of possible moves to explore."]}
,{"cell_type":"code","metadata":{"id":"nrVT4mEF1xF9"},"source":["def mini_alpha_beta(game, alpha, beta):\n","\n","    minv = 2\n","\n","    qx = None\n","    qy = None\n","\n","    result = game.is_end()\n","\n","    if result == 'X':\n","        return (-1, None, None)\n","    elif result == 'O':\n","        return (1, None, None)\n","    elif result == '.':\n","        return (0, None, None)\n","\n","    # TODO - COMPLETE YOURSELF (select optimal move with beta update and ignore branches with too small minv)\n","\n","    return (minv, qx, qy)"],"execution_count":null,"outputs":[]}
,{"cell_type":"markdown","metadata":{"id":"uGx3GMLR1xGA"},"source":["#### __Possible implementation of choosing the human's move__"]}
,{"cell_type":"code","metadata":{"id":"w75mZALh1xGA"},"source":["def mini_alpha_beta(game, alpha, beta):\n","\n","    minv = 2\n","\n","    qx = None\n","    qy = None\n","\n","    result = game.is_end()\n","\n","    if result == 'X':\n","        return (-1, None, None)\n","    elif result == 'O':\n","        return (1, None, None)\n","    elif result == '.':\n","        return (0, None, None)\n","\n","    for i in range(0, 3):\n","        for j in range(0, 3):\n","            if game.current_state[i][j] == '.':\n","                game.current_state[i][j] = 'X'\n","                (m, max_i, max_j) = maxi_alpha_beta(game, alpha, beta)\n","                if m < minv:\n","                    minv = m\n","                    qx = i\n","                    qy = j\n","                game.current_state[i][j] = '.'\n","\n","                # the next two conditions ensure pruning the move tree (the only difference compared to the previous version)\n","                if beta <= alpha:\n","                    return (minv, qx, qy)\n","\n","                if minv < beta:\n","                    beta = minv\n","\n","    return (minv, qx, qy)"],"execution_count":null,"outputs":[]}
,{"cell_type":"markdown","metadata":{"id":"BPSfloFq1xGE"},"source":["### __Let's play!__"]}
,{"cell_type":"code","metadata":{"scrolled":true,"id":"-Cb-6H8-1xGF"},"source":["g = Game()\n","play_alpha_beta(g)"],"execution_count":null,"outputs":[]}
,{"cell_type":"markdown","metadata":{"id":"6MmpG84S1xGI"},"source":["### __Food for thought__\n","- How to modify the exercise code for larger playing areas, or a more difficult winning condition (e.g. a requirement to reach 4 or more fields in a row)?\n","- How will these changes affect the exploration efficiency?"]}
,{"cell_type":"markdown","metadata":{"collapsed":true,"id":"d1nO-4gZ1xGI"},"source":["---\n","\n","#### _Final note_ - the materials used in this notebook are adapted from original works as follows:\n","- Examples of game trees and code:\n","  - Based on materials from the [Stack Abuse](https://stackabuse.com/) site\n","  - Author: [Mina Krivokuća](mailto:mina.krivokuca@gmail.com)\n","  - License: N/A (adapted for internal use in PB016 at FI MU with the kind permission of the author and David Landup, the StackAbuse operator)"]}
]}