diff --git a/README.md b/README.md
index 8f1c022..dd59bdb 100644
--- a/README.md
+++ b/README.md
@@ -8,12 +8,17 @@ Tic-Tac-Toe, the classic board game, involves two players taking turns to place
- [Play Game](https://tictactoe.romantech.net)
- [Implementation Details Korean Ver.](https://colorfilter.notion.site/TIL-Tic-Tac-Toe-47f5b86f257e484983c08e2fab68d286?pvs=4)
+- [Game Theory Korean Ver.](https://colorfilter.notion.site/TIL-d22a8ba84b2443b8a3d36729a9ecaa19?pvs=4)
## TOC
- [Stack](#stack)
- [Features](#features)
-- [Implementation Details](#implementation-details)
+- [Game Theory](#game-theory)
+ - [Minimax Algorithm](#minimax-algorithm)
+ - [Quick Win, Slow Loss](#quick-win-slow-loss)
+ - [Alpha-Beta Pruning](#alpha-beta-pruning)
+ - [Memoization](#memoization)
- [Screenshots](#screenshots)
## Stack
@@ -32,184 +37,117 @@ Tic-Tac-Toe, the classic board game, involves two players taking turns to place
- Tracking and Review of Gameplay History
- Ability to Undo the Last Move
-## Implementation Details
+## Game Theory
-- [Win Condition Evaluation](#win-condition-evaluation)
- - [Basic](#basic)
- - [Advanced](#advanced)
-- [Finding the Best Move](#finding-the-best-move)
- - [Searching for Winning Positions](#searching-for-winning-positions)
- - [Searching for Defensive Positions](#searching-for-defensive-positions)
+### Minimax Algorithm
-### Win Condition Evaluation
+> [!NOTE]
+> A zero-sum game is one in which any gain for one player is matched by an equal loss for the other.
-#### Basic
+Minimax is a standard algorithm for two-player zero-sum games such as tic-tac-toe or chess. It assumes both players always make their best move and evaluates every possible continuation to find the optimal one. Scores are expressed from X's point of view: X (the maximizer) picks the move with the highest score, while O (the minimizer) picks the move with the lowest.
-![Untitled](https://github.com/romantech/tic-tac-toe/assets/8604840/1e63145d-8f38-4d82-9c27-044b391583c7)
+
-The basic method for determining win conditions involves storing potential winning combinations as indices in a two-dimensional array and matching these against the current board state. With a 3x3 grid, there are eight possible win scenarios—three rows, three columns, and two diagonals. This technique, as outlined in the [React official documentation's Tic-Tac-Toe tutorial](https://react.dev/learn/tutorial-tic-tac-toe#declaring-a-winner), is straightforward but limited to static board sizes and win conditions.
+Assume it is X's turn and the empty cells are at indices 1, 4, and 5 (zero-based). A win for X scores +100, a win for O scores -100, and a draw on a full board scores 0.
-```tsx
-function calculateWinner(board: number[]) {
- const lines = [
- [0, 1, 2], // row 1
- [3, 4, 5], // row 2
- [6, 7, 8], // row 3
- [0, 3, 6], // column 1
- [1, 4, 7], // column 2
- [2, 5, 8], // column 3
- [0, 4, 8], // diagonal 1
- [2, 4, 6], // diagonal 2
- ];
+1. If X player chooses index 1 → -100 points
+ - Next turn, O player can choose from indices 4, 5
+ - If O chooses index 4, O wins, scoring -100 points
+ - If O chooses index 5, X wins, scoring +100 points
+ - O player will choose index 4 for the least points
+2. If X player chooses index 4 → +100 points
+3. If X player chooses index 5 → -100 points
+ - Next turn, O player can choose from indices 1, 4
+ - If O chooses index 1, X wins, scoring +100 points
+ - If O chooses index 4, O wins, scoring -100 points
+ - O player will choose index 4 for the least points
- for (const [a, b, c] of lines) {
- const isLineMatch = board?.[a] === board?.[b] && board?.[a] === board?.[c];
- if (isLineMatch) return board[a]; // 'X' | 'O'
- }
+From this evaluation, X concludes that index 4 maximizes the score and secures a win. Minimax applies the same reasoning recursively: it takes the maximum of the child scores on X's turn and the minimum on O's turn, evaluating every possible move to select the best position.
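The recursion described above can be sketched as follows. This is a minimal illustration for a fixed 3x3 board, not the project's actual implementation: the names `Board`, `getWinner`, `minimax`, and `bestMoveForX` are assumptions made for this example, and X is assumed to be the maximizer.

```typescript
type Cell = "X" | "O" | null;
type Board = Cell[]; // 9 cells, row-major, zero-based indices

// The eight winning lines of a 3x3 board.
const LINES: [number, number, number][] = [
  [0, 1, 2], [3, 4, 5], [6, 7, 8], // rows
  [0, 3, 6], [1, 4, 7], [2, 5, 8], // columns
  [0, 4, 8], [2, 4, 6],            // diagonals
];

// Returns the mark that completed a line, or null if nobody has won yet.
function getWinner(board: Board): Cell {
  for (const [a, b, c] of LINES) {
    const mark = board[a] ?? null;
    if (mark !== null && mark === board[b] && mark === board[c]) return mark;
  }
  return null;
}

// Score of `board` from X's point of view, assuming both sides play optimally.
function minimax(board: Board, isMaximizing: boolean): number {
  const winner = getWinner(board);
  if (winner === "X") return 100;
  if (winner === "O") return -100;
  if (board.every((cell) => cell !== null)) return 0; // draw

  let best = isMaximizing ? -Infinity : Infinity;
  for (let i = 0; i < board.length; i++) {
    if (board[i] !== null) continue;
    board[i] = isMaximizing ? "X" : "O"; // try the move
    const score = minimax(board, !isMaximizing);
    board[i] = null; // undo it
    best = isMaximizing ? Math.max(best, score) : Math.min(best, score);
  }
  return best;
}

// Index of the highest-scoring move for X, or -1 if the board is full.
function bestMoveForX(board: Board): number {
  let bestIndex = -1;
  let bestScore = -Infinity;
  for (let i = 0; i < board.length; i++) {
    if (board[i] !== null) continue;
    board[i] = "X";
    const score = minimax(board, false); // O replies next
    board[i] = null;
    if (score > bestScore) {
      bestScore = score;
      bestIndex = i;
    }
  }
  return bestIndex;
}
```

On a hypothetical board `["X", null, "O", "X", null, null, "O", "O", "X"]`, which leaves indices 1, 4, and 5 empty, the three candidate moves score -100, +100, and -100, so `bestMoveForX` returns 4, in line with the walkthrough above.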
- return null;
-}
-```
+### Quick Win, Slow Loss
-This static approach does not account for varying board sizes or dynamic win conditions, making it less versatile for different game configurations.
+#### Quick Win
-#### Advanced
+Suppose player X can win by choosing either index 0 or index 2. Because both moves score 100, the algorithm selects index 0, the first one evaluated. Index 2, however, leads to victory in fewer turns, which makes it the better choice.
-A better approach is to dynamically check for win conditions around the most recently placed symbol, accommodating variable board sizes and win conditions. It assesses potential wins in horizontal, vertical, and diagonal directions from the last move.
+> Board illustration without reflecting turn count in the game score
-Consider a 4x4 board with a piece placed at index 10. The win can be checked around this position in all directions:
+
-- Horizontal: [8, 9, 10, 11]
-- Vertical: [2, 6, 10, 14]
-- Diagonal 1: [0, 5, 10, 15]
-- Diagonal 2: [7, 10, 13]
+Reflecting the turn count in the score encourages moves that win quickly. Because the turn count equals the search depth, a win for X (the maximizer) can be scored as `100 - depth` and a win for O (the minimizer) as `depth - 100`. With the turn count considered, index 0 scores 97 and index 2 scores 99, so the parent node chooses index 2 for its higher score.
-![Untitled](https://github.com/romantech/tic-tac-toe/assets/8604840/d8c74755-b367-44e9-a4b0-f192ad7b7059)
+> Board illustration with turn count reflected in the game score
-This flexible approach does not require pre-defined win combinations, facilitating adaptation to various game designs.
+
-First, define a `directions` array outlining the four directional checks needed on the board, each characterized by `deltaRow` and `deltaCol` values to indicate row and column changes when moving from one cell to another.
+#### Slow Loss
-```tsx
-const directions = [
- { deltaRow: 0, deltaCol: 1 }, // Horizontal
- { deltaRow: 1, deltaCol: 0 }, // Vertical
- { deltaRow: 1, deltaCol: 1 }, // Downward diagonal
- { deltaRow: 1, deltaCol: -1 }, // Upward diagonal
-];
-```
+Reflecting the turn count in the score also favors a slower loss. The image below depicts a situation in which player X loses regardless of the move chosen. Without the turn count, indices 0, 2, 3, and 4 would all score -100, so index 0, the first one evaluated, would be selected. With the turn count, a loss scores `depth - 100`, which moves closer to zero the later the loss occurs, so X selects the move that extends the game as long as possible and delays the loss.
-Using these directions, the game logic examines consecutive cells from the last placed symbol to determine a win, requiring a specified number of matching symbols in one direction to achieve victory.
+> Board illustration delaying loss by reflecting turn count in the score (omitting some possible moves)
-```tsx
-export const checkWinIndexes = (board: TBoard, winCondition: number, linearIndex: number, player: BasePlayer) => {
- // ...
- for (const { deltaRow, deltaCol } of directions) {
- const winningIndexes = checkDirection(/* ... */);
- // ...
- }
- return null;
-};
+
-const checkDirection = (/* ... */) => {
- const searchDirection = (deltaRow: number, deltaCol: number) => {
- const winningIndexes = [];
+In the image, O has consecutive marks at indices 5 and 8. A human player would likely choose index 2 to block and prevent an immediate loss. Reflecting the turn count in the score produces the same behavior: it delays the opponent's victory as long as possible, much as a person would play.
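The depth-adjusted scoring can be illustrated with a small helper. This is a sketch under the assumption that scores are taken from X's point of view and that `depth` counts the moves made since the root of the search; `terminalScore` is a hypothetical name, not a function from the project.

```typescript
type Winner = "X" | "O" | null;

// Hypothetical helper: score a finished game from X's (the maximizer's) point of view.
// Earlier wins score higher, and later losses score closer to zero.
function terminalScore(winner: Winner, depth: number): number {
  if (winner === "X") return 100 - depth;
  if (winner === "O") return depth - 100;
  return 0; // draw
}

// Quick win: a win after 1 move beats a win after 3 moves.
const quickWin = terminalScore("X", 1); // 99
const slowWin = terminalScore("X", 3); // 97

// Slow loss: a loss after 4 moves is preferred to a loss after 2 moves.
const fastLoss = terminalScore("O", 2); // -98
const slowLoss = terminalScore("O", 4); // -96

// The maximizer simply takes the highest score in each situation.
const preferredWin = Math.max(quickWin, slowWin);
const preferredLoss = Math.max(fastLoss, slowLoss);
```

Because the maximizer always takes the largest value, no extra rule is needed: the same `Math.max` that chooses a win over a loss also chooses the faster win and the slower loss.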
- // Start from i = 1 to exclude the last placed position
- for (let i = 1; i < winCondition; i++) {
- const currentRow = i * deltaRow + lastRow;
- const currentCol = i * deltaCol + lastCol;
+### Alpha-Beta Pruning
- if (/* ... */) break; // Stop checking if it goes beyond board range or the symbols don't match
- winningIndexes.push({ row: currentRow, col: currentCol });
- }
+Minimax predicts the outcome by exploring every node of the game tree, but not every node needs to be examined. In the image below, assume nodes A → B → C → C1 have been explored and C1 has been evaluated at -98 points. Two scenarios are possible depending on the score of C2:
- return winningIndexes;
- };
+
- // Check for consecutive marks in both directions
- const forwardWinningIndexes = searchDirection(deltaRow, deltaCol);
- const backwardWinningIndexes = searchDirection(-deltaRow, -deltaCol);
- // ...
-};
+1. If C2 scores lower than C1 (-98), for example -100:
+ - Node C, a minimizer, would choose C2, making its score -100.
+ - The root node, a maximizer, would then select child node B (99), which has the highest score.
+2. If C2 scores higher than C1 (-98), for example 100:
+ - Node C, a minimizer, would choose C1, making its score -98.
+ - The root node, a maximizer, would then select child node B (99), which has the highest score.
-```
+> Regardless of the C2 node's value, the root node always selects the B node
-This method calculates potential win paths from the last move, offering a scalable solution for various board configurations and win conditions.
+
-### Finding the Best Move
+Consequently, regardless of the C2 node's outcome, the root node will always choose the B node. This means the evaluation result of the C2 node has no impact on the selection of the B node. In such cases, exploring the C2 node becomes unnecessary.
-#### Searching for Winning Positions
+Alpha-beta pruning utilizes this principle to enable the minimax algorithm to skip unnecessary explorations by using alpha (α) and beta (β) values to decide whether to explore remaining child nodes:
-The simplest method to search for winning positions is to fill the empty spaces on the current board with the player's symbol and check whether this position meets the win conditions in horizontal, vertical, and diagonal directions based on that position. This approach is straightforward and has the advantage of being able to reuse the win condition check function written above (`checkWinIndexes` function considers the index it receives as the position where the last move was made).
+- α (Alpha): The highest score the maximizer is guaranteed so far along the current path.
+ - Initialized to the smallest possible value, $-\infty$.
+ - Updated only at maximizer nodes.
+- β (Beta): The lowest score the minimizer is guaranteed so far along the current path.
+ - Initialized to the largest possible value, $+\infty$.
+ - Updated only at minimizer nodes.
-```tsx
-const getFirstBestMoveIdx = (board: TBoard, winCondition: number, player: BasePlayer) => {
- const idx = board.findIndex((cell, i) => {
- if (cell.identifier === null) {
- const winningIndexes = checkWinIndexes(board, winCondition, i, player);
- return winningIndexes !== null;
- }
- return false;
- });
+After C1 is evaluated, node C's beta becomes -98. Because C is a minimizer, its final score can only be -98 or lower, whatever C2 returns. The root node, a maximizer, already holds an alpha of 99 from node B and switches to another child only if that child scores higher than 99. Since node C's beta (-98) is less than or equal to alpha (99), the root node will always choose B.
- return idx !== -1 ? idx : null;
-};
+In general, once alpha is greater than or equal to beta ($\alpha \geq \beta$) at a node, the remaining children cannot change the decision made higher in the tree, so they do not need to be explored.
-const findBestMoveIdx = (board: TBoard, winCondition: number, player: BasePlayer) => {
- // ...
- const bestMove = getFirstBestMoveIdx(board, winCondition, player);
- if (bestMove !== null) return bestMove;
- // ...
-};
-```
+Alpha and beta values are updated while exploring child nodes as follows:
-1. Traverse the board to check if the grid is empty.
-2. If the grid is empty, call the `checkWinIndexes` function with the current index and player information.
-3. If it meets the win condition, return the current index; otherwise, return `null`.
+> Process of updating alpha and beta values while exploring child nodes
-#### Searching for Defensive Positions
+
-##### Basic
+1. Root Node: Set initial alpha and beta values → $[-\infty, +\infty]$.
+2. Exploring Node A:
+ - Its sub-nodes (omitted in the image) are explored, and a score of 97 is returned to the root node.
+ - The root node, being a maximizer, updates alpha: $[-\infty, +\infty]$ becomes $[97, +\infty]$.
+3. Exploring Node B:
+ - Its sub-nodes (omitted in the image) are explored, and a score of 99 is returned to the root node.
+ - The root node, being a maximizer, updates alpha: $[97, +\infty]$ becomes $[99, +\infty]$.
+4. Exploring Node C: Receives alpha and beta values $[99, +\infty]$ from its parent.
+ - Node C1 is explored, and a score of -98 is returned to node C.
+ - Node C, being a minimizer, updates beta: $[99, +\infty]$ becomes $[99, -98]$.
+ - Since alpha (99) is greater than beta (-98), the next node (C2) is skipped: ✂️ Pruning
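The steps above can be reproduced on a small hand-built tree. This is a sketch rather than the project's code: `TreeNode` and `alphaBeta` are names assumed for this example, and nodes A and B are simplified to leaves that directly hold the scores their omitted sub-nodes would return.

```typescript
// A leaf carries a score; an inner node carries children.
type TreeNode = { name: string; score?: number; children?: TreeNode[] };

// Minimax with alpha-beta pruning. `visited` records every node that is actually explored.
function alphaBeta(
  node: TreeNode,
  isMaximizing: boolean,
  alpha: number,
  beta: number,
  visited: string[],
): number {
  visited.push(node.name);
  if (!node.children || node.children.length === 0) return node.score ?? 0;

  let best = isMaximizing ? -Infinity : Infinity;
  for (const child of node.children) {
    const score = alphaBeta(child, !isMaximizing, alpha, beta, visited);
    if (isMaximizing) {
      best = Math.max(best, score);
      alpha = Math.max(alpha, best); // only maximizers raise alpha
    } else {
      best = Math.min(best, score);
      beta = Math.min(beta, best); // only minimizers lower beta
    }
    if (alpha >= beta) break; // remaining siblings cannot change the result
  }
  return best;
}

// The tree from the walkthrough: a maximizing root with children A, B, and C.
const exampleTree: TreeNode = {
  name: "root",
  children: [
    { name: "A", score: 97 },
    { name: "B", score: 99 },
    {
      name: "C",
      children: [
        { name: "C1", score: -98 },
        { name: "C2", score: 100 }, // never evaluated
      ],
    },
  ],
};

const explored: string[] = [];
const rootScore = alphaBeta(exampleTree, true, -Infinity, Infinity, explored);
// rootScore is 99 and explored is ["root", "A", "B", "C", "C1"]
```

Changing the score of C2 to any other value leaves both `rootScore` and `explored` unchanged, which is the property the pruning relies on.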
-By simply changing the player information, existing logic such as `getFirstBestMoveIdx` can be reused to find defensive positions. Traverse the board using the opponent's symbol to search for places that meet the win conditions, and if a winning spot is found, set it as the defensive position.
+### Memoization
-```tsx
-const findBestMoveIdx = (board: TBoard, winCondition: number, player: BasePlayer) => {
- const opponent = getOpponent(player); // player = 'O' | 'X'
- // ...
- const defenseMove = getFirstBestMoveIdx(board, winCondition, opponent);
- if (defenseMove !== null) return defenseMove;
-};
-```
+During a game of tic-tac-toe, players often reach the same board state through different sequences of moves (see the image below). Because minimax explores all possible moves, it frequently re-evaluates board states it has already analyzed. Storing and reusing the results for previously computed board states reduces duplicate evaluations and improves the algorithm's efficiency.
-##### Advanced
+
-![Untitled](https://github.com/romantech/tic-tac-toe/assets/8604840/49cf7483-2412-4d9b-97ed-09578ecd6ac5)
-
-If the win condition is smaller than the board size (the length of one side of the board), when there are `win condition - 2` consecutive symbols, one must defend one of the ends of this sequence. For example, if the board size is 6 and the win condition is 4, and symbols are placed consecutively at positions 20 and 21, then either position 19 or 22 must be defended. If not defended, the opponent can place their symbol in one of these positions on their next turn to meet the win condition.
-
-```tsx
-const findBestMoveIdx = (board: TBoard, winCondition: number, player: BasePlayer) => {
- // ...
- const minDefenseCondition = winCondition - 2;
- const defenseRange = winCondition - minDefenseCondition + 1;
- const defenseConditions = Array.from({ length: defenseRange }, (_, i) => winCondition - i);
-
- for (const condition of defenseConditions) {
- const defenseMove = getFirstBestMoveIdx(board, condition, opponent);
- if (defenseMove !== null) return defenseMove;
- if (condition === size) break; // If the board size is the same as the win condition, only check for that win condition
- }
- // ...
-};
-```
-
-To defend against the above situation, an array composed of numbers from `win condition - 2` to the `win condition` is created and checked. Each element represents the length of the consecutive spaces that need to be defended. For example, if the win condition is 4, it checks whether placing a symbol in an empty space results in 4 consecutive symbols.
-
-Moreover, to prioritize defending areas with a larger number of consecutive spaces, the conditions array should be created in descending order. If the board size is 6 and the win condition is 4, the `defenseConditions` array would be `[4, 3, 2]`.
-
-If the board size and the win condition are the same, winning is only possible by filling the entire board. Therefore, it is unnecessary to check for win conditions smaller than the board size. Thus, the loop is stopped with the condition `if (condition === size) break;` to avoid unnecessary checks.
+- Path A: X in the center → O in the top left → X in the bottom right
+- Path B: X in the bottom right → O in the top left → X in the center
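One way to sketch this idea is to serialize the board into a string key and cache scores in a `Map`. The names `boardKey`, `applyMoves`, and `createMemoizedEvaluator` are hypothetical, and the key format is an assumption for this example. In tic-tac-toe the player to move can be derived from the number of marks on the board, so the board alone is enough to identify a state here.

```typescript
type Cell = "X" | "O" | null;
type Board = Cell[];

// Serialize a board into a cache key, for example "O---X---X".
const boardKey = (board: Board): string => board.map((cell) => cell ?? "-").join("");

// Build a 3x3 board by applying moves in order (illustration only).
function applyMoves(moves: [number, "X" | "O"][]): Board {
  const board: Board = new Array<Cell>(9).fill(null);
  for (const [index, mark] of moves) board[index] = mark;
  return board;
}

// Wrap any board evaluator so that each distinct board is evaluated only once.
function createMemoizedEvaluator(evaluate: (board: Board) => number) {
  const cache = new Map<string, number>();
  return (board: Board): number => {
    const key = boardKey(board);
    const cached = cache.get(key);
    if (cached !== undefined) return cached; // reuse the stored result
    const score = evaluate(board);
    cache.set(key, score);
    return score;
  };
}

// Path A and Path B from the list above end in the same board state.
const pathA = applyMoves([[4, "X"], [0, "O"], [8, "X"]]);
const pathB = applyMoves([[8, "X"], [0, "O"], [4, "X"]]);
const sameState = boardKey(pathA) === boardKey(pathB); // true
```

With this wrapper, evaluating `pathA` and then `pathB` runs the underlying evaluator once, and the second call is served from the cache.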
## Screenshots