Can someone help me to understand the alpha-beta pruning algorithm?

Question

I understand the minimax algorithm, but I am unable to understand deeply the minimax algorithm with alpha-beta pruning, even after having looked up several sources (on the web) and having tried to read the algorithm and understand how it works.

Do you have a good source that explains alpha-beta pruning clearly, or can you help me to understand the alpha-beta pruning (with a simple explanation)?

You can explore how alpha-beta prunes a tree here, but it doesn't animate the steps performed by alpha-beta: https://movingai.com/ab/ — Nathan S., Jan 27 '19 at 07:28
Have you read the following article: http://web.cs.ucla.edu/~rosen/161/notes/alphabeta.html? It should give you some intuition behind the algorithm. — nbro, Jan 27 '19 at 13:58

score 2 · Answer 1 · answered Jan 04 '19 at 12:43

Suppose that you have already search a part of the complete search tree, for example the complete left half. This may not yet give you the true game-theoretic value for the root node, but it can already give you some bounds on the game-theoretic value that the player to play in the root node (let's say, the max player) can guarantee by moving into that part of the search tree. Those bounds / guarantees are:

$\alpha$: the minimum score that the maximizing player already knows it can guarantee if we move into the part of the search tree searched so far. Maybe it can still do better (get higher values) by moving into the unsearched part, but it can already definitely get this value.
$\beta$: the maximum score that the minimizing player already knows it can guarantee if we moves into the part of the search tree searched so far. Maybe it can still do better (get lower values) by moving into the unsearched part, but it can already definitely get this value.

The intuitive idea behind alpha-beta pruning is to prune chunks of the search tree that become uninteresting for either player because they already know they can guarantee better based on the $\alpha$ or $\beta$ bounds.

For a simple example, suppose $\alpha = 1$, which means that the maximizing player already has explored a part of the search tree such that it can guarantee at least a value of $1$ by playing inside that part (the minimizing player has no options inside that entire tree to reduce the value below $1$, if the maximizing player plays optimally in that part).

Suppose that, in the current search process, we have arrived at a node where the minimizing player is to play, and it has a long list of child nodes. We evaluate the first of those children, and find a value of $0$. This means that, under the assumption that we reach this node, the minimizing player can already guarantee a value of $0$ (and possibly get even lower, we didn't evaluate the other children yet). But this is worse (for the maximizing player) than the $\alpha = 1$ bound we already had. Without evaluating any of the other children, we can already tell that this part of the search tree is uninteresting, that the maximizing player would make sure that we never end up here, so we can prune the remaining children (which could each have large subtrees below them).

I think this answer gives some intuition behind the algorithm, but it lacks a few clarifications. For example, you do not explain the reason behind searching with respect to the root player (i.e. because the first player is the one that plays first...). Or what happens if the root player is not the maximizing player but the minimixing player? Also, what happens if one or both players do not act optimality? — nbro, Feb 26 '19 at 09:48
I think these things can confuse people. The idea of pruning parts of the tree because they will not be "explored", if both players act optimality, is clear, but it is not clear which players will play or not, what is the relation between the players, etc. — nbro, Feb 26 '19 at 09:51

Douglas Daseeco · Answer 2 · 2019-01-27T04:17:22.177

Minimax

As the question author may already understand, minimax is an approach before it is an algorithm. The goal is to minimize the advantage of a challenger and maximize the advantage of the participant presently applying the minimax approach. This is a subset of a wider strategy of summing benefits and subtracting the sum of costs to derive a net value.

Alpha-beta Pruning

The strategic goal of alpha beta pruning is to produce uncompromized decision making with less work. This goal is usually driven by the cost of computing resources, the impatience of the person waiting on results, or a missed deadline penalty.

The principle involved is that if there is a finite range to the net gain that can aggregate as a result of traversing any subtree in a search. It can be proven, using the algebra of inequalities, that no knowledge can be lost in the search if the subtree cannot, under any conditions, show any net advantage over the other options.

Some Graph Theory to Clarify Pruning

Vertices are the graph theory name for what are often called tree nodes. They are represented as $\nu$ below, and they correspond to states. The connections between vertices are called edges in graph theory. They are represented as $\epsilon$ below, and they correspond to actions.

The minimax net gain thus far accumulated at an vertex can be calculated as the tree is traversed.

As the limitations to net gain at further depths are applied to these intermediate net gain values, it can be inferred that one edge (action) will, in every case, be advantageous over another edge. If an algorithm reveals that one edge leads to disadvantage in every possible case, then there is no need to traverse that edge. Such dismissal of never-advantageous edges saves computing resources and likely reduces search time. The term pruning is used because edges are like branches in a fruit orchard.

The Condition of Pruning and the Algorithm To Accomplish It

Initialization for the algorithm begins by setting the two Greek letter variables to their worst case values.

$$ \alpha = - \infty \\ \beta = \infty $$

The pruning rule for edge $\epsilon$ leading to a vertex $\nu$ is simply this.

$$ \alpha_\nu \geq \beta_\nu \Rightarrow \text{prune}(\epsilon_\nu) $$

Source code doesn't usually directly implement this declaration, but rather produces its effect in the algorithm, which can be represented in simplified pseudo-code. The algorithm to do such is simple once recursion is understood. That is an conceptual prerequisite, so learn about recursion first if not already known.

ab_minimax ν, α, β

    if ν.depth = search.max_depth or ν.is_leaf
        return ν.value

    if ν.is_maximizing
        for ε in ν.edges
            x = max(β, ab_minimax ε.child, α, β)
            if α ≥ β
                ν.prune ε
                return α
        return α
    else
        for ε in ν.edges
            x = min(α, ab_minimax child, α, β)
            if α ≤ β
                ν.prune ε
                return β
        return β

Can someone help me to understand the alpha-beta pruning algorithm?

2 Answers2