3

For search algorithms with heuristic functions, the performance of heuristic functions are measured by the effective branching factor ${b^*}$, which involves the total number of nodes expanded ${N}$ and the depth of the solution ${d}$.

I'm not able to find out how different values of ${d}$ affect the performance keeping the same ${N}$. Put another way, why not use just the ${N}$ as the performance measure instead of ${b^*}$?

nbro
  • 39,006
  • 12
  • 98
  • 176
KGhatak
  • 165
  • 6

2 Answers2

2

I also walked into that trap the first few times. The difference is the following:

  • $N$ is the number of expanded nodes
  • $b^*$ is the effective branching factor
    • $b^*$ depends on the depth $d$ of the goal and the number of generated nodes, lets call that $M$
    • $b^*$ is the solution to $M+1=1+b^*+(b^*)^2+(b^*)^3+...+(b^*)^d$

So, you could argue that instead of comparing $b_1^*$ and $b_2^*$ of two algorithms, you can also directly compare $M_1$ and $M_2$, because $b_1^*>b_2^*\Leftrightarrow M_1>M_2$.

But you can imagine an algorithm $A_2$ that expands fewer nodes than $A_1$ (so $N_1>N_2$), but also different nodes so that it generates more nodes (so $M_1<M_2$). Since the cost is defined by the number of generated nodes, comparing $N$ might give the wrong result.

The effective branching factor is more general than the number of generated nodes, because you can average $b^*$ for one algorithm over many search problems, but averaging over the number of nodes (which might differ greatly) is not possible or rather nonsensical.

Sentry
  • 136
  • 4
  • Averaging over different sizes of the “same” search problem (e.g., N-queens with different Ns) seems to be the motivation. Otherwise, the number of generated nodes are a superior measure. – HappyFace Apr 25 '22 at 13:18
  • Why is it M+1 in the formula? (Instead of the plain M) – HappyFace Apr 25 '22 at 13:20
  • @HappyFace The "+1" is the root node which is not generated by the search algorithm. It already exists at the beginning. Technically, the "1+" on the right side is also the root node and could be omitted, but with "1+" the right side becomes a [geometric series](https://en.wikipedia.org/wiki/Geometric_series#Closed-form_formula) and can be simplified – Sentry Apr 25 '22 at 20:49
1

As you found $N$ is the number of nodes that are expanded. The cost of expansion of each node is equal to the number of children of that node. Hence, we use $b^*$ for each node. In other words, the total number of nodes that are involved in the expansion process is $N \times b^*$.

OmG
  • 1,731
  • 10
  • 19
  • 1
    @ OmG - Why not ${N}$ since you agreed that the cost solely depends on ${N}$? – KGhatak Nov 24 '19 at 17:03
  • @KGhatak because the total number of expanded node is $N \times b^*$. – OmG Nov 24 '19 at 17:27
  • 1
    @ OmG - Not sure if this is the right understanding. Can you provide some reference for it! Just think, we calculated expected ONLY when we don't know for sure. If we know actual N, there is no need to estimate/guess (aka Expected value of) N. What say! – KGhatak Nov 25 '19 at 04:36
  • @KGhatak Unfortunately I don't know which part is confusing for you. In the search, we expand $N$ nodes, and each node has $b^*$ children at most. Hence, the complexity is related to $N$ and $b^*$. Now, suppose we do not apply $b^*$ in the complexity, and another algorithm try to search and expand $N$ nodes to find the solution, but the branching factor of the current algorithm would be $2^{b^*}$! Which of them will be a better algorithm in terms of complexity? – OmG Nov 25 '19 at 10:22
  • @ OmG - Let me put it in a different way. Two algorithms, say 1 and 2, try to solve a problem. Their respective values are ${N_1}, {b_1^*}, {N_2}$, and ${b_2^*}$ such that ${N_1b_1^*}={N_2b_2^*}$. Which algorithm is efficient given ${N_1>N_2}$. Would you pls explain! – KGhatak Nov 25 '19 at 18:26