I am reading the "Optimal Separating Hyperplanes" section of the book The Elements of Statistical Learning, which is described on page 132 as follows:
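For reference, here is my transcription of the equations in question, with the equation numbers used in the book (any slips in the transcription are mine). The starting problem is

$$\max_{\beta,\,\beta_0,\,||\beta|| = 1} M \quad \text{subject to } y_i(x_i^T\beta + \beta_0) \ge M,\ i = 1, \dots, N. \tag{4.45}$$

The book then drops the constraint $||\beta|| = 1$ by replacing the conditions with

$$\frac{1}{||\beta||}\, y_i(x_i^T\beta + \beta_0) \ge M, \tag{4.46}$$

or equivalently

$$y_i(x_i^T\beta + \beta_0) \ge M\,||\beta||, \tag{4.47}$$

and, after setting $||\beta|| = 1/M$, arrives at

$$\min_{\beta,\,\beta_0} \frac{1}{2}||\beta||^2 \quad \text{subject to } y_i(x_i^T\beta + \beta_0) \ge 1,\ i = 1, \dots, N. \tag{4.48}$$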
My questions:
The constraint $||\beta|| = 1$ is removed from eq. 4.45 by introducing $1/||\beta||$ in eq. 4.46. How does this happen? I mean, what is the mathematical logic behind this?
What is the mathematical logic behind setting $||\beta|| = 1/M$ to arrive at eq. 4.48?