The main problem with simply using the raw angle values $\alpha \in [0, 2\pi]$ is that semantically $0 = 2\pi$, but numerically $0$ and $2\pi$ are maximally far apart. A common way to encode this is as a vector of $\sin$ and $\cos$ values. It perfectly conveys the fact that $0 = 2\pi$, because:
$$
\begin{bmatrix}
\sin(0)\\
\cos(0)
\end{bmatrix}
=
\begin{bmatrix}
\sin(2\pi)\\
\cos(2\pi)
\end{bmatrix}
$$
This encoding essentially maps the angle values onto the 2D unit circle. In order to decode this, you can calculate
$$\operatorname{atan2}(a_1, a_2) = \alpha,$$
where $a_1 = \sin(\alpha)$ and $a_2 = \cos(\alpha)$. Note that $\operatorname{atan2}$ returns values in $(-\pi, \pi]$, so you may need to wrap the result back into $[0, 2\pi)$.
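As a quick illustration, here is a minimal NumPy sketch of the encode/decode round trip (the example angles are arbitrary):

```python
import numpy as np

# Example angles in radians; note that both 0 and 2*pi are included.
angles = np.array([0.0, np.pi / 2, np.pi, 2 * np.pi])

# Encode: map each angle to the point (sin, cos) on the unit circle.
encoded = np.stack([np.sin(angles), np.cos(angles)], axis=-1)

# 0 and 2*pi get (numerically almost) identical encodings.
print(encoded[0])    # [0. 1.]
print(encoded[-1])   # [~0. 1.] up to floating-point error

# Decode: atan2(sin, cos) recovers the angle in (-pi, pi];
# wrap with modulo 2*pi if you need values in [0, 2*pi).
decoded = np.arctan2(encoded[:, 0], encoded[:, 1]) % (2 * np.pi)
print(decoded)
```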
EDIT: As was noted in the comments, the values $\sin(\alpha)$ and $\cos(\alpha)$ are not independent: $\sqrt{\sin(\alpha)^2 + \cos(\alpha)^2} = 1$ always holds, i.e. the Euclidean norm of the encoding is one. When your neural network predicts the sin and cos values, however, this condition is not necessarily satisfied. You should therefore consider adding a regularization term to the loss that guides the network toward outputting valid values (with unit norm), which could look like this:
$$
r_\lambda\left(\hat{y}_1, \hat{y}_2\right) = \lambda \left|\, 1 - \sqrt{\hat{y}_1^2 + \hat{y}_2^2} \,\right|,
$$
where $\hat{y}_1$ and $\hat{y}_2$ are the sin and cos outputs of the network, respectively, and $\lambda$ is a scalar that weights the regularization term against the main loss. I found this paper, where such a regularization term is used (see Sec. 3.2) to obtain valid quaternions (quaternions must also have unit norm). The authors found that many values of $\lambda$ work and settled on $\lambda = 0.1$.
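A minimal sketch of how this could be wired into training, assuming a PyTorch model whose head outputs a `(batch, 2)` tensor of $[\sin, \cos]$ predictions (the function name and the MSE main loss are just illustrative choices):

```python
import torch

def unit_norm_regularizer(y_hat: torch.Tensor, lam: float = 0.1) -> torch.Tensor:
    """Penalize deviation of the predicted (sin, cos) pairs from unit norm.

    y_hat: tensor of shape (batch, 2) holding the [sin, cos] predictions.
    lam:   weight of the regularization term relative to the main loss.
    """
    norm = torch.sqrt(y_hat[:, 0] ** 2 + y_hat[:, 1] ** 2)
    # Absolute value so that both over- and undershooting the unit norm are penalized.
    return lam * torch.abs(1.0 - norm).mean()

# Hypothetical training step:
# y_hat = model(x)                                   # shape (batch, 2)
# loss = torch.nn.functional.mse_loss(y_hat, y_true) \
#        + unit_norm_regularizer(y_hat, lam=0.1)
# loss.backward()
```

At decoding time this is forgiving either way, since $\operatorname{atan2}$ only depends on the direction of the $(\sin, \cos)$ vector, not its length.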