Notably, these two tips/tricks are useful precisely because we are in the context of deep reinforcement learning (DRL) here, as you pointed out. In DRL, the RL algorithm relies in some fashion on a deep neural network, and the reasons for normalizing stem from gradient descent and the architecture of that network.
How does this affect training?
An observation from the observation space is often used as the input to a neural network in DRL algorithms, and normalizing the inputs of a neural network is beneficial for many reasons (e.g. faster convergence, better numerical precision, less risk of parameters diverging, easier hyperparameter tuning). These are standard results in DL theory and practice, so I won't provide details here.
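To make this concrete, here is a minimal sketch (plain NumPy; the class name RunningObsNormalizer and the Welford-style update are just illustrative choices, not any specific library's API) of the running-statistics normalization commonly applied to observations before they reach the network. Libraries such as Stable-Baselines3 provide a VecNormalize wrapper that does essentially this for you.

```python
import numpy as np

class RunningObsNormalizer:
    """Tracks a running mean/variance of observations and normalizes them."""

    def __init__(self, obs_dim, eps=1e-8):
        self.mean = np.zeros(obs_dim)
        self.var = np.ones(obs_dim)
        self.count = 0
        self.eps = eps

    def update(self, obs):
        # Welford-style incremental update of the mean and variance.
        obs = np.asarray(obs, dtype=np.float64)
        self.count += 1
        delta = obs - self.mean
        self.mean += delta / self.count
        self.var += (delta * (obs - self.mean) - self.var) / self.count

    def normalize(self, obs):
        # Zero-mean, unit-variance observation fed to the neural network.
        return (obs - self.mean) / np.sqrt(self.var + self.eps)
```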
And more specifically, why do we also need to normalize the action values on continuous action spaces?
Most popular discrete-action-space DRL algorithms (e.g. DQN) have one output node in the neural network for each possible action. The value of each output node may be a Q-value (value-based algorithms) or the probability of taking that action (policy-based algorithms).
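As a quick illustration (a PyTorch sketch with hypothetical layer sizes and CartPole-like dimensions, not the code of any particular DQN implementation), a discrete-action value network simply has one output node per action:

```python
import torch
import torch.nn as nn

OBS_DIM, N_ACTIONS = 4, 2  # hypothetical CartPole-like dimensions

# DQN-style network: exactly one output node per discrete action,
# each output being the estimated Q-value of that action.
q_net = nn.Sequential(
    nn.Linear(OBS_DIM, 64),
    nn.ReLU(),
    nn.Linear(64, N_ACTIONS),  # one Q-value per action
)

obs = torch.rand(1, OBS_DIM)             # a (normalized) observation
q_values = q_net(obs)                    # shape: (1, N_ACTIONS)
greedy_action = q_values.argmax(dim=1)   # act greedily w.r.t. the Q-values
```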
In contrast, a continuous-action-space DRL algorithm simply cannot have one output node per possible action, since there are infinitely many of them. The output is usually the actual action to be taken by the agent, or some parameters used to construct the action (e.g. PPO outputs a mean and a standard deviation, and an action is then sampled from the corresponding Gaussian distribution - this setup is mentioned in your linked reference). Therefore, normalizing the action space of a DRL algorithm is analogous to normalizing the outputs of the corresponding neural network, which is known to speed up training and prevent divergence; a sketch of this is below. Again, a quick search will yield some good resources if you are interested in these results.
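Here is a rough PyTorch sketch of that idea (hypothetical dimensions and action bounds; the tanh squashing and state-independent log-std are common choices, not the only ones): the network effectively works in a normalized action range around [-1, 1], and the sampled action is only rescaled to the environment's bounds right before being passed to the environment.

```python
import torch
import torch.nn as nn

OBS_DIM, ACT_DIM = 8, 2        # hypothetical dimensions
ACT_LOW, ACT_HIGH = -2.0, 2.0  # hypothetical environment action bounds

class GaussianPolicy(nn.Module):
    """PPO-style continuous policy: outputs a mean per action dimension
    plus a state-independent log-std, then samples from a Gaussian."""

    def __init__(self):
        super().__init__()
        self.body = nn.Sequential(nn.Linear(OBS_DIM, 64), nn.Tanh())
        self.mean_head = nn.Linear(64, ACT_DIM)
        self.log_std = nn.Parameter(torch.zeros(ACT_DIM))

    def forward(self, obs):
        mean = torch.tanh(self.mean_head(self.body(obs)))  # mean kept in [-1, 1]
        return torch.distributions.Normal(mean, self.log_std.exp())

policy = GaussianPolicy()
obs = torch.rand(1, OBS_DIM)
normalized_action = policy(obs).sample()  # lives (roughly) in [-1, 1]

# The environment expects actions in [ACT_LOW, ACT_HIGH], so the normalized
# action is rescaled only at the boundary with the environment.
env_action = ACT_LOW + 0.5 * (normalized_action + 1.0) * (ACT_HIGH - ACT_LOW)
```

Keeping the network's outputs in a small, symmetric range like [-1, 1] keeps the last layer's weights and gradients on a scale similar to the rest of the network, which is exactly the training-speed and stability benefit mentioned above.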