Applying gradient clipping by norm is recommended in the case of exploding gradients. The following quote is taken from the answer here:
One way to tell that gradients are exploding is if the loss is unstable and not improving, or if the loss shows NaN values during training.
Apart from the usual gradient clipping and weight regularization that are recommended...
But I want to know the effect of gradient clipping by norm on the performance of the model in normal or general cases, where gradients are not exploding.
Suppose I train a model for 800 epochs without gradient clipping, because there are no exploding gradients. If I train the same model with gradient clipping by norm, even though it is not necessary, does the model's performance decline?
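To make the setup concrete, here is a minimal sketch of the kind of training loop I have in mind, written with PyTorch; the model, the dummy data, and the max_norm value of 1.0 are only illustrative placeholders, not my actual configuration.

```python
import torch
import torch.nn as nn

# Illustrative model and optimizer; my real model is different.
model = nn.Linear(10, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
loss_fn = nn.MSELoss()

for epoch in range(800):
    inputs = torch.randn(32, 10)   # dummy batch
    targets = torch.randn(32, 1)

    optimizer.zero_grad()
    loss = loss_fn(model(inputs), targets)
    loss.backward()

    # Gradient clipping by norm: rescales all gradients together so that
    # their combined L2 norm does not exceed max_norm. If the norm is
    # already below max_norm, the gradients are left unchanged.
    torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)

    optimizer.step()
```

The question is whether adding that single clipping call, when the gradient norms never actually exceed the threshold (or only rarely do), can hurt the final performance compared to the run without it.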