In Section 3 of the paper "Continuous control with deep reinforcement learning", the authors write:
As detailed in the supplementary materials we used an Ornstein-Uhlenbeck process (Uhlenbeck & Ornstein, 1930) to generate temporally correlated exploration for exploration efficiency in physical control problems with inertia (similar use of autocorrelated noise was introduced in (Wawrzynski, 2015)).
In Section 7, they write:
For the exploration noise process we used temporally correlated noise in order to explore well in physical environments that have momentum. We used an Ornstein-Uhlenbeck process (Uhlenbeck & Ornstein, 1930) with θ = 0.15 and σ = 0.2. The Ornstein-Uhlenbeck process models the velocity of a Brownian particle with friction, which results in temporally correlated values centered around 0.
In a few words, what is the Ornstein-Uhlenbeck process? How does it work? How exactly is it used in DDPG?
I want to implement the Deep Deterministic Policy Gradient (DDPG) algorithm, in which noise has to be added to the initial actions. However, I cannot understand how the Ornstein-Uhlenbeck process works. I have searched the internet, but I could not make sense of what I found.
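To make the question concrete, here is a minimal sketch of what I think the noise generator might look like, assuming an Euler-Maruyama discretization of the process with θ = 0.15 and σ = 0.2 as quoted above and a mean of 0. The class name `OUNoise`, the step size `dt = 1`, and the `reset` method are my own assumptions, not something the paper specifies, so this may well be wrong.

```python
import random


class OUNoise:
    """Hypothetical helper: discretized Ornstein-Uhlenbeck noise.

    Assumed update (Euler-Maruyama step, not taken from the paper):
        x <- x + theta * (mu - x) * dt + sigma * sqrt(dt) * N(0, 1)
    """

    def __init__(self, size, mu=0.0, theta=0.15, sigma=0.2, dt=1.0, seed=None):
        self.size = size      # number of action dimensions
        self.mu = mu          # long-run mean the noise is pulled back to
        self.theta = theta    # strength of the pull toward mu
        self.sigma = sigma    # scale of the random kicks
        self.dt = dt          # assumed time step (the paper does not state one)
        self.rng = random.Random(seed)
        self.reset()

    def reset(self):
        # Start the process at its mean.
        self.x = [self.mu] * self.size

    def sample(self):
        # Each new value depends on the previous one, which is what makes
        # consecutive samples temporally correlated.
        scale = self.sigma * self.dt ** 0.5
        self.x = [
            xi + self.theta * (self.mu - xi) * self.dt
            + scale * self.rng.gauss(0.0, 1.0)
            for xi in self.x
        ]
        return list(self.x)


noise = OUNoise(size=2, seed=0)
samples = [noise.sample() for _ in range(5)]  # five correlated 2-D noise vectors
```

If this is right, the action passed to the environment would be the actor's output plus `noise.sample()`, with `noise.reset()` called when an episode starts. Whether that matches what the authors did is part of what I am asking.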