1

We can enforce some constraints on functions used in deep learning in order to guarantee optimizations. You can find it in Numerical Computation of the deep learning book.

In the context of deep learning, we sometimes gain some guarantees by restricting ourselves to functions that are either Lipschitz continuous or have Lipschitz continuous derivatives.

They include

  1. Lipschitz continuous functions
  2. Having Lipschitz continuous derivatives

The definition given for Lipschitz continuous function is as follows

A Lipschitz continuous function is a function $f$ whose rate of change is bounded by a Lipschitz constant $\mathcal{L}$:

$$\forall x, \forall y, |f(x)-f(y)| \le \mathcal{L} \|x-y\|_2 $$

Now, what is meant by having Lipschitz continuous derivatives?

Does they refer to the derivatives of Lipschitz continuous functions? If yes, then why do they mention it as a separate option?

hanugm
  • 3,571
  • 3
  • 18
  • 50

2 Answers2

1

Consider a function $f(x) : \mathcal{R}^m\rightarrow\mathcal{R}^n$ defined for $x \in X$. If $f$ is Lipschitz continuous, it has three main properties:

  1. $f(x)$ is continuous for all $x \in X$
  2. $\frac{d f(x)}{d x}$ exists almost everywhere. Meaning, if the derivative is not defined for $x \in \mathcal{B}$, where the set $\mathcal{B} \subset X$, then $\mathcal{B}$ has measure zero.
  1. $\underset{x \in X}{\sup} \left\lVert \frac{d f(x)}{d x} \right\rVert_2 \leq L$, where $L$ is the Lipschitz constant, and the norm indicates the induced matrix norm (or if $f$ is scalar, just the regular 2 norm).

So it follows that a Lipschitz continuous function is continuous and has a bounded jacobian.

Now if $f$ has a lipschitz continuous derivative, then it means $\frac{d f}{d x}$ is Lipschitz continious, i.e. \begin{align*} \left\lVert \dfrac{d f}{d x}\big|_{x = s} - \dfrac{d f}{d x}\big|_{x = t} \right\rVert_2 \leq M \left\lVert s - t \right\rVert_2 \quad s, t \in X \end{align*} where $M$ is the lipschitz constant. So a function with Lipschitz continuous gradient is continuously differentiable and has a bounded hessian.

Taw
  • 1,161
  • 3
  • 10
0

Answer: A function $f$ 'Having a Lipschitz continuous derivative' is simply when the derivative of $f$ is Lipschitz continuous. So it is not the derivative of a Lipschitz continuous function.

Your book states that 'Lipschitz continuous derivatives' as plural, meaning that all derivatives of $f$ should be Lipschitz continuous.

Intuition on Lipschitz continuity: Lipschitz continuity is a property of functions that measures how much the output of a function can change for a given change in the input. A function is Lipschitz continuous if there exists a constant, called the Lipschitz constant, such that the difference between the output of the function for two input points is bounded by the product of the Lipschitz constant and the distance between the two input points.

$$d_{Y}(f(x_{1}),f(x_{2}))\leq Kd_{X}(x_{1},x_{2})$$ where $K$ is the Lipschitz constant.

The wiki on Lipschitz continuity gives a nice visualization that might help understand the concept.

Robin van Hoorn
  • 1,810
  • 7
  • 32