Consider the following paragraph from the Numerical Computation chapter of the Deep Learning book, which describes the derivative as the slope of the function's curve at a point:

Suppose we have a function $y= f(x)$, where both $x$ and $y$ are real numbers. The derivative of this function is denoted as $f'(x)$ or as $\dfrac{dy}{dx}$ . The derivative $f'(x)$ gives the slope of $f(x)$ at the point $x$. In other words, it specifies how to scale a small change in the input to obtain the corresponding change in the output: $f(x+ \epsilon) \approx f(x)+\epsilon f'(x)$.

The slope of a function $f(x)$ at a point $a$ is generally defined as the $\tan$ of the angle that the tangent line to the curve at $a$ makes with the positive x-axis, measured anti-clockwise. That is, if $\theta$ is the angle made by the tangent to the curve of $f(x)$ at the point $(a, f(a))$ with the positive x-axis in the anti-clockwise direction, then the slope of $f(x)$ at $a$ is $\tan \theta$.

In theory, a tangent line should touch the curve of $f(x)$ at a single point only. Most textbooks draw nice convex curves and then show the slope as $\tan \theta$. But I think that for many functions it is not possible to draw a line through a point that touches the curve at that single point only; the line may meet the curve elsewhere too, so it may be a secant line or some other transversal rather than a tangent.

How to understand slope as $\tan \theta$ in such cases? Where am I going wrong?

hanugm

1 Answer

Slope is not usually defined like this; you are conflating slope with angle. Defining slope as a ratio is more natural, as explained below.

Intuitively, the slope of a curve at a point is how steeply the curve rises per unit of horizontal distance: a flat road doesn't change in height as you drive on it, but a ramp steadily elevates you as you drive up it. Typically this is understood as a ratio (rise over run), so a 45 degree angle has a slope of 1, and a 30 degree angle has a slope of $1/\sqrt3$. This is formalized in trigonometry, but if I recall correctly, the definition can go in the other direction as well (the ratio approach defines the angle).
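To make the angle-versus-ratio distinction concrete, here is a minimal sketch that converts between the two. The helper names `slope_from_angle` and `angle_from_slope` are made up for illustration; only the standard `math` module is used.

```python
import math

def slope_from_angle(degrees):
    """Slope (rise over run) of a line making this angle with the positive x-axis."""
    return math.tan(math.radians(degrees))

def angle_from_slope(slope):
    """Angle in degrees, in (-90, 90), of a line with the given slope."""
    return math.degrees(math.atan(slope))

# Print the slope for a few familiar angles.
for deg in (0, 30, 45, 60):
    print(deg, round(slope_from_angle(deg), 4))
```

The two quantities carry the same information for non-vertical lines, which is why either one can be used to define the other.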

We use ratio measurements of slope instead of degree measurements of angle because calculating the derivative using the limit definition gives the slope as a ratio directly. This graphic shows the idea rather nicely, but this article provides a clearer exposition.
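As a rough illustration of the limit definition, the sketch below computes difference quotients for $f(x) = x^3$ at $a = 1$. The function and the point are assumptions chosen for this example and do not come from the book. The ratios approach 3 as the step shrinks, and the resulting tangent line $y = 3x - 2$ also meets the curve again at $x = -2$, which suggests that the slope is a purely local property: the tangent line does not have to avoid the curve everywhere else.

```python
def difference_quotient(f, a, h):
    """Rise over run between the points at a and a + h."""
    return (f(a + h) - f(a)) / h

def f(x):
    # Assumed example function for illustration.
    return x ** 3

a = 1.0

# The ratio settles toward the derivative f'(1) = 3 as h shrinks.
for h in (1.0, 0.1, 0.01, 0.001):
    print(h, difference_quotient(f, a, h))

def tangent(x):
    """Tangent line at a = 1, using the known slope 3: y = f(a) + 3 * (x - a)."""
    return f(a) + 3.0 * (x - a)

# The tangent line at x = 1 meets the curve a second time at x = -2.
print(tangent(-2.0), f(-2.0))
```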

There's also the argument that slope as a ratio makes more sense because the backpropagation technique effectively adds or subtracts $\epsilon$ to some $x$ given a known $\Delta f(x)$, the error signal, relying on the equation you posted above: $f(x+\epsilon) \approx f(x) + \epsilon f'(x)$.

We can rewrite the equation above to obtain $f(x) - f(x+\epsilon) \approx -\epsilon f'(x)$. Since $\Delta f(x)$ and $f'(x)$ are known, we can also compute $\epsilon$, but note that only the right-hand side of the approximation is computed, since $f(x) - f(x+\epsilon)$ is obtained as $\Delta f(x)$. We can resolve this with the update rule $x \rightarrow x - \epsilon$.
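The relation $f(x) - f(x+\epsilon) \approx -\epsilon f'(x)$ can be checked numerically. The sketch below assumes $f(x) = x^2$ at $x = 3$, values picked only for illustration, and a hypothetical helper `solve_eps` that solves the relation for $\epsilon$. It is not a backpropagation implementation, just a check of the first-order approximation it relies on.

```python
def f(x):
    # Assumed example function for illustration.
    return x ** 2

def f_prime(x):
    return 2 * x

x = 3.0

# Gap between the true value and the linear estimate shrinks quickly with eps.
for eps in (0.1, 0.01, 0.001):
    exact = f(x + eps)
    linear = f(x) + eps * f_prime(x)
    print(eps, exact - linear)

def solve_eps(delta_f, slope):
    """Solve delta_f = -eps * slope for eps, where delta_f stands for f(x) - f(x + eps)."""
    return -delta_f / slope

# Ask for a decrease of about 0.6 in the output and see what change actually results.
step = solve_eps(0.6, f_prime(x))
print(step, f(x) - f(x + step))
```

The realized change is close to, but not exactly, the requested one, because the slope is only a first-order description of the function near $x$.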

k.c. sayz 'k.c sayz'