For questions about calculus (developed by, among others, Newton and Leibniz) in the context of AI and, in particular, machine learning.
Questions tagged [calculus]
25 questions
5
votes
2 answers
Why is the derivative of this objective function 0 if the policy is deterministic?
In the Berkeley RL class CS294-112 Fa18, lecture of 9/5/18, they mention that the following gradient would be 0 if the policy is deterministic.
$$
\nabla_{\theta} J(\theta)=E_{\tau \sim \pi_{\theta}(\tau)}\left[\left(\sum_{t=1}^{T} \nabla_{\theta} \log…

jonperl
- 153
- 7
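A related identity that often comes up in answers to this kind of question (stated here as a general reference point, not necessarily the exact argument used in the lecture): the expected score is always zero, so any factor that is constant under the expectation contributes nothing to the gradient.
$$
\mathbb{E}_{\tau \sim \pi_{\theta}(\tau)}\left[\nabla_{\theta} \log \pi_{\theta}(\tau)\right]
= \int \pi_{\theta}(\tau)\,\frac{\nabla_{\theta} \pi_{\theta}(\tau)}{\pi_{\theta}(\tau)}\,d\tau
= \nabla_{\theta} \int \pi_{\theta}(\tau)\,d\tau
= \nabla_{\theta} 1 = 0 .
$$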
5
votes
1 answer
Why is the change in cost wrt bias in neural network equal to error in the neuron?
While reading the book on neural networks by Michael Nielsen, I had a problem understanding equation (BP3), which is
$$
\frac{\partial C}{\partial b_{j}^{l}}=\delta_{j}^{l} \tag{BP3}\label{BP3},
$$
which can be translated to plain English as…

Madhusoodan P
- 151
- 1
- 4
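For readers skimming this listing, the one-line chain-rule step behind (BP3), assuming Nielsen's usual conventions $z_j^l = \sum_k w_{jk}^l a_k^{l-1} + b_j^l$ and $\delta_j^l = \partial C / \partial z_j^l$:
$$
\frac{\partial C}{\partial b_j^l}
= \frac{\partial C}{\partial z_j^l}\,\frac{\partial z_j^l}{\partial b_j^l}
= \delta_j^l \cdot 1
= \delta_j^l ,
\qquad \text{since } \frac{\partial z_j^l}{\partial b_j^l} = 1 .
$$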
5
votes
2 answers
Are calculus and differential geometry required for building neural networks?
I've been studying geometry and linear algebra for months with the goal of building neural networks. But now I'm reading that perceptrons require fitting curves, and curves are not expressed as linear functions. So, I might need to study differential…

user456280
- 171
- 5
5
votes
2 answers
Which linear algebra book should I read to understand vectorized operations?
I am reading Goodfellow's book on neural networks, but I am stuck on the calculus of the back-propagation algorithm. I understood the principle, and I have watched some YouTube videos explaining this algorithm step by step, but now I would…

lolveley
- 151
- 3
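As a hedged illustration of what "vectorized operations" means here (a minimal NumPy sketch, not the notation used in Goodfellow's book): the gradients of a single dense layer can be written as matrix products instead of per-element loops.

```python
import numpy as np

# Minimal sketch: vectorized forward/backward pass for one dense layer
# y = X @ W + b with squared-error loss. Shapes below are illustrative assumptions.
rng = np.random.default_rng(0)
X = rng.normal(size=(64, 10))   # batch of 64 inputs, 10 features
W = rng.normal(size=(10, 3))    # weights
b = np.zeros(3)                 # biases
T = rng.normal(size=(64, 3))    # targets

Y = X @ W + b                           # forward pass, all examples at once
loss = 0.5 * np.sum((Y - T) ** 2) / len(X)

dY = (Y - T) / len(X)           # dL/dY
dW = X.T @ dY                   # dL/dW: one matrix product replaces nested loops
db = dY.sum(axis=0)             # dL/db
dX = dY @ W.T                   # gradient passed to the previous layer
```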
3
votes
1 answer
Are my computations of the forward and backward pass of a neural network with one input, one hidden, and one output neuron correct?
I have computed the forward and backward passes of the following simple neural network, with one input neuron, one hidden neuron, and one output neuron.
Here are my computations of the forward pass.
\begin{align}
net_1 &= xw_{1}+b \\
h &= \sigma (net_1) \\
net_2 &=…

Eka
- 1,036
- 8
- 23
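For comparison, a generic sketch with the usual conventions (separate biases $b_1, b_2$ and cost $C = \tfrac{1}{2}(y - t)^2$ are assumptions here, not necessarily the question's setup): for $net_1 = xw_1 + b_1$, $h = \sigma(net_1)$, $net_2 = hw_2 + b_2$, $y = \sigma(net_2)$, the backward pass is just the chain rule,
\begin{align}
\frac{\partial C}{\partial w_2} &= (y - t)\,\sigma'(net_2)\,h , \\
\frac{\partial C}{\partial w_1} &= (y - t)\,\sigma'(net_2)\,w_2\,\sigma'(net_1)\,x .
\end{align}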
3
votes
1 answer
What is the partial derivative $\frac{\partial y}{\partial x_1}$ in this neural network?
The answer is supposed to be -6, but I don't know how to get that.
Also, in a neural network, is such a 2nd hidden layer possible, where the neurons do not depend on all the neurons of the previous layer?

duanebobby
- 33
- 3
3
votes
2 answers
What is a bad local minimum in machine learning?
What is "bad local minima"?
The following papers all mention this expression.
Eliminating all bad Local Minima from Loss Landscapes without even adding an Extra Unit
Elimination of All Bad Local Minima in Deep Learning
Adding One Neuron Can…

Umang Gupta
- 200
- 11
2
votes
0 answers
Best calculus books for Deep Learning
Can you recommend some calculus books for deep learning and neural networks? I know what integration, differentiation, derivatives, and limits are at a basic level. I would like to understand, at a deeper level, the calculus behind deep learning and neural networks.

Dan Il
- 21
- 1
2
votes
1 answer
How is the log-derivative trick of a trajectory derived?
I am looking at this formula, which breaks down the gradient of $P(\tau \mid \theta)$. The first part is clear, as is the derivative of $\log(x)$, but I do not see how the first formula is rearranged into the second.

Jacob B
- 227
- 2
- 5
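In case it helps readers browsing this list, the rearrangement being asked about is usually just the identity $\nabla_\theta \log f(\theta) = \nabla_\theta f(\theta) / f(\theta)$ applied to $P(\tau \mid \theta)$:
$$
\nabla_\theta P(\tau \mid \theta)
= P(\tau \mid \theta)\,\frac{\nabla_\theta P(\tau \mid \theta)}{P(\tau \mid \theta)}
= P(\tau \mid \theta)\,\nabla_\theta \log P(\tau \mid \theta) .
$$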
2
votes
0 answers
Is there anything wrong with my focal loss derivation?
Assume $\mathbf{X} \in \mathbb{R}^{N \times C}$ is the input of the softmax $\mathbf{P} \in \mathbb{R}^{N \times C}$, where $N$ is the number of examples and $C$ is the number of classes:
$$\mathbf{p}_i = \left[ \frac{e^{x_{ik}}}{\sum_{j=1}^C e^{x_{ij}}}\right]_{k=1,2,\dots,C} \in \mathbb{R}^{C}…

Giang Tran
- 121
- 1
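One ingredient that this kind of derivation typically relies on (stated as a reference point for readers, not as a check of the full focal-loss algebra): the Jacobian of the softmax defined above is
$$
\frac{\partial p_{ik}}{\partial x_{ij}} = p_{ik}\left(\mathbb{1}[k = j] - p_{ij}\right) .
$$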
2
votes
0 answers
Is the Gradient Descent algorithm part of the Calculus of Variations?
As in https://en.wikipedia.org/wiki/Calculus_of_variations
The calculus of variations is a field of mathematical analysis that uses variations, which are small changes in functions and functionals, to find maxima and minima of functionals.
The…

Dee
- 1,283
- 1
- 11
- 35
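A compact way to see the distinction the question is asking about (a sketch, not a settled classification): gradient descent updates a finite-dimensional parameter vector, whereas the calculus of variations characterizes stationary points of a functional over a space of functions, e.g. via the Euler-Lagrange equation:
$$
\theta_{t+1} = \theta_t - \eta\,\nabla_\theta L(\theta_t)
\qquad \text{vs.} \qquad
\frac{\partial F}{\partial f} - \frac{d}{dx}\frac{\partial F}{\partial f'} = 0
\quad \text{for } J[f] = \int F(x, f, f')\,dx .
$$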
1
vote
1 answer
How do policy gradients work?
If I understand it correctly from the following equation
$$U(\theta)=\mathbb{E}_{\tau \sim P(\tau;\theta)}\left [ \sum_{t=0}^{H-1}R(s_t,u_t);\pi_{\theta} \right ]=\sum_{\tau}P(\tau;\theta)R(\tau)$$
from this paper, the utility of a policy…

User
- 165
- 4
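For readers scanning this list, the standard next step from the expression above (the usual REINFORCE-style derivation, assumed rather than quoted from the linked paper):
$$
\nabla_\theta U(\theta)
= \sum_{\tau} \nabla_\theta P(\tau;\theta)\,R(\tau)
= \sum_{\tau} P(\tau;\theta)\,\nabla_\theta \log P(\tau;\theta)\,R(\tau)
= \mathbb{E}_{\tau \sim P(\tau;\theta)}\!\left[\nabla_\theta \log P(\tau;\theta)\,R(\tau)\right] .
$$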
1
vote
1 answer
What does the gradient tell us, other than the direction in which to move the parameters?
Gradients are used in optimization algorithms.
I know that a gradient gives us information about the direction in which one needs to update the weights of a neural network. We need to travel in the opposite direction of the gradient to get optimal…

hanugm
- 3,571
- 3
- 18
- 50
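A one-line way to see what else the gradient carries (a generic first-order fact, not specific to neural networks): its magnitude gives the local rate of change, via the first-order Taylor approximation
$$
L(\theta + \Delta\theta) \approx L(\theta) + \nabla_\theta L(\theta)^{\top} \Delta\theta ,
$$
so the norm $\|\nabla_\theta L(\theta)\|$ tells us how steep the loss surface is locally, not just which way is downhill.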
1
vote
0 answers
BlackOut - ICLR 2016: need help understanding the cost function derivative
In the ICLR 2016 paper BlackOut: Speeding up Recurrent Neural Network Language Models with very Large Vocabularies, on page 3, for eq. 4:
$$ J_{ml}^s(\theta) = \log p_{\theta}(w_i \mid s) $$
They have shown the gradient computation in the subsequent…

anurag
- 151
- 1
- 7
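For context while browsing (this is the generic maximum-likelihood case, not the sampling-based BlackOut objective itself, and the scores $u_j(\theta)$ are my notation): if $p_\theta(w_j \mid s) \propto e^{u_j(\theta)}$ is a softmax, then
$$
\nabla_\theta \log p_\theta(w_i \mid s)
= \nabla_\theta u_{w_i}(\theta) - \sum_{j} p_\theta(w_j \mid s)\,\nabla_\theta u_j(\theta) .
$$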
1
vote
0 answers
For the generalised delta rule in back-propagation, do you subtract the target from the obtained output, or vice versa?
When I look up the generalised delta rule equation for back-propagation, I see two conflicting equations.
For example, here (slide 20), given $o$ (the output, defined in slide 18), $z$ (the activated output) and a target $t$, defined in slide…

Slowat_Kela
- 287
- 2
- 9
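For anyone landing here from the listing, the sign ambiguity usually comes down to where the minus sign from the derivative is absorbed (a generic sketch with squared error, not the exact notation from the linked slides): with $E = \tfrac{1}{2}(t - o)^2$,
$$
\frac{\partial E}{\partial o} = -(t - o) = (o - t) ,
$$
so one convention writes the error term as $(t - o)$ and uses $\Delta w = +\eta\,\delta\,x$, while the other writes it as $(o - t)$ and keeps the minus sign in $\Delta w = -\eta\,\partial E / \partial w$; both give the same weight change.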