For questions about optimization methods/algorithms (also known as optimizers) in the context of machine learning and other AI subfields. Examples of optimizers are plain (stochastic) gradient descent, Adam, SGD with momentum, Adagrad, and RMSprop.
Questions tagged [optimizers]
9 questions
9 votes, 1 answer
What is the formula for the momentum and Adam optimisers?
In the gradient descent algorithm, the formula to update the weight $w$, where $g$ is the partial derivative of the loss function with respect to $w$, is:
$$w \leftarrow w - r \times g$$
where $r$ is the learning rate.
What should be the formula for momentum…

Dee (1,283)
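For reference, one common formulation of these update rules (keeping the excerpt's notation of weight $w$, gradient $g$ and learning rate $r$; the hyperparameters $\gamma$, $\beta_1$, $\beta_2$ and $\epsilon$ are additions not mentioned in the excerpt) is the following. Momentum keeps a running velocity $v_t$:
$$v_t = \gamma v_{t-1} + r\, g_t, \qquad w \leftarrow w - v_t$$
Adam keeps exponential moving averages of the gradient and of its element-wise square, with bias correction:
$$m_t = \beta_1 m_{t-1} + (1-\beta_1)\, g_t, \qquad v_t = \beta_2 v_{t-1} + (1-\beta_2)\, g_t^2$$
$$\hat{m}_t = \frac{m_t}{1-\beta_1^t}, \qquad \hat{v}_t = \frac{v_t}{1-\beta_2^t}, \qquad w \leftarrow w - r\, \frac{\hat{m}_t}{\sqrt{\hat{v}_t} + \epsilon}$$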
2 votes, 1 answer
How do I use machine learning to create an optimization algorithm?
Let's say that I want to create an optimization algorithm that is supposed to find an optimum value for a given objective function. Creating an optimization algorithm to explore the search space can be quite challenging.
My question is:…

sherl.lol (23)
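One narrow, hypothetical reading of this question is to fix a parameterized update rule and tune ("learn") its settings over a distribution of training objectives, then reuse the tuned rule on new objectives. The NumPy sketch below only illustrates that reading (the random quadratic tasks and the random-search outer loop are made-up choices, not the broader learning-to-optimize literature):

```python
import numpy as np

rng = np.random.default_rng(0)

def make_task():
    """Sample a random quadratic objective f(x) = 0.5 * (x - c)^T A (x - c)."""
    c = rng.normal(size=2)
    A = np.diag(rng.uniform(0.5, 5.0, size=2))
    loss = lambda x: 0.5 * (x - c) @ A @ (x - c)
    grad = lambda x: A @ (x - c)
    return loss, grad

def run_optimizer(lr, momentum, grad, steps=50):
    """A parameterized update rule (SGD with momentum) whose settings are to be 'learned'."""
    x, v = np.zeros(2), np.zeros(2)
    for _ in range(steps):
        v = momentum * v + lr * grad(x)
        x = x - v
    return x

# Outer loop: "learn" the optimizer by searching its hyperparameters so that it
# minimizes the average final loss over a distribution of training tasks.
train_tasks = [make_task() for _ in range(20)]
best, best_score = None, np.inf
for _ in range(200):
    lr, mom = rng.uniform(0.01, 1.0), rng.uniform(0.0, 0.99)
    score = np.mean([loss(run_optimizer(lr, mom, grad)) for loss, grad in train_tasks])
    if score < best_score:
        best, best_score = (lr, mom), score

print("learned (lr, momentum):", best)

# Apply the "learned" optimizer to a new, unseen objective.
test_loss, test_grad = make_task()
print("final loss on a new task:", test_loss(run_optimizer(*best, test_grad)))
```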
2 votes, 1 answer
Joined vs Separate optimizer for Actor-Critic
Say that I have a simple Actor-Critic architecture. (I am not familiar with TensorFlow, but) in PyTorch we need to specify the parameters when defining an optimizer (SGD, Adam, etc.), and therefore we can define 2 separate optimizers for the Actor and…

Sanyou (165)
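For the wiring the excerpt describes, a minimal PyTorch sketch of the options (the network sizes here are made up) could look like this:

```python
import itertools
import torch
import torch.nn as nn

# Hypothetical actor and critic networks, only to illustrate the optimizer wiring.
actor = nn.Sequential(nn.Linear(4, 64), nn.Tanh(), nn.Linear(64, 2))
critic = nn.Sequential(nn.Linear(4, 64), nn.Tanh(), nn.Linear(64, 1))

# Option 1: two separate optimizers, possibly with different learning rates.
actor_opt = torch.optim.Adam(actor.parameters(), lr=3e-4)
critic_opt = torch.optim.Adam(critic.parameters(), lr=1e-3)

# Option 2: a single joint optimizer over both parameter sets.
joint_opt = torch.optim.Adam(
    itertools.chain(actor.parameters(), critic.parameters()), lr=3e-4
)

# Option 3: a single optimizer with separate parameter groups,
# which still allows per-network settings such as the learning rate.
grouped_opt = torch.optim.Adam([
    {"params": actor.parameters(), "lr": 3e-4},
    {"params": critic.parameters(), "lr": 1e-3},
])
```

With separate optimizers, each `step()` only touches its own network's parameters; the joint variants update everything in one call.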
2 votes, 1 answer
What do we mean by "infrequent features"?
I am reading this blog post: https://ruder.io/optimizing-gradient-descent/index.html. In the section about AdaGrad, it says:
It adapts the learning rate to the parameters, performing smaller updates (i.e. low learning rates) for parameters…

ava_punksmash (133)
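A toy NumPy illustration of the "infrequent features" point: a parameter whose gradient is usually zero accumulates a small squared-gradient sum, so AdaGrad's per-parameter effective learning rate stays large for it. The every-20-steps schedule below is an arbitrary stand-in for a rarely occurring feature:

```python
import numpy as np

lr, eps, steps = 0.1, 1e-8, 100
G = np.zeros(2)                      # AdaGrad's per-parameter sum of squared gradients

for t in range(steps):
    g_frequent = 1.0                              # feature fires every step
    g_infrequent = 1.0 if t % 20 == 0 else 0.0    # feature fires rarely
    g = np.array([g_frequent, g_infrequent])
    G += g ** 2
    effective_lr = lr / (np.sqrt(G) + eps)        # per-parameter step size
    # the actual update would be: w -= effective_lr * g

print("effective step sizes after", steps, "steps:", effective_lr)
# The rarely updated parameter keeps a much larger effective learning rate.
```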
2 votes, 3 answers
What kind of optimizer is suggested for binary classification of similar images?
I have spent some time searching Google and wasn't able to find out what kind of optimization algorithm is best for binary classification when images are similar to one another.
I'd like to read some theoretical proofs (if any) to convince myself…

bit_scientist (241)
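There is generally no theoretical proof that one optimizer is best for a given image dataset; the usual practice is to compare a few candidates empirically. A minimal TensorFlow/Keras sketch of such a comparison (the tiny CNN and the random stand-in data are placeholders, not a recommendation):

```python
import numpy as np
import tensorflow as tf

# Random stand-in data: replace with the real (similar-looking) images.
x = np.random.rand(512, 32, 32, 1).astype("float32")
y = np.random.randint(0, 2, size=(512, 1))

def make_model():
    return tf.keras.Sequential([
        tf.keras.layers.Conv2D(16, 3, activation="relu", input_shape=(32, 32, 1)),
        tf.keras.layers.MaxPooling2D(),
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(1, activation="sigmoid"),
    ])

for name, opt in [
    ("adam", tf.keras.optimizers.Adam(1e-3)),
    ("sgd+momentum", tf.keras.optimizers.SGD(1e-2, momentum=0.9)),
    ("rmsprop", tf.keras.optimizers.RMSprop(1e-3)),
]:
    model = make_model()
    model.compile(optimizer=opt, loss="binary_crossentropy", metrics=["accuracy"])
    hist = model.fit(x, y, validation_split=0.2, epochs=3, verbose=0)
    print(name, "val_accuracy:", hist.history["val_accuracy"][-1])
```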
1 vote, 1 answer
What is uncentered variance and how does it become equal to the mean square in Adam?
I have been reading about Adam and AdamW (Here). The author mentioned that in "uncentered variance" we don't consider subtracting the mean.
In this statement, the author is talking about uncentered variance and how it becomes equal to the square of the…

learner (151)
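For the quantity the question asks about: the (centered) variance subtracts the mean, while the uncentered variance does not, which makes it exactly the mean of the squares:
$$\operatorname{Var}[g] = \mathbb{E}\big[(g - \mathbb{E}[g])^2\big] = \mathbb{E}[g^2] - (\mathbb{E}[g])^2, \qquad \text{uncentered variance} = \mathbb{E}[g^2].$$
Adam's second-moment estimate $v_t = \beta_2 v_{t-1} + (1-\beta_2)\, g_t^2$ is an exponential moving average of $g_t^2$, i.e. an estimate of this uncentered quantity (the mean square of the gradient).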
1 vote, 0 answers
Why does the Adam optimizer work slower than Adagrad, Adadelta, and SGD for Neural Collaborative Filtering (NCF)?
I've been working on Neural Collaborative Filtering (NCF) recently to build a recommender system using TensorFlow Recommenders. Doing some hyperparameter tuning with different optimizers available in the module tf.keras.optimizers, I found out that…

bkaankuguoglu (111)
1 vote, 1 answer
In the update rule of RMSprop, do we divide by a matrix?
I've been trying to understand RMSprop for a long time, but there's something that keeps eluding me.
Here is a screenshot from this video by Andrew Ng.
From the element-wise comment, from what I understand, $dW$ and $db$ are matrices, so that must…

Uriyasama (11)
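A minimal NumPy sketch of the RMSprop update in that notation, with made-up shapes: $dW$ and $S_{dW}$ are matrices of the same shape as $W$, and every operation, including the division, is element-wise (entry by entry), not a matrix inverse:

```python
import numpy as np

rng = np.random.default_rng(0)

W = rng.normal(size=(3, 4))          # weight matrix
S_dW = np.zeros_like(W)              # running average of squared gradients
lr, beta, eps = 0.01, 0.9, 1e-8

for _ in range(10):
    dW = rng.normal(size=W.shape)                  # stand-in for a real gradient
    S_dW = beta * S_dW + (1 - beta) * dW ** 2      # element-wise square
    W -= lr * dW / (np.sqrt(S_dW) + eps)           # element-wise divide

# No matrix is ever inverted or divided by in the linear-algebra sense.
```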
0 votes, 2 answers
When training a DNN on infinite samples, do ADAM or other popular optimization algorithms still work as intended?
When training a DNN on infinite samples, do ADAM or other popular optimization algorithms still work as intended?
I have a DNN training on an infinite stream of samples that most likely won't repeat, so there is no real notion of an "epoch".
Now I…

dronus (101)
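The setting in this question can be sketched as a purely step-based loop: with an endless stream of fresh samples there are no epochs, so training and evaluation are organised around step counts. The PyTorch sketch below uses a made-up data generator as a stand-in for the real stream:

```python
import torch
import torch.nn as nn

def sample_batch(batch_size=64):
    """Hypothetical infinite stream: every batch is freshly generated, never repeated."""
    x = torch.randn(batch_size, 10)
    y = (x.sum(dim=1, keepdim=True) > 0).float()
    return x, y

model = nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 1))
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.BCEWithLogitsLoss()

for step in range(1, 5001):
    x, y = sample_batch()
    loss = loss_fn(model(x), y)
    opt.zero_grad()
    loss.backward()
    opt.step()
    if step % 1000 == 0:                 # periodic evaluation instead of per-epoch metrics
        with torch.no_grad():
            xv, yv = sample_batch(1024)
            print(f"step {step}: loss on fresh samples {loss_fn(model(xv), yv).item():.4f}")
```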