Questions tagged [convergence]
For questions related to the convergence of AI algorithms.
91 questions
8
votes
2 answers
What is convergence in machine learning?
I came across this answer on Quora, but it was pretty sparse. I'm looking for specific meanings in the context of machine learning, but also mathematical and economic notions of the term in general.

DukeZhou · 6,237
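In the optimization sense that most machine-learning usage inherits, training has converged when further iterations stop changing the loss (or the parameters) meaningfully. A minimal sketch of that stopping rule, with `loss_history` as a hypothetical list of per-epoch losses:

```python
def has_converged(loss_history, tol=1e-4, patience=5):
    """Declare convergence when the loss has changed by less than `tol`
    for `patience` consecutive epochs (one common practical criterion,
    not a formal definition)."""
    if len(loss_history) <= patience:
        return False
    recent = loss_history[-(patience + 1):]
    deltas = [abs(a - b) for a, b in zip(recent, recent[1:])]
    return all(d < tol for d in deltas)
```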
7
votes
0 answers
Is the Bellman equation that uses sampling weighted by the Q values (instead of max) a contraction?
It is proved that the Bellman update is a contraction (1).
Here is the Bellman update that is used for Q-Learning:
$$Q_{t+1}(s, a) = Q_t(s, a) + \alpha \big( r(s, a, s') + \gamma \max_{a^*} Q_t(s', a^*) - Q_t(s, a) \big) \tag{1} \label{1}$$
The proof…

sirfroggy · 71
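As an illustration of the kind of operator the question describes (not necessarily the poster's exact construction), one common way to replace the max in \eqref{1} with Q-weighted sampling is a softmax-weighted backup; the weights $w$ and temperature $\tau$ here are assumptions:
$$(\mathcal{T}_w Q)(s,a) = \mathbb{E}_{s'}\!\left[ r(s,a,s') + \gamma \sum_{a'} w(a' \mid s')\, Q(s',a') \right], \qquad w(a' \mid s') = \frac{\exp(Q(s',a')/\tau)}{\sum_{b} \exp(Q(s',b)/\tau)}$$
As $\tau \to 0$ the weights concentrate on the greedy action and the standard $\gamma$-contraction argument applies; whether the weighted operator remains a contraction for $\tau > 0$ is exactly what the question asks.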
7
votes
1 answer
Why does reinforcement learning using a non-linear function approximator diverge when using strongly correlated data as input?
While reading the DQN paper, I found that randomly sampling stored experiences for training (experience replay) reduced divergence in RL with a non-linear function approximator (e.g., a neural network).
So, why does Reinforcement Learning using a non-linear function approximator…

강문주 · 71
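A minimal sketch of the experience-replay mechanism the DQN paper uses to break those correlations; the capacity and batch size are illustrative assumptions:

```python
import random
from collections import deque

class ReplayBuffer:
    """Fixed-size store of transitions that returns uniformly random
    minibatches, decorrelating consecutive experiences."""

    def __init__(self, capacity=100_000):
        self.buffer = deque(maxlen=capacity)  # oldest transitions drop out

    def push(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size=32):
        # Uniform random sampling breaks the strong temporal correlation
        # between consecutive transitions that destabilizes training.
        return random.sample(self.buffer, batch_size)

    def __len__(self):
        return len(self.buffer)
```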
6
votes
1 answer
Deep Q-Learning poor convergence on Stochastic Environment
I'm trying to implement a Deep Q-network in Keras/TF that learns to play Minesweeper (a stochastic environment). I have noticed that the agent learns to play the game pretty well with both small and large board sizes. However, it only…

Sanavesa · 153
6
votes
1 answer
How to create and train (with mutation and selection) a neural network to predict the next state of a board?
I'm aiming to create a neural network that can learn to predict the next state of a board using the rules of Conway's Game of Life.
Technically, I have three questions, but I felt that they needed to be together to get the full picture.
My network…

Aric · 275
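For reference, the target function such a network would have to learn; a minimal NumPy sketch of one Game of Life step on a 2D boolean board (toroidal wrap-around boundaries are an assumption):

```python
import numpy as np

def life_step(board):
    """One Game of Life update on a 2D boolean array, assuming
    toroidal (wrap-around) boundaries via np.roll."""
    # Count the eight neighbors of every cell.
    neighbors = sum(
        np.roll(np.roll(board, dy, axis=0), dx, axis=1)
        for dy in (-1, 0, 1) for dx in (-1, 0, 1)
        if (dy, dx) != (0, 0)
    )
    # A cell lives next step if it has exactly 3 neighbors,
    # or if it is alive now and has exactly 2.
    return (neighbors == 3) | (board & (neighbors == 2))
```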
6
votes
1 answer
What are the conditions of convergence of temporal-difference learning?
In reinforcement learning, temporal-difference methods seem to update the value function with each new iteration of experience absorbed from the environment.
What would be the conditions for temporal-difference learning to converge in the end? How is it…

MJeremy · 163
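For context, the standard tabular answer: TD(0) converges to $v_\pi$ with probability 1 when every state is visited infinitely often and the step sizes $\alpha_t$ satisfy the usual stochastic-approximation (Robbins-Monro) conditions:
$$\sum_{t=1}^{\infty} \alpha_t = \infty, \qquad \sum_{t=1}^{\infty} \alpha_t^2 < \infty$$
With function approximation the guarantees weaken: convergence is established for on-policy TD(0) with linear approximation, while combining bootstrapping, off-policy data, and non-linear approximation (the "deadly triad") can diverge.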
6
votes
1 answer
Convergence of semi-gradient TD(0) with non-linear function approximation
I am looking for a result that shows the convergence of semi-gradient TD(0) algorithm with non-linear function approximation for on-policy prediction. Specifically, the update equation is given by (borrowing notation from Sutton and Barto…

srinivas tunuguntla · 61
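For reference, the semi-gradient TD(0) update in Sutton and Barto's notation (the excerpt above is truncated): with a differentiable approximation $\hat{v}(s, \mathbf{w})$, the weights are adjusted by
$$\mathbf{w}_{t+1} = \mathbf{w}_t + \alpha \left[ R_{t+1} + \gamma\, \hat{v}(S_{t+1}, \mathbf{w}_t) - \hat{v}(S_t, \mathbf{w}_t) \right] \nabla \hat{v}(S_t, \mathbf{w}_t)$$
It is "semi-gradient" because the target is treated as a constant when differentiating, which is also why convergence results beyond the linear case are scarce.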
6
votes
1 answer
How to show temporal difference methods converge to MLE?
In chapter 6 of Sutton and Barto (p. 128), they claim that temporal-difference learning converges to the maximum likelihood estimate (MLE). How can this be shown formally?

user · 203
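A sketch of the construction the claim is usually phrased through (Sutton and Barto's certainty-equivalence estimate), with illustrative notation: from the batch of experience, form the maximum-likelihood model
$$\hat{P}(s' \mid s) = \frac{N(s \to s')}{N(s)}, \qquad \hat{r}(s) = \text{mean reward observed on leaving } s,$$
and batch TD(0) converges to the value function that is exactly correct for that model,
$$\hat{v}(s) = \hat{r}(s) + \gamma \sum_{s'} \hat{P}(s' \mid s)\, \hat{v}(s').$$
Showing this formally is what the question asks.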
5
votes
2 answers
What is curriculum learning in reinforcement learning?
I recently came across the term "curriculum learning" in the context of DRL and was intrigued by its potential to improve the learning process. As such, what is curriculum learning? And how can it be helpful for the convergence of RL algorithms?

Robin van Hoorn · 1,810
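A minimal sketch of the idea, assuming the tasks can be ordered by some difficulty score and that a generic `train` routine exists (both names are hypothetical placeholders):

```python
def curriculum_train(agent, tasks, train, difficulty):
    """Curriculum learning in one line of control flow: present tasks
    from easiest to hardest, so early, easy tasks shape the policy
    before hard ones are attempted."""
    for task in sorted(tasks, key=difficulty):
        train(agent, task)
    return agent
```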
5
votes
2 answers
How to check whether my loss function is convex or not?
Loss functions quantify the error that is used to update the weights of a neural network, and are thus central to training.
Consider the following excerpt from this answer
In principle, differentiability is…

hanugm · 3,571
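A quick numerical necessary-condition check: convexity requires $f\big(\tfrac{x+y}{2}\big) \le \tfrac{1}{2}f(x) + \tfrac{1}{2}f(y)$ for all $x, y$, so a single random violation proves non-convexity, while passing the test is only evidence, not proof. The test functions, sampling range, and trial count below are illustrative assumptions:

```python
import numpy as np

def probably_convex(f, dim, trials=10_000, scale=10.0, tol=1e-9):
    """Randomized midpoint test. Returns False with a certificate of
    non-convexity if any sampled pair violates midpoint convexity;
    True only means no violation was found."""
    rng = np.random.default_rng(0)
    for _ in range(trials):
        x = rng.uniform(-scale, scale, dim)
        y = rng.uniform(-scale, scale, dim)
        if f((x + y) / 2) > (f(x) + f(y)) / 2 + tol:
            return False
    return True

# Squared loss in the prediction is convex; a sine-based one is not.
print(probably_convex(lambda z: float(np.sum(z ** 2)), dim=3))   # True
print(probably_convex(lambda z: float(np.sin(z).sum()), dim=3))  # False
```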
5
votes
1 answer
Does the policy iteration convergence hold for finite-horizon MDP?
Most RL books (Sutton & Barto, Bertsekas, etc.) talk about policy iteration for infinite-horizon MDPs. Does the policy iteration convergence hold for finite-horizon MDP? If yes, how can we derive the algorithm?

user529295 · 359
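For reference, in the finite-horizon setting with horizon $H$, the analogue of the infinite-horizon fixed-point iteration is exact backward induction over time-indexed value functions (notation illustrative):
$$V_H(s) = 0, \qquad V_k(s) = \max_{a}\left[ r(s,a) + \sum_{s'} P(s' \mid s, a)\, V_{k+1}(s') \right], \quad k = H-1, \dots, 0,$$
so the optimal policy is in general non-stationary (a separate $\pi_k$ per stage), and "convergence" becomes termination of the recursion in exactly $H$ sweeps.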
5
votes
1 answer
Why does Q-learning converge under 100% exploration rate?
I am working on this assignment where I made the agent learn state-action values (Q-values) with Q-learning and 100% exploration rate. The environment is the classic gridworld as shown in the following picture.
Here are the values of my…

Rim Sleimi · 215
5
votes
1 answer
When exactly is a model considered over-parameterized?
When exactly is a model considered over-parameterized?
There is some recent research in deep learning on the role of over-parameterization in generalization, so it would be nice to know what exactly counts as such.
A…

Phúc Lê · 161
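One common operational convention (used in the interpolation literature, and only one of several): call a model over-parameterized when its trainable parameter count exceeds the number of training examples. A minimal check for a dense network, with the layer sizes as illustrative assumptions:

```python
def is_overparameterized(layer_sizes, n_train):
    """Compare the parameter count of a fully-connected network
    against the dataset size (weights + biases per dense layer)."""
    n_params = sum(
        n_in * n_out + n_out
        for n_in, n_out in zip(layer_sizes, layer_sizes[1:])
    )
    return n_params, n_params > n_train

# Example: a 784-1000-10 network against 60,000 training examples.
print(is_overparameterized([784, 1000, 10], n_train=60_000))
# -> (795010, True)
```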
5
votes
1 answer
How can we conclude that an optimization algorithm is better than another one?
When we test a new optimization algorithm, what is the process we need to follow? For example, do we need to run the algorithm several times and pick the best performance (e.g., in terms of accuracy, F1 score, etc.), and do the same for an old optimization…

user29902 · 51
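A minimal sketch of the usual protocol: run each algorithm across several random seeds and compare the score distributions rather than single best runs. The `train_and_score` function is a hypothetical placeholder, and Welch's t-test is one common (assumed) choice of significance test:

```python
import numpy as np
from scipy import stats

def compare(algo_a, algo_b, train_and_score, seeds=range(10)):
    """Run both algorithms over several seeds and compare score
    distributions instead of cherry-picking each one's best run."""
    a = np.array([train_and_score(algo_a, seed=s) for s in seeds])
    b = np.array([train_and_score(algo_b, seed=s) for s in seeds])
    print(f"A: {a.mean():.4f} +/- {a.std(ddof=1):.4f}")
    print(f"B: {b.mean():.4f} +/- {b.std(ddof=1):.4f}")
    # Welch's t-test: is the mean gap larger than seed-to-seed noise?
    return stats.ttest_ind(a, b, equal_var=False)
```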
4
votes
1 answer
How can I ensure convergence of DDQN, if the true Q-values for different actions in the same state are very close?
I am applying a Double DQN algorithm to a highly stochastic environment where some of the actions in the agent's action space have very similar "true" Q-values (i.e. the expected future reward from either of these actions in the current state is…

apitsch · 93