Highest Voted 'episodic-tasks' Questions - Artificial Intelligence Stack Exchange

3

votes

2 answers

In the on-policy state distribution for episodic tasks, why don't we take into account the length of the episode?

In Sutton & Barto's "Reinforcement Learning: An Introduction", 2nd edition, page 199, they describe the on-policy distribution for episodic tasks in the following box: I don't understand how this can be done without taking the length of the episode…

asked Nov 19 '19 at 03:32

user118967

208
1
8
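
For readers without the book at hand, the box this question refers to can be sketched as follows. This is written from memory of Section 9.2 of Sutton & Barto, so it should be checked against the text. Here $h(s)$ is the probability that an episode starts in state $s$, and $\eta(s)$ is the expected number of time steps spent in $s$ during one episode.

```latex
\eta(s) = h(s) + \sum_{\bar{s}} \eta(\bar{s}) \sum_{a} \pi(a \mid \bar{s})\, p(s \mid \bar{s}, a),
\quad \text{for all } s \in \mathcal{S},
\qquad
\mu(s) = \frac{\eta(s)}{\sum_{s'} \eta(s')},
\quad \text{for all } s \in \mathcal{S}.
```

Note that the normalizer $\sum_{s'} \eta(s')$ is the expected number of steps per episode, so the episode length does enter, implicitly, through the normalization.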

2

votes

4 answers

How can the Cart Pole problem be a continuing task?

In Reinforcement Learning: An Introduction (2nd edition) by Sutton and Barto, there is an example of the Pole-Balancing problem (Example 3.4). In this example, they write that this problem can be treated as either an episodic task or a continuing task. I…

reinforcement-learning sutton-barto continuous-tasks episodic-tasks

asked Aug 04 '18 at 03:53

user3595632

175
4

2

votes

1 answer

Is it appropriate to represent 'total failure' as an absorbing state?

My understanding is that, in Markov decision processes, absorbing states are states which can transition only to themselves, and that these transitions generate rewards of 0. I know that absorbing states are commonly used to represent goals, so an…

reinforcement-learning markov-decision-process state-spaces transition-model episodic-tasks

asked May 15 '22 at 12:45

K--

121
2
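
The definition in the excerpt above (a state that transitions only to itself with reward 0) can be illustrated with a minimal sketch. The three-state chain, its probabilities, and its rewards are hypothetical values chosen for illustration, not taken from the question; both the goal and the "total failure" state are modeled as absorbing.

```python
import numpy as np

# Hypothetical 3-state chain: 0 = start, 1 = goal, 2 = total failure.
# Rows are current states, columns are next states.
P = np.array([
    [0.0, 0.7, 0.3],  # from start: reach the goal or fail
    [0.0, 1.0, 0.0],  # goal is absorbing: it only transitions to itself
    [0.0, 0.0, 1.0],  # failure is absorbing as well
])

# Reward for each (state, next state) pair; self-loops in absorbing states pay 0.
R = np.array([
    [0.0, 1.0, -1.0],
    [0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0],
])

def is_absorbing(P, s):
    """A state is absorbing if it returns to itself with probability 1."""
    return bool(np.isclose(P[s, s], 1.0))

def expected_return(P, R, gamma=0.9, sweeps=100):
    """Iterative policy evaluation on the fixed chain above."""
    v = np.zeros(P.shape[0])
    for _ in range(sweeps):
        v = (P * (R + gamma * v[None, :])).sum(axis=1)
    return v
```

Because the self-loops pay 0, the value of both absorbing states stays at 0, and the penalty for failure is carried entirely by the reward on the transition into the failure state.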

1

vote

1 answer

Is it necessary to have a constant reward in the terminal state?

I have downloaded the grid world project from this link. I have executed the project multiple times using: python gridworld.py -k 20 -a q -r -0.2 -s 90 I have noticed that the rewards of the terminal states change over time. The grid world at…

reinforcement-learning q-learning reward-functions episodic-tasks

asked Dec 06 '22 at 10:09

AAA

111
3

1

vote

1 answer

PPO: dealing with variable episodic length

I'm dealing with a project that has episodes of variable length, ranging from just 3 steps to 20 steps. Now, I'm guessing that this may cause problems with GAE, as actions in large episodes will have much larger advantages than actions in smaller…

proximal-policy-optimization discount-factor episodic-tasks

asked Nov 28 '22 at 10:54

Antonis Karvelas

65
5
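
As background for the GAE concern in the excerpt above, here is a minimal sketch of generalized advantage estimation over a rollout that may contain several episodes of different lengths. The function name `gae`, its arguments, and the default `gamma` and `lam` values are assumptions made for illustration; they do not come from the question or from any particular PPO library.

```python
import numpy as np

def gae(rewards, values, dones, last_value, gamma=0.99, lam=0.95):
    """Generalized advantage estimation over a flat rollout.

    rewards, values, dones: 1-D sequences of length T. dones[t] is 1 if the
    episode ended at step t, else 0. last_value is the value estimate of the
    state that follows the final step (use 0 if the rollout ends on a done).
    """
    rewards = np.asarray(rewards, dtype=float)
    values = np.asarray(values, dtype=float)
    dones = np.asarray(dones, dtype=float)

    T = len(rewards)
    adv = np.zeros(T)
    running = 0.0
    next_value = last_value
    for t in reversed(range(T)):
        nonterminal = 1.0 - dones[t]
        # TD error; bootstrapping is cut at episode boundaries.
        delta = rewards[t] + gamma * next_value * nonterminal - values[t]
        # The accumulated advantage is also reset at episode boundaries.
        running = delta + gamma * lam * nonterminal * running
        adv[t] = running
        next_value = values[t]
    return adv
```

The `nonterminal` mask is what keeps a short episode from inheriting advantage from the episode that follows it in the buffer. Whether longer episodes still produce larger advantages depends on the reward scale and on `gamma` and `lam`.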

0

votes

1 answer

Could Softmax Action Selection be useful to solve an episodic task with more than 100000 possible states and 2000 actions?

I am new to the field of RL. I am trying to use a tabular method, Q-learning, to solve a problem that takes a long time to compute, so I would like to know if there are more efficient methods. Why are tabular methods not useful in…

reinforcement-learning q-learning function-approximation episodic-tasks

asked May 18 '22 at 14:11

Aquila

33
5
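
For context on the technique named in the title above, softmax (Boltzmann) action selection over one row of a Q-table can be sketched as below. The function name `softmax_action` and the `temperature` parameter are illustrative assumptions, not part of the question.

```python
import numpy as np

def softmax_action(q_values, temperature=1.0, rng=None):
    """Sample an action with probability proportional to exp(Q / temperature)."""
    rng = np.random.default_rng() if rng is None else rng
    prefs = np.asarray(q_values, dtype=float) / temperature
    prefs -= prefs.max()  # subtract the max for numerical stability
    probs = np.exp(prefs)
    probs /= probs.sum()
    action = rng.choice(len(probs), p=probs)
    return action, probs
```

With 2000 actions, each call costs time linear in the number of actions, and the Q-table itself would hold 100000 × 2000 entries, so the selection rule alone does not address the memory or sample cost of a tabular method.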
