Highest Voted 'semi-mdp' Questions - Artificial Intelligence Stack Exchange

7

votes

1 answer

What are options in reinforcement learning?

According to a lecture (week 10) about Reinforcement Learning [1], the concept of an option allows searching the state space of an agent much faster. The lecture was hard to follow because many new terms were introduced in a short time. For me, the…

asked Jul 07 '19 at 22:16

user11571

5

votes

1 answer

Should I model my problem as a semi-MDP?

I have a system (like a bank) where people (customers) enter according to a Poisson process, so the time between the arrivals of two consecutive customers is a random variable. The state of the problem is related to just the…

reinforcement-learning markov-decision-process semi-mdp hierarchical-rl

asked Feb 17 '19 at 12:29

Amin

4

votes

1 answer

How to apply or extend the $Q(\lambda)$ algorithm to semi-MDPs?

I want to model an SMDP in which time is discretized, the transition time between two states follows an exponential distribution, and no reward is received during a transition. What are the differences between $Q(\lambda)$…

reinforcement-learning q-learning semi-mdp eligibility-traces

asked Mar 10 '19 at 20:54

Amin

3

votes

2 answers

How to model a multi-agent reinforcement learning problem where actions of different agents can take different durations?

I am confused, at a conceptual level, about how to model a multi-agent reinforcement learning problem in which each agent's actions take different durations to complete. This means that a certain action is performed…

reinforcement-learning reference-request proximal-policy-optimization semi-mdp multi-agent-rl

asked Apr 24 '22 at 18:09

hridayns

3

votes

0 answers

Relationship between the reward rate and the sampled reward in a Semi-Markov Decision Process

In the paper "Reinforcement learning methods for continuous-time Markov decision problems", the authors provide the following update rule for the Q-learning algorithm when applied to Semi-Markov Decision Processes (SMDPs): $Q^{(k+1)}(x,a) =…

reinforcement-learning q-learning markov-decision-process semi-mdp

asked Apr 16 '20 at 13:16

user5093249

2

votes

1 answer

Is my understanding of the differences between MDP, Semi MDP and POMDP correct?

I just wanted to confirm that my understanding of the different Markov Decision Processes is correct, because they are the fundamentals of reinforcement learning. Also, I read a few sources in the literature, and some are not consistent with each other.…

reinforcement-learning comparison markov-decision-process pomdp semi-mdp

asked Oct 29 '18 at 21:33

Rui Nian

2

votes

1 answer

Updating action-value functions in Semi-Markov Decision Process and Reinforcement Learning

Suppose that the transition time between two states is a random variable (for example, following an unknown exponential distribution), and that no reward is received between two arrivals. If $\tau$ (a real number, not an integer) denotes the time between two…

reinforcement-learning q-learning markov-decision-process discount-factor semi-mdp

asked Jun 21 '20 at 07:02

Amin

1

vote

1 answer

Bellman optimality equation in semi Markov decision process

I wrote a Python program for a simple inventory control problem where decision epochs are equally spaced (every morning) and there is no lead time for orders (the time between submitting an order and receiving it). I use the Bellman…

reinforcement-learning markov-decision-process dynamic-programming semi-mdp

asked Sep 01 '20 at 18:46

mnbobo

Questions tagged [semi-mdp]