Highest Voted 'notation' Questions - Artificial Intelligence Stack Exchange

11

votes

1 answer

What is the meaning of $V(D,G)$ in the GAN objective function?

Here is the GAN objective function. $$\min _{G} \max _{D} V(D, G)=\mathbb{E}_{\boldsymbol{x} \sim p_{\text {data }}(\boldsymbol{x})}[\log D(\boldsymbol{x})]+\mathbb{E}_{\boldsymbol{z} \sim p_{\boldsymbol{z}}(\boldsymbol{z})}[\log…

asked Apr 12 '19 at 20:53

i_rezic

245
1
6

8

votes

1 answer

How is the policy gradient calculated in REINFORCE?

Reading Sutton and Barto, I see the following in describing policy gradients: How is the gradient calculated with respect to an action (taken at time t)? I've read implementations of the algorithm, but conceptually I'm not sure I understand how the…

reinforcement-learning policy-gradients sutton-barto notation reinforce

asked Apr 21 '19 at 19:23

Hanzy

499
3
10

6

votes

2 answers

How are the reward functions $R(s)$, $R(s, a)$ and $R(s, a, s')$ equivalent?

In this video, the lecturer states that $R(s)$, $R(s, a)$ and $R(s, a, s')$ are equivalent representations of the reward function. Intuitively, this is the case, according to the same lecturer, because $s$ can be made to represent the state and the…

reinforcement-learning markov-decision-process proofs notation reward-functions

asked Feb 07 '19 at 15:38

nbro

39,006
12
98
176

5

votes

1 answer

What does the argmax of the expectation of the log likelihood mean?

What does the following equation mean? What does each part of the formula represent or mean? $$\theta^* = \underset {\theta}{\arg \max} \Bbb E_{x \sim p_{data}} \log {p_{model}(x|\theta) }$$

machine-learning math probability notation expectation

asked Jan 28 '18 at 11:15

arash moradi

181
1
6
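
One way to read the expression in the question above is as "the parameter value under which the data are, on average, most probable". The sketch below is only an illustration under assumptions that are not in the question: `p_model` is taken to be a unit-variance Gaussian with mean `theta`, the expectation over $p_{data}$ is approximated by an average over a small made-up sample, and the $\arg\max$ is found by a plain grid search. The names `avg_log_likelihood`, `argmax_theta`, `samples` and `candidates` are hypothetical.

```python
import math

def avg_log_likelihood(theta, samples, sigma=1.0):
    """Sample average of log p_model(x | theta), assuming a Gaussian with mean theta."""
    total = 0.0
    for x in samples:
        log_norm = -0.5 * math.log(2 * math.pi * sigma ** 2)
        total += log_norm - (x - theta) ** 2 / (2 * sigma ** 2)
    return total / len(samples)

def argmax_theta(samples, candidates):
    """Return the candidate theta with the largest average log-likelihood."""
    return max(candidates, key=lambda t: avg_log_likelihood(t, samples))

# Hypothetical data standing in for draws x ~ p_data.
samples = [1.0, 2.0, 3.0, 2.0]
# Grid of candidate parameter values 0.0, 0.1, ..., 4.0.
candidates = [i / 10 for i in range(0, 41)]

theta_star = argmax_theta(samples, candidates)
print(theta_star)
```

Under the Gaussian assumption, the maximizer coincides with the sample mean, which is why the grid search lands on the mean of `samples`.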

5

votes

1 answer

What does the notation $\mathcal{N}(z; \mu, \sigma)$ stand for in statistics?

I know that the notation $\mathcal{N}(\mu, \sigma)$ stands for a normal distribution. But I'm reading the book "An Introduction to Variational Autoencoders" and in it, there is this notation: $$\mathcal{N}(z; 0, I)$$ What does it mean? picture of…

terminology variational-autoencoder notation random-variable bayesian-statistics

asked Aug 23 '20 at 17:49

Peyman

534
3
10
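
A common reading of the semicolon notation in the question above is "the normal density with the parameters after the semicolon, evaluated at the point before it". The sketch below illustrates that reading only; it is not taken from the book mentioned in the question. It assumes the second parameter of the univariate case is a standard deviation, and it treats $\mathcal{N}(z; 0, I)$ as a product of independent standard normal densities. The function names are hypothetical.

```python
import math

def normal_pdf(z, mu, sigma):
    """Density of a univariate normal (mean mu, std sigma) evaluated at z."""
    coeff = 1.0 / (sigma * math.sqrt(2 * math.pi))
    return coeff * math.exp(-0.5 * ((z - mu) / sigma) ** 2)

def standard_normal_pdf(z_vec):
    """Density N(z; 0, I): with identity covariance the coordinates are
    independent, so the joint density is a product of 1-D densities."""
    density = 1.0
    for z in z_vec:
        density *= normal_pdf(z, 0.0, 1.0)
    return density

# Evaluate the densities at specific points (the "z;" part of the notation).
print(normal_pdf(0.0, 0.0, 1.0))
print(standard_normal_pdf([0.0, 0.0]))
```

In this reading, $\mathcal{N}(\mu, \sigma)$ names the distribution, while $\mathcal{N}(z; \mu, \sigma)$ is a number: the value of its density at $z$.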

5

votes

1 answer

What is the meaning of the square brackets in ant colony optimization?

I'm studying the paper "Minimizing Total Tardiness on a Single Machine Using Ant Colony Optimization", which proposes applying ant colony optimization to the SMTWTP. According to this paper: Each artificial ant iteratively and independently decides…

papers swarm-intelligence notation ant-colony-optimization

asked Nov 01 '19 at 12:59

Pablo

273
1
5

5

votes

1 answer

Understanding the equation of TD(0) in the paper "Learning to predict by the methods of temporal differences"

In the paper Learning to predict by the methods of temporal differences (p. 15), the weights in the temporal difference learning are updated as given by the equation $$ \Delta w_t = \alpha \left(P_{t+1} - P_t\right) \sum_{k=1}^{t}{\lambda^{t-k}…

reinforcement-learning temporal-difference-methods notation

asked Jun 01 '19 at 14:41

Amanda

205
1
5

4

votes

1 answer

Is the Bandit Problem an MDP?

I've read Sutton and Barto's introductory RL book. They define a policy as a mapping from states to probabilities of selecting each possible action. If the agent is following policy $\pi$ at time $t$, then $\pi(a|s)$ is the probability of taking…

reinforcement-learning comparison markov-decision-process notation multi-armed-bandits

asked Jun 15 '21 at 08:44

Snowball

213
1
6

4

votes

2 answers

Why do we use $X_{I_t,t}$ and $v_{I_t}$ to denote the reward received at time step $t$ and the distribution of the chosen arm $I_t$?

I'm doing some introductory research on classical (stochastic) MABs. However, I'm a little confused about the common notation (e.g. in the popular paper of Auer (2002) or Bubeck and Cesa-Bianchi (2012)). As in the latter study, let us consider an…

papers notation multi-armed-bandits upper-confidence-bound

asked Jul 16 '20 at 13:41

MAB_N00B

41
3

4

votes

1 answer

What does the term $|\mathcal{A}(s)|$ mean in the $\epsilon$-greedy policy?

I've been looking online for a while for a source that explains these computations, but I can't find anywhere what $|A(s)|$ means. I guess $A$ is the action set, but I'm not sure about that notation: $$\frac{\varepsilon}{|\mathcal{A}(s)|}…

reinforcement-learning monte-carlo-methods notation on-policy-methods epsilon-greedy-policy

asked Jul 14 '20 at 20:11

Metrician

95
5
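
In the usual $\epsilon$-greedy construction, $|\mathcal{A}(s)|$ is read as the number of actions available in state $s$, so $\varepsilon/|\mathcal{A}(s)|$ is the share of exploration probability given to each action. The sketch below shows that reading under stated assumptions: the action values in `q_values` are made up, ties are broken by taking the first maximal action, and the function name `epsilon_greedy_probs` is hypothetical.

```python
def epsilon_greedy_probs(q_values, epsilon):
    """Action probabilities of an epsilon-greedy policy in one state.

    q_values: estimated action values, one per available action.
    """
    n_actions = len(q_values)  # plays the role of |A(s)|
    # Index of the greedy action (first one if several are tied).
    greedy = max(range(n_actions), key=lambda a: q_values[a])
    # Every action receives the exploration share epsilon / |A(s)|.
    probs = [epsilon / n_actions] * n_actions
    # The greedy action additionally receives the remaining 1 - epsilon.
    probs[greedy] += 1.0 - epsilon
    return probs

# Hypothetical action values for a state with four actions.
probs = epsilon_greedy_probs([0.1, 0.5, 0.2, 0.0], epsilon=0.2)
print(probs)
```

With four actions and $\varepsilon = 0.2$, each non-greedy action gets probability $0.2/4$ and the greedy action gets $1 - 0.2 + 0.2/4$, so the probabilities sum to one.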

4

votes

2 answers

Why are the value functions sometimes written with capital letters and other times with lower-case letters?

Why are the state-value and action-value functions sometimes written in lower-case letters and other times in capitals? For instance, why, in the Q-learning algorithm (page 131 of Barto and Sutton's book, but not only there), are capitals used $Q(S,…

reinforcement-learning value-functions notation

asked Jun 10 '20 at 02:46

d56

223
1
7

4

votes

1 answer

What do the subscripts mean in $N_{t,n,\sigma,L}$?

A neural network can apparently be denoted as $N_{t,n,\sigma,L}$. What do these subscripts $t, n, \sigma$ and $L$ mean? Could you link me to a paper, article or webpage with an explanation for this?

neural-networks math definitions notation

asked Nov 13 '19 at 08:03

J. Doe

143
5

4

votes

1 answer

Confused about distribution notation in the Deep Learning book

In chapter 5 of the Deep Learning book by Ian Goodfellow, some of the notation in the loss function below really confuses me. As I understand it, $x,y \sim p_{data}$ means a sample $(x, y)$ drawn from the original dataset distribution (or $y$ is the…

machine-learning deep-learning notation

asked May 25 '19 at 12:02

David Ng

143
4

3

votes

1 answer

What does the notation $\nabla_\theta \mathcal{L}$ mean?

Here's the general algorithm of maximum entropy inverse reinforcement learning. This uses a gradient descent algorithm. What I do not understand is that there is only a single gradient value $\nabla_\theta \mathcal{L}$, and it is used to…

machine-learning reinforcement-learning gradient-descent notation

asked Jul 06 '18 at 21:26

İbrahim Abbasov

61
1

3

votes

1 answer

How to interpret the policy notation $\pi_{\theta}(a_{t}|s_{t})$ in Reinforcement Learning?

In the context of reinforcement learning, I have seen that the policy $\pi$ (for some algorithms) is simply a neural network (for example, a feedforward neural network). This policy is usually denoted $\pi_{\theta}$, suggesting…

reinforcement-learning deep-rl notation

asked May 24 '23 at 17:36

moth123

31
2
