Questions tagged [cross-entropy]

For questions related to the concept of cross-entropy in the context of artificial intelligence. For example, when the cross-entropy is used as a loss function to train a neural network.

36 questions
19 votes · 1 answer

Why has the cross-entropy become the classification standard loss function and not Kullback-Leibler divergence?

The cross-entropy is identical to the KL divergence plus the entropy of the target distribution. The KL divergence equals zero when the two distributions are the same, which seems more intuitive to me than the entropy of the target distribution,…
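The identity behind this question, written out for a fixed target distribution $p$ and model distribution $q$ (a one-line sketch):

$$H(p, q) = \underbrace{-\sum_x p(x)\log p(x)}_{H(p)} + \underbrace{\sum_x p(x)\log\frac{p(x)}{q(x)}}_{D_{\mathrm{KL}}(p\,\|\,q)} = H(p) + D_{\mathrm{KL}}(p\,\|\,q).$$

Since $H(p)$ does not depend on $q$, minimizing the cross-entropy over $q$ and minimizing the KL divergence over $q$ have the same minimizer.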
10 votes · 2 answers

How do I handle negative rewards in policy gradients with the cross-entropy loss function?

I am using policy gradients in my reinforcement learning algorithm, and occasionally my environment provides a severe penalty (i.e. negative reward) when a wrong move is made. I'm using a neural network with stochastic gradient descent to learn the…
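A minimal numpy sketch (not the asker's actual setup; all values are hypothetical) of how a negative reward interacts with a cross-entropy-style policy-gradient loss:

```python
import numpy as np

def policy_gradient_loss(logits, action, reward):
    """REINFORCE-style per-step loss: -reward * log pi(action).

    A negative reward flips the sign of the gradient, so SGD pushes
    probability away from the penalized action rather than toward it.
    """
    probs = np.exp(logits - logits.max())      # numerically stable softmax
    probs /= probs.sum()
    log_prob = np.log(probs[action] + 1e-12)   # guard against log(0)
    return -reward * log_prob

# A severe penalty (reward = -10) for having taken action 1:
logits = np.array([0.5, 1.5, 0.2])
print(policy_gradient_loss(logits, action=1, reward=-10.0))
```

Minimizing this loss with a negative reward is the same as maximizing the cross-entropy for that action, which is why the sign usually needs no special handling.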
7 votes · 1 answer

How is division by zero avoided when implementing back-propagation for a neural network with sigmoid at the output neuron?

I am building a neural network for which I am using the sigmoid function as the activation function for the single output neuron at the end. Since the sigmoid function is known to take any number and return a value between 0 and 1, this is causing…
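A sketch (with made-up values) of why the division disappears when the sigmoid and the cross-entropy loss are differentiated together rather than term by term:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def bce_grad_termwise(z, y, eps=1e-12):
    """Chain rule applied term by term: the 1/p and 1/(1-p) factors blow up
    as the sigmoid saturates, so implementations clamp p away from 0 and 1."""
    p = np.clip(sigmoid(z), eps, 1 - eps)
    dL_dp = -y / p + (1 - y) / (1 - p)
    dp_dz = p * (1 - p)
    return dL_dp * dp_dz

def bce_grad_fused(z, y):
    """The same gradient after algebraic cancellation: sigmoid(z) - y.
    No division occurs, which is why frameworks fuse the two operations."""
    return sigmoid(z) - y

z, y = 8.0, 1.0   # a saturated output
print(bce_grad_termwise(z, y), bce_grad_fused(z, y))
```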
7 votes · 1 answer

Which loss function should I use in REINFORCE, and what are the labels?

I understand that this is the update for the parameters of a policy in REINFORCE: $$ \Delta \theta_{t}=\alpha \nabla_{\theta} \log \pi_{\theta}\left(a_{t} \mid s_{t}\right) v_{t}, $$ where $v_t$ is usually the discounted future reward and …
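One standard answer, as a hedged numpy sketch: the taken actions play the role of cross-entropy labels and each term is weighted by $v_t$, so minimizing this loss performs the update above:

```python
import numpy as np

def reinforce_loss(logits, actions, returns):
    """Cross-entropy with the *taken* actions as labels, weighted by v_t.

    The gradient of this loss is the batch average of
    -grad log pi(a_t | s_t) * v_t, so one SGD step on it performs
    the REINFORCE update in the question.
    """
    logits = logits - logits.max(axis=1, keepdims=True)   # stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    taken = log_probs[np.arange(len(actions)), actions]
    return -(returns * taken).mean()

# Two timesteps: actions 0 and 1 with discounted returns 1.5 and -0.5.
logits = np.array([[2.0, 0.5], [0.1, 1.0]])
print(reinforce_loss(logits, np.array([0, 1]), np.array([1.5, -0.5])))
```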
5 votes · 2 answers

What is the advantage of using cross-entropy loss & softmax?

I am trying to do the standard MNIST image recognition test with a standard feed-forward NN, but my network failed pretty badly. Now I have debugged it quite a lot and found & fixed some errors, but I had a few more ideas. For one, I am…
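One concrete statement of the advantage, as a numpy sketch: the gradient of cross-entropy composed with softmax reduces to `probs - one_hot`, which stays well-scaled even when the network is confidently wrong (checked here against finite differences):

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def ce_loss(z, target):
    return -np.sum(target * np.log(softmax(z)))

z = np.array([5.0, -1.0, 0.3])          # confidently wrong logits
target = np.array([0.0, 1.0, 0.0])      # true class is index 1

analytic = softmax(z) - target          # fused softmax + CE gradient

# Finite-difference check of d(loss)/dz, one coordinate at a time.
eps = 1e-6
numeric = np.array(
    [(ce_loss(z + eps * np.eye(3)[i], target) - ce_loss(z, target)) / eps
     for i in range(3)])
print(analytic)
print(numeric)
```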
4 votes · 3 answers

In logistic regression, why is the binary cross-entropy loss function convex?

I am studying logistic regression for binary classification. The loss function used is cross-entropy. For a given input $x$, if our model outputs $\hat{y}$ instead of $y$, the loss is given by $$\text{L}_{\text{CE}}(y,\hat{y}) = -[y \log \hat{y} +…
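The one-line convexity argument this question is after (a sketch, using $\hat{y} = \sigma(z)$ with $z = w^\top x$, matching the loss above):

$$\frac{\partial L_{\text{CE}}}{\partial z} = \hat{y} - y, \qquad \frac{\partial^2 L_{\text{CE}}}{\partial z^2} = \hat{y}(1-\hat{y}) \ge 0, \qquad \nabla^2_w L_{\text{CE}} = \hat{y}(1-\hat{y})\, x x^{\top} \succeq 0,$$

so the Hessian in $w$ is positive semidefinite everywhere, which is exactly convexity. Note the argument relies on $z$ being linear in $w$ and does not carry over to deep networks.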
4 votes · 1 answer

How to formalize learning in terms of information theory?

Consider the following game on the MNIST dataset: There are 60000 images. You can pick any 1000 images and train your neural network without access to the rest of the images. Your final result is the prediction accuracy on the full dataset. How to formalize…
4 votes · 1 answer

Why does the binary cross-entropy work better than categorical cross-entropy in a multi-class single label problem?

I was just doing a simple NN example with the Fashion-MNIST dataset, where I was getting 97% accuracy, when I noticed that I was accidentally using binary cross-entropy instead of categorical cross-entropy. When I switched to categorical…
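One common explanation (worth verifying against the asker's code) is a metric mismatch: with binary cross-entropy, Keras defaults to binary_accuracy, which scores each of the 10 output units separately. A minimal numpy illustration:

```python
import numpy as np

# One-hot label for class 3 vs. a confident prediction of class 7.
y_true = np.eye(10)[[3]]
y_pred = np.eye(10)[[7]]

# binary_accuracy scores each of the 10 outputs separately: 8 of the 10
# units agree (both are 0), so the model scores 80% despite being wrong.
binary_acc = np.mean((y_pred > 0.5) == (y_true > 0.5))

# categorical_accuracy compares the argmax, the quantity that matters here.
categorical_acc = np.mean(y_pred.argmax(axis=1) == y_true.argmax(axis=1))

print(binary_acc, categorical_acc)   # 0.8 vs 0.0
```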
3 votes · 0 answers

How do I implement the cross-entropy method for an RL environment with a continuous action space?

I found many tutorials and posts on how to solve RL environments with discrete action spaces using the cross-entropy method (e.g., in this blog post for the OpenAI Gym frozen lake environment). However, now I have built my first custom environment,…
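A sketch of the continuous-action variant, assuming a Gaussian sampling distribution; the score function below is a toy stand-in for episode return:

```python
import numpy as np

def cem_continuous(score_fn, dim, iters=30, pop=50, elite_frac=0.2, rng=None):
    """Cross-entropy method over a continuous space: sample from a Gaussian,
    keep the top elite_frac by score, refit the Gaussian to the elites,
    and repeat until the distribution concentrates."""
    if rng is None:
        rng = np.random.default_rng(0)
    mu, sigma = np.zeros(dim), np.ones(dim)
    n_elite = int(pop * elite_frac)
    for _ in range(iters):
        samples = rng.normal(mu, sigma, size=(pop, dim))
        scores = np.array([score_fn(s) for s in samples])
        elites = samples[np.argsort(scores)[-n_elite:]]   # highest scores
        mu, sigma = elites.mean(axis=0), elites.std(axis=0) + 1e-3
    return mu

# Toy "episode return": reward peaks at the action [1.0, -2.0].
target = np.array([1.0, -2.0])
print(cem_continuous(lambda a: -np.sum((a - target) ** 2), dim=2))
```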
3 votes · 2 answers

Where is the mistake in my derivation of the GAN loss function?

I was pondering the GAN loss function, and the following came out: \begin{aligned} L(D, G) & = \mathbb{E}_{x \sim p_{r}(x)} [\log D(x)] + \mathbb{E}_{x \sim p_g(x)} [\log(1 - D(x))] \\ & = \int_x \bigg( p_{r}(x) \log(D(x)) + p_g (x)…
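For reference, the standard next step in this derivation maximizes the integrand pointwise in $D(x)$; setting the derivative of $p_r(x)\log D(x) + p_g(x)\log(1 - D(x))$ to zero gives

$$\frac{p_r(x)}{D(x)} - \frac{p_g(x)}{1 - D(x)} = 0 \quad\Longrightarrow\quad D^*(x) = \frac{p_r(x)}{p_r(x) + p_g(x)},$$

a useful checkpoint against which to compare each line of the expansion.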
3 votes · 0 answers

Is maximum likelihood estimation meaningless for a dataset of only outliers?

From my understanding, maximum likelihood estimation chooses the set of estimator parameters that maximizes the likelihood with respect to the ground-truth distribution. I always interpreted it as the training set having a tendency to have most examples…
2 votes · 2 answers

Why do non-linear activation functions that produce values larger than 1 or smaller than 0 work?

Why do non-linear activation functions that produce values larger than 1 or smaller than 0 work? My understanding is that neurons can only produce values between 0 and 1, and that this assumption can be used in things like cross-entropy. Are my…
2 votes · 1 answer

How does the implementation of the VAE's objective function equate to ELBO?

For many VAE implementations I've seen in code, it's not obvious to me how the objective equates to the ELBO. $$L(X)=H(Q)-H(Q:P(X,Z))=\sum_Z Q(Z)\log P(Z,X)-\sum_Z Q(Z)\log Q(Z)$$ The above is the definition of the ELBO, where $X$ is some input, $Z$ is a latent…
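A hedged numpy sketch of how implementations usually map onto those two terms, assuming a Bernoulli decoder and a diagonal-Gaussian encoder (all values below are hypothetical):

```python
import numpy as np

def negative_elbo(x, x_recon, mu, logvar, eps=1e-7):
    """Negative ELBO as usually coded: reconstruction term + analytic KL.

    E_Q[log P(X|Z)] becomes the binary cross-entropy between x and the
    decoder output; the remaining E_Q[log P(Z)] - E_Q[log Q(Z)] collapses
    to the closed-form KL between N(mu, sigma^2) and the N(0, I) prior.
    """
    x_recon = np.clip(x_recon, eps, 1 - eps)
    recon = -np.sum(x * np.log(x_recon) + (1 - x) * np.log(1 - x_recon))
    kl = -0.5 * np.sum(1 + logvar - mu**2 - np.exp(logvar))
    return recon + kl

x = np.array([1.0, 0.0, 1.0])
print(negative_elbo(x, x_recon=np.array([0.9, 0.2, 0.7]),
                    mu=np.zeros(2), logvar=np.zeros(2)))
```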
2 votes · 1 answer

How do you manage negative rewards in policy gradients?

This old question has no definitive answer yet, which is why I am asking it again here. I also asked this same question here. If I'm doing policy gradients in Keras, using a loss of the form: rewards*cross_entropy(action_pdf,…
2 votes · 1 answer

How are weights for a weighted cross-entropy loss on imbalanced data calculated?

I am trying to build a classifier that should be trained with the cross-entropy loss. The training data is highly class-imbalanced. To tackle this, I've followed the advice in the TensorFlow docs, and now I am using a weighted cross-entropy loss…
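A sketch of one common weighting recipe (the class counts below are made up; the formula matches the "balanced" heuristic some libraries use): weight each class inversely to its frequency so rare classes contribute comparably to the loss.

```python
import numpy as np

counts = np.array([9000, 700, 300])               # samples per class
weights = counts.sum() / (len(counts) * counts)   # total / (n_classes * count)

def weighted_cross_entropy(probs, labels, weights, eps=1e-12):
    """Per-example cross-entropy scaled by the weight of the true class."""
    picked = probs[np.arange(len(labels)), labels]
    return -(weights[labels] * np.log(picked + eps)).mean()

probs = np.array([[0.80, 0.15, 0.05],
                  [0.30, 0.60, 0.10]])
print(weighted_cross_entropy(probs, labels=np.array([0, 2]), weights=weights))
```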