For questions related to the softmax function, which is a function that takes as input a vector of K real numbers and normalizes it into a probability distribution consisting of K probabilities proportional to the exponentials of the input numbers. The softmax is often used as the activation function of the output layer of a neural network.
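The definition above translates almost directly into code. A minimal NumPy sketch (the max-subtraction is a standard numerical-stability trick; it cancels out mathematically but prevents overflow in the exponential):

```python
import numpy as np

def softmax(z):
    """Map a vector of K real numbers to a probability distribution
    proportional to the exponentials of the inputs.

    Subtracting max(z) leaves the result unchanged (the common factor
    cancels in the ratio) but keeps np.exp from overflowing.
    """
    e = np.exp(z - np.max(z))
    return e / e.sum()

z = np.array([1.0, 2.0, 3.0])
p = softmax(z)
print(p)          # K probabilities, each proportional to exp(z_i)
print(p.sum())    # sums to 1
```

Note that softmax is invariant to adding a constant to all inputs, which is exactly why the max-subtraction is safe.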
Questions tagged [softmax]
36 questions
17
votes
2 answers
Are softmax outputs of classifiers true probabilities?
BACKGROUND: The softmax function is the most common choice for an activation function for the last dense layer of a multiclass neural network classifier. The outputs of the softmax function have mathematical properties of probabilities and are--in…

Snehal Patel
- 912
- 1
- 1
- 25
6
votes
2 answers
Why do the TensorFlow docs discourage using softmax as the activation for the last layer?
The beginner Colab example for TensorFlow states:
Note: It is possible to bake this tf.nn.softmax in as the activation function for the last layer of the network. While this can make the model output more directly interpretable, this approach is…
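The note quoted above is about numerical stability: a loss computed from raw logits can use a fused log-softmax, while taking the log of an already-softmaxed output can hit exact zeros. A NumPy sketch of the failure mode (not the TensorFlow code itself; `from_logits=True` losses compute something like the stable form internally):

```python
import numpy as np

z = np.array([1000.0, 0.0, -1000.0])  # extreme logits

# Naive route: softmax first, then log. exp(z - max) underflows to
# exactly 0.0 for the unlikely classes, so the log returns -inf.
with np.errstate(divide="ignore"):
    p = np.exp(z - z.max())
    p /= p.sum()
    naive = np.log(p)

# Fused log-softmax, computed without ever materializing tiny probabilities:
# log p_i = (z_i - max(z)) - log(sum_j exp(z_j - max(z)))
stable = z - z.max() - np.log(np.sum(np.exp(z - z.max())))

print(naive)   # contains -inf for the small entries
print(stable)  # finite: approximately [0., -1000., -2000.]
```

This is why keeping the last layer linear and passing logits to the loss is the recommended pattern; softmax can still be applied at inference time for interpretability.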

galah92
- 163
- 5
5
votes
2 answers
What is the advantage of using cross entropy loss & softmax?
I am trying to do the standard MNIST dataset image-recognition test with a standard feedforward NN, but my network failed pretty badly. Now I have debugged it quite a lot and found & fixed some errors, but I had a few more ideas. For one, I am…

Ben
- 425
- 3
- 10
5
votes
1 answer
Which paper introduced the term "softmax"?
Nowadays, the softmax function is widely used in deep learning and, specifically, classification with neural networks. However, the origins of this term and function are almost never mentioned anywhere. So, which paper introduced this term?

nbro
- 39,006
- 12
- 98
- 176
4
votes
1 answer
Why are policy gradient methods more effective in high-dimensional action spaces?
David Silver argues, in his Reinforcement Learning course, that policy-based reinforcement learning (RL) is more effective than value-based RL in high-dimensional action spaces. He points out that the implicit policy (e.g., $\epsilon$-greedy) in…

Saucy Goat
- 143
- 4
2
votes
1 answer
Why do we use the softmax instead of no activation function?
Why do we use the softmax activation function on the last layer?
Suppose $i$ is the index that has the highest value (in the case when we don't use softmax at all). If we use softmax and take $i$th value, it would be the highest value because $e$ is…
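The point the excerpt is driving at, that exponentiation is strictly increasing and so softmax never changes which index is largest, is easy to verify numerically. A quick sketch (not taken from the question itself):

```python
import numpy as np

def softmax(z):
    e = np.exp(z - np.max(z))
    return e / e.sum()

rng = np.random.default_rng(0)
for _ in range(1000):
    z = rng.normal(size=10)
    # exp is monotone, so the ranking of softmax(z) matches that of z
    assert np.argmax(softmax(z)) == np.argmax(z)
print("argmax preserved on 1000 random logit vectors")
```

So softmax does not change the predicted class; its value is that it produces a normalized distribution, which is what a cross-entropy loss needs during training.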

dato nefaridze
- 862
- 6
- 20
2
votes
2 answers
What do the authors of this paper mean by the bias term in this picture of a neural network implementation?
I am reading a paper implementing a deep deterministic policy gradient algorithm for portfolio management. My question is about a specific neural network implementation they depict in this picture (paper, picture is on page 14).
The first three…

Mike
- 141
- 4
1
vote
1 answer
Dealing with noise in models with softmax output
I have a device with an accelerometer and gyroscope (6-axis). The device sends live raw telemetry data to the model: 40 samples per input, 6 values per sample (accelerometer xyz, gyroscope xyz). The model predicts between 12 different labels of…

Sterling Duchess
- 113
- 3
1
vote
1 answer
Number of units in the final softmax layer in VGGNet16
I am trying to implement and train the VGGNet neural network model from scratch, on my own data. I am reproducing all the layers of the model. I am confused about the last, fully connected softmax layer.
In the research paper by Simonyan and…

Dawood Ahmad
- 13
- 3
1
vote
2 answers
Backpropagation with CrossEntropy and Softmax, HOW?
Let Zs be the input of the output layer (for example, Z1 is the input of the first neuron in the output layer), Os be the output of the output layer (which are actually the results of applying the softmax activation function to Zs, for example, O1 =…
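In the excerpt's notation (Zs the logits, Os = softmax(Zs), with a one-hot target y), the well-known result is that the combined softmax + cross-entropy gradient collapses to ∂L/∂Z = O − y. A NumPy sketch that checks this against central finite differences (an illustration, not the asker's code):

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def cross_entropy(z, y):
    # L = -sum_i y_i * log(O_i), where O = softmax(z)
    return -np.sum(y * np.log(softmax(z)))

z = np.array([0.5, -1.2, 2.0])       # logits (Zs)
y = np.array([0.0, 0.0, 1.0])        # one-hot target

analytic = softmax(z) - y            # dL/dZ = O - y

# Independent check via central finite differences
eps = 1e-6
numeric = np.zeros_like(z)
for i in range(z.size):
    dz = np.zeros_like(z)
    dz[i] = eps
    numeric[i] = (cross_entropy(z + dz, y) - cross_entropy(z - dz, y)) / (2 * eps)

print(np.max(np.abs(analytic - numeric)))  # small: finite-difference error only
```

This cancellation is the practical reason softmax and cross-entropy are almost always implemented as a single fused layer during backpropagation.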

qazaq
- 11
- 2
1
vote
1 answer
Why are SVMs / Softmax classifiers considered linear while neural networks are non-linear?
My understanding is that neural networks are definitely not linear classifiers, as the point of functions like ReLU is to introduce non-linearity.
However, here's where my understanding starts to break down. A classifier, like Softmax or SVM is…

Foobar
- 151
- 5
1
vote
1 answer
Trouble writing the backpropagation algorithm in Python with cross-entropy and softmax
So I am writing my own neural network library for a class project, and I got everything working for a simple 2-class test using the distance (L2) cost function. I wanted to get a similar result using softmax and cross-entropy instead.
I did the…

user605734 MBS
- 121
- 5
1
vote
0 answers
Use softmax post-training for a ReLU-trained network?
For a project, I've trained multiple networks for multiclass classification all ending with a ReLU activation at the output.
Now the output logits are not probabilities.
Is it valid to get the probability of each class by applying a softmax function…

user452306
- 21
- 3
1
vote
1 answer
Is it normal that the values of the LogSoftmax function are very large negative numbers?
I have trained a classification network with PyTorch lightning where my training step looks like below:
def training_step(self, batch, batch_idx):
    x, y = batch
    y_hat = self(x)
    loss = F.cross_entropy(y_hat, y)
    self.log("train_loss",…
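Large negative log-softmax values are expected whenever the model assigns a class a very small probability, since log p → −∞ as p → 0. A NumPy sketch of the behavior (PyTorch's `torch.nn.functional.log_softmax` produces values with the same properties):

```python
import numpy as np

def log_softmax(z):
    # log p_i = (z_i - max(z)) - log(sum_j exp(z_j - max(z)))
    return z - z.max() - np.log(np.sum(np.exp(z - z.max())))

logits = np.array([12.0, 0.5, -9.0])
lp = log_softmax(logits)
print(lp)                 # confident class near 0, unlikely classes very negative
print(np.exp(lp).sum())   # exponentiating recovers probabilities summing to 1
```

All log-softmax outputs are at most 0, and the more confident the model, the closer the winning class is to 0 while the losing classes grow more negative.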

pd109
- 125
- 4
1
vote
1 answer
Is it appropriate to use a softmax activation with a categorical crossentropy loss?
I have a binary classification problem where I have 2 classes. A sample is either class 1 or class 2. For simplicity, let's say they are mutually exclusive, so it is definitely one or the other.
For this reason, in my neural network, I have…

user9317212
- 161
- 2
- 10