For questions that involve the MNIST dataset. Note that questions asking for datasets are off-topic here.
Questions tagged [mnist]
20 questions
7
votes
2 answers
How can a neural network distinguish between rotated 6 and 9 digits?
Rotated MNIST is a popular dataset for benchmarking models equivariant to rotations on $\mathbb{R}^2$, described by the $SO(2)$ group or its discrete cyclic subgroups $\mathbb{Z}_n$:
Group equivariant convolutional networks
Harmonic networks
It…

spiridon_the_sun_rotator
- 2,454
- 8
- 16
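A quick way to see the issue raised in the question above is to check how a trained classifier behaves under the discrete rotation subgroup C4. The sketch below is a minimal illustration only; predict_proba is a hypothetical callable standing in for whatever model is used.

import numpy as np

def c4_predictions(predict_proba, image):
    """Apply a classifier to an image under the four C4 rotations
    (0, 90, 180, 270 degrees) and collect the outputs.

    predict_proba is a hypothetical callable mapping a 28x28 array to a
    length-10 probability vector; swap in your own model's prediction function.
    """
    return {k * 90: predict_proba(np.rot90(image, k)) for k in range(4)}

# A 180-degree rotation maps a handwritten 6 onto something that looks like
# a 9, so a fully rotation-invariant model cannot separate the two classes
# without extra information (e.g. a canonical orientation or merged labels).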
3
votes
2 answers
Implementing a GAN with control over the output class
I am trying to accomplish the reverse of the typical MNIST task in machine learning using a GAN: instead of predicting a number from an image of a digit, I want to reconstruct an image of a digit from a number. The traditional GAN, however, isn't…

JS4137
- 143
- 4
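What the question above describes is usually handled with a conditional GAN, where the class label is fed to the generator alongside the noise vector. A minimal, non-authoritative PyTorch sketch of such a generator (layer sizes are arbitrary assumptions, not the asker's architecture):

import torch
import torch.nn as nn

class ConditionalGenerator(nn.Module):
    """Generator that maps (noise, digit label) -> 28x28 image."""
    def __init__(self, noise_dim=100, n_classes=10):
        super().__init__()
        self.label_embed = nn.Embedding(n_classes, n_classes)
        self.net = nn.Sequential(
            nn.Linear(noise_dim + n_classes, 256),
            nn.ReLU(),
            nn.Linear(256, 28 * 28),
            nn.Tanh(),  # outputs in [-1, 1], matching normalised MNIST
        )

    def forward(self, noise, labels):
        # Concatenate the noise vector with an embedding of the desired digit,
        # so the generated image is conditioned on the requested class.
        x = torch.cat([noise, self.label_embed(labels)], dim=1)
        return self.net(x).view(-1, 1, 28, 28)

# Usage: ask for an image of the digit 7.
g = ConditionalGenerator()
z = torch.randn(1, 100)
img = g(z, torch.tensor([7]))  # shape (1, 1, 28, 28)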
3
votes
1 answer
How can I make an MNIST digit recognizer that rejects out-of-distribution data?
I've done an MNIST digit recognition neural network.
When you put images in that are completely unlike its training data, it still tries to classify them as digits. Sometimes it strongly classifies nonsense data as being a specific digit.
I am…

river
- 133
- 6
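One common, if imperfect, baseline for the rejection problem above is to threshold the network's softmax confidence and refuse to emit a digit when the model is unsure. A minimal sketch, assuming a hypothetical logits vector produced by the trained network:

import numpy as np

def classify_or_reject(logits, threshold=0.9):
    """Return the predicted digit, or None if max softmax prob < threshold.

    logits is the length-10 pre-softmax output of the digit recogniser.
    Note: softmax confidence is a weak out-of-distribution signal; dedicated
    methods (an explicit "reject" class, density estimation, etc.) usually
    work better.
    """
    z = logits - np.max(logits)              # numerical stability
    probs = np.exp(z) / np.sum(np.exp(z))
    return int(np.argmax(probs)) if np.max(probs) >= threshold else None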
3
votes
1 answer
Can neural networks learn noise?
I'm interested in the following graphs. A neural network was trained to recognise digits from the MNIST dataset, then the labels were randomly shuffled, and the following behaviour was observed. How can this behaviour be explained?
What explains…

Featherball
- 131
- 2
2
votes
1 answer
Are the "artifacts" in select Keras MNIST training images really there or is my download corrupt?
I'm having fun with a ludicrously well-known and widely used dataset: MNIST.
I am doing it with a huge and well-known tool: Keras.
Please excuse the red dots; they are from something else I was doing. I have otherwise not modified the image at all except via the…

EngrStudent
- 361
- 3
- 12
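For the question above, one way to rule out a corrupt download is to load the dataset directly through Keras and inspect the raw pixel values of the suspicious images. A small sketch (the example index 7 is arbitrary):

import numpy as np
from tensorflow.keras.datasets import mnist

(x_train, y_train), _ = mnist.load_data()

img = x_train[7]                       # arbitrary example index
print(img.shape, img.dtype)            # (28, 28) uint8
print(img.min(), img.max())            # raw values in [0, 255]
print(np.unique(img)[:10])             # smallest distinct pixel values
# Low but non-zero grey values around the strokes come from the original
# anti-aliased scans, so they are not necessarily download corruption.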
2
votes
3 answers
Why does MNIST provide only a training and a test set and not a validation set as well?
I was taught that, usually, a dataset has to be divided into three parts:
Training set - for learning purposes
Validation set - for picking the model which minimizes the loss on this set
Test set - for testing the performance of the model picked…
tail
- 147
- 6
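Since MNIST ships only with a train/test split, the usual fix for the question above is to carve a validation set out of the 60,000 training examples yourself. A minimal sketch using scikit-learn (the 10,000-example split size is just a common choice):

from sklearn.model_selection import train_test_split
from tensorflow.keras.datasets import mnist

(x_train, y_train), (x_test, y_test) = mnist.load_data()

# Hold out 10,000 of the 60,000 training images for validation;
# the official test set stays untouched until final evaluation.
x_tr, x_val, y_tr, y_val = train_test_split(
    x_train, y_train, test_size=10_000, random_state=0, stratify=y_train
)
print(x_tr.shape, x_val.shape, x_test.shape)
# (50000, 28, 28) (10000, 28, 28) (10000, 28, 28)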
2
votes
1 answer
Why do we subtract logsumexp from the outputs of this neural network?
I'm trying to understand this tutorial for JAX.
Here's an excerpt. It's for a neural net that is designed to classify MNIST images:
import jax.numpy as jnp
from jax.scipy.special import logsumexp

def relu(x):
    return jnp.maximum(0, x)

def predict(params, image):
    #…
Foobar
- 151
- 5
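Regarding the question above: subtracting logsumexp of the logits is a numerically stable way of computing the log-softmax, so the network's outputs can be read as log-probabilities. A small check with SciPy, independent of the JAX tutorial's code:

import numpy as np
from scipy.special import logsumexp, softmax

logits = np.array([2.0, -1.0, 0.5])

log_probs = logits - logsumexp(logits)    # what the tutorial computes
reference = np.log(softmax(logits))       # naive log(softmax(x))

print(np.allclose(log_probs, reference))  # True
print(np.exp(log_probs).sum())            # 1.0 -- a valid probability distribution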
2
votes
0 answers
Statistical method for selecting features for classification
I'm working on a classifier for the famous MNIST handwritten data set.
I want to create a few features on my own, and I want to be able to estimate which feature might perform better before actually training the classifier. Let's say that I create…
IsolatedSushi
- 21
- 1
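For ranking candidate features before training, as asked above, standard filter statistics such as mutual information or the ANOVA F-score give a cheap per-feature signal. A minimal sketch, assuming a matrix X_feat of hand-crafted features (one column per feature) and digit labels y; random data is used here only to make it runnable:

import numpy as np
from sklearn.feature_selection import mutual_info_classif, f_classif

# X_feat: (n_samples, n_features) matrix of hand-crafted features,
# y: (n_samples,) digit labels. Random placeholders below.
rng = np.random.default_rng(0)
X_feat = rng.normal(size=(1000, 5))
y = rng.integers(0, 10, size=1000)

mi = mutual_info_classif(X_feat, y, random_state=0)  # higher = more informative
F, p = f_classif(X_feat, y)                          # ANOVA F-test per feature
print(mi, F)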
1
vote
3 answers
Can I use 4 neurons in the output layer to classify handwritten digits?
The "hello world" of ANNs usually uses the MNIST handwritten digit data. There are 10 classes, so the output layer typically has 10 neurons, one for each digit from 0 to 9.
If in the end there is only one active neuron in the output…
Muhammad Ikhwan Perwira
- 169
- 6
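The idea in the question above amounts to encoding the 10 classes in 4 binary output units (since 2^4 = 16 >= 10) instead of a one-hot layer. A small sketch of the encoding and decoding, which would pair with sigmoid outputs rather than softmax:

import numpy as np

def to_bits(digit, n_bits=4):
    """Encode a digit 0-9 as a 4-bit target vector, e.g. 9 -> [1, 0, 0, 1]."""
    return np.array([(digit >> i) & 1 for i in reversed(range(n_bits))])

def from_bits(outputs, threshold=0.5):
    """Decode 4 sigmoid outputs back to an integer prediction."""
    bits = (np.asarray(outputs) >= threshold).astype(int)
    return int("".join(map(str, bits)), 2)

print(to_bits(9))                        # [1 0 0 1]
print(from_bits([0.9, 0.1, 0.2, 0.8]))   # 9
# Caveat: codes 10-15 are unused and single bit errors map to arbitrary
# digits, which is why one-hot + softmax is the usual choice.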
1
vote
1 answer
How can I use my neural network model, trained on the MNIST database, on "real-world" digits such as my own handwritten digits?
I have developed a feed-forward ANN from scratch, trained (and evaluated) on the MNIST database, which contains 60,000 + 10,000 handwritten digit samples.
Can I test my model on other digits? For example, I write the digit 7 on paper with my pen and…
tail
- 147
- 6
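For the question above, most of the work is preprocessing the photo so it matches the MNIST format: grayscale, white digit on a black background, 28x28 pixels, and the same normalisation as the training data. A hedged sketch using Pillow (the file name and the final reshape for a dense network are placeholders for whatever the asker's model expects):

import numpy as np
from PIL import Image, ImageOps

def load_as_mnist(path):
    """Convert a photo of a handwritten digit into an MNIST-style input."""
    img = Image.open(path).convert("L")             # grayscale
    img = ImageOps.invert(img)                      # MNIST is white ink on black
    img = img.resize((28, 28))                      # same resolution as MNIST
    x = np.asarray(img, dtype=np.float32) / 255.0   # match training scaling
    return x.reshape(1, 28 * 28)                    # layout for a dense ANN

# x = load_as_mnist("my_digit_7.jpg")   # placeholder file name
# prediction = model.predict(x)         # your trained network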
1
vote
0 answers
Why does a VAE using a KL divergence with a non-standard mean not produce good images?
I know I can make a VAE do generation with a mean of 0 and std-dev of 1.
I tested it with the following loss function:
def loss(self, data, reconst, mu, sig):
    rl = self.reconLoss(reconst, data)
    #dl = self.divergenceLoss(mu, sig)
    std =…
axon
- 53
- 5
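For reference on the question above, the closed-form KL term for a diagonal Gaussian posterior N(mu, sigma^2) against a shifted prior N(m0, 1) is easy to write down. A minimal PyTorch sketch, where m0 stands in for whatever non-standard prior mean is desired:

import torch

def kl_to_shifted_prior(mu, sig, m0=3.0):
    """KL( N(mu, sig^2) || N(m0, 1) ), summed over latent dims, per sample.

    Closed form per dimension:
        log(1/sig) + (sig^2 + (mu - m0)^2) / 2 - 1/2
    m0 is a placeholder value for the non-standard prior mean.
    """
    kl = -torch.log(sig) + 0.5 * (sig ** 2 + (mu - m0) ** 2) - 0.5
    return kl.sum(dim=1)

mu = torch.zeros(4, 8)                 # batch of 4, latent dimension 8
sig = torch.ones(4, 8)
print(kl_to_shifted_prior(mu, sig))    # 8 * (0 + 0.5 * (1 + 9) - 0.5) = 36 each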
0
votes
0 answers
Reproducing Knowledge Distillation on MNIST data
I'm trying to implement Knowledge Distillation, specifically to reproduce the MNIST example given in the paper. My (PyTorch) implementation can be found here.
I would expect it to be self-evident that using this method indeed improves results…
Maverick Meerkat
- 412
- 3
- 11
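As a reference point for the reproduction attempt above, the core of Hinton-style distillation is matching temperature-softened teacher and student distributions plus the usual cross-entropy on the hard labels. A minimal, non-authoritative PyTorch sketch (the T and alpha values are arbitrary choices):

import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.7):
    """Hinton et al. style loss: soft-target KL term + hard-label CE term.

    The KL term is scaled by T^2 so its gradient magnitude stays comparable
    to the cross-entropy term, as suggested in the original paper.
    """
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=1),
        F.softmax(teacher_logits / T, dim=1),
        reduction="batchmean",
    ) * (T * T)
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard

# student_logits, teacher_logits: (batch, 10); labels: (batch,) ints 0-9.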
0
votes
0 answers
Adding MNIST images by using them as channel inputs
I'm trying to create a generative neural network that can produce "basic sum" solutions from the MNIST dataset, given a conditional input.
I've curated a dataset of MNIST examples ranging from 0 to 3, and arbitrarily combined them to…
Zintho
- 1
- 1
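One concrete way to set up the "basic sum" data described above is to stack two digit images as input channels and use the sum of their labels as the target. A small sketch, assuming arrays imgs and labels taken from MNIST and restricted to digits 0-3:

import numpy as np

def make_sum_pairs(imgs, labels, n_pairs=1000, seed=0):
    """Pair up digit images as two channels; the target is the sum of labels.

    imgs: (N, 28, 28) array of MNIST images with labels in {0, 1, 2, 3};
    returns X of shape (n_pairs, 28, 28, 2) and y of shape (n_pairs,).
    """
    rng = np.random.default_rng(seed)
    i = rng.integers(0, len(imgs), size=n_pairs)
    j = rng.integers(0, len(imgs), size=n_pairs)
    X = np.stack([imgs[i], imgs[j]], axis=-1)   # two digits as channels
    y = labels[i] + labels[j]                   # sums range from 0 to 6
    return X, y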
0
votes
1 answer
Training and validation loss are almost the same (perfect fit?)
I am developing an ANN from scratch which classifies MNIST digits.
These are the curves I get using only one hidden layer of 100 neurons with the ReLU activation function. The output neurons are activated by the softmax function:
Is it correct…
tail
- 147
- 6
0
votes
1 answer
How can a convnet learn with a 3x3 output layer?
While studying the "Deep Learning with Python" book, I came across this MNIST example, and this is what the last conv2d layer looks like:
_________________________________________________________________
conv2d_2 (Conv2D) (None, 3, 3, 64) …
Abdullah Akçam
- 101
- 1
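For the last question: the 3x3x64 tensor is not the network's final output. In the book's example that feature map is flattened into 3 * 3 * 64 = 576 values and passed through dense layers that produce the 10-class prediction. A hedged Keras sketch of that pattern (it follows the common MNIST convnet layout, not necessarily the book verbatim):

from tensorflow.keras import layers, models

model = models.Sequential([
    layers.Conv2D(32, (3, 3), activation="relu", input_shape=(28, 28, 1)),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(64, (3, 3), activation="relu"),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(64, (3, 3), activation="relu"),   # output shape (3, 3, 64)
    layers.Flatten(),                               # 3 * 3 * 64 = 576 features
    layers.Dense(64, activation="relu"),
    layers.Dense(10, activation="softmax"),         # the actual output layer
])
model.summary()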