Questions tagged [keras]

For questions related to Keras, the modular neural networks library written in Python. However, note that programming questions are off-topic here.

See: Keras Documentation

251 questions
37
votes
6 answers

Why do CNNs sometimes make highly confident mistakes, and how can one combat this problem?

I trained a simple CNN on the MNIST database of handwritten digits to 99% accuracy. I'm feeding in a bunch of handwritten digits, and non-digits from a document. I want the CNN to report errors, so I set a threshold of 90% certainty below which my…
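The thresholding step this excerpt describes can be sketched in a few lines. This is a minimal illustration, not the asker's code: the function names `softmax` and `predict_or_reject` and the example logits are hypothetical, and only the 0.9 threshold comes from the excerpt.

```python
import math


def softmax(logits):
    """Convert raw class scores into probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict_or_reject(logits, threshold=0.9):
    """Return (class_index, confidence), or (None, confidence) if below threshold.

    Hypothetical helper: the 0.9 default mirrors the 90% certainty
    threshold mentioned in the question excerpt.
    """
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    if probs[best] < threshold:
        return None, probs[best]  # flag as "not confident enough"
    return best, probs[best]


# One class clearly dominates: the prediction is accepted.
print(predict_or_reject([10.0, 0.0, 0.0]))
# All classes tie: the prediction is rejected.
print(predict_or_reject([1.0, 1.0, 1.0]))
```

A rule like this only filters on the softmax score. Inputs unlike the training data can still score above the threshold, which is the problem the question raises.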
22
votes
2 answers

Why would you implement the position-wise feed-forward network of the transformer with convolution layers?

The Transformer model introduced in "Attention is all you need" by Vaswani et al. incorporates a so-called position-wise feed-forward network (FFN): In addition to attention sub-layers, each of the layers in our encoder and decoder contains a…
8
votes
2 answers

Can LSTM neural networks be sped up by a GPU?

I am training LSTM neural networks with Keras on a small mobile GPU. The speed on the GPU is slower than on the CPU. I found some articles that say that it is hard to train LSTMs (and, in general, RNNs) on GPUs because the training cannot be…
Dieshe
8
votes
1 answer

Validation accuracy higher than training accuracy

I implemented the U-Net in TensorFlow for the segmentation of MRI images of the thigh. I noticed I always get a higher validation accuracy by a small gap, independently of the initial split. One example: So I researched when this could be…
Lis Louise
8
votes
2 answers

Effect of batch size and number of GPUs on model accuracy

I have a data set that was split using a fixed random seed, and I am going to use 80% of the data for training and the rest for validation. Here are my GPU and batch size configurations: use 64 batch size with one GTX 1080Ti; use 128 batch size with…
bit_scientist
7
votes
1 answer

Why does 'loss' change depending on the number of epochs chosen?

I am using Keras to train different NNs. I would like to know why, if I increase the number of epochs by 1, the result up to the new epoch is not the same. I am using shuffle=False and np.random.seed(2017), and I have checked that if I repeat with the same…
6
votes
1 answer

Deep Q-Learning poor convergence on Stochastic Environment

I'm trying to implement a Deep Q-network in Keras/TF that learns to play Minesweeper (our stochastic environment). I have noticed that the agent learns to play the game pretty well with both small and large board sizes. However, it only…
6
votes
2 answers

Two data classes for a convolutional neural network, can one have a LOT more images for training than the other?

I have two classes in the training set: one that has images with a feature and the other of images without that feature. Can there be a LOT more images with "no feature" so I can fit in all possible false positives?
Vasya T
6
votes
1 answer

Is it possible to use deep learning to give approximate solutions to NP-hard graph theory problems?

Is it possible to use deep learning to give approximate solutions to NP-hard graph theory problems? If we take, for example, the travelling salesman problem (or the dominating set problem). Let's say I have a bunch of smaller examples, where I…
6
votes
1 answer

How to graphically represent a RNN architecture implemented in Keras?

I'm trying to create a simple blogpost on RNNs, that should give a better insight into how they work in Keras. Let's say: model = keras.models.Sequential() model.add(keras.layers.SimpleRNN(5, return_sequences=True, input_shape=[None,…
6
votes
3 answers

Why are traditional ML models still used over deep neural networks?

I'm still taking my first steps in the Data Science field. I have played with some DL frameworks, like TensorFlow (pure) and Keras (on top), and know a little about some "classic machine learning" algorithms like decision trees, k-nearest neighbors,…
Douglas Ferreira
5
votes
1 answer

Over- and underestimations of the lowest and highest values in LSTM network

I'm training an LSTM network with multiple inputs and several LSTM layers in order to set up a time series gap filling procedure. The LSTM is trained bidirectionally with "tanh" activation on the outputs of the LSTM, and one Dense layer with…
5
votes
1 answer

How does backpropagation work on a custom loss function whose components have magnitudes of different orders?

I want to use a custom loss function which is a weighted combination of l1 and DSSIM losses. The DSSIM loss is limited between 0 and 0.5, whereas the l1 loss can be orders of magnitude greater, and is so in my case. How does backpropagation work in…
5
votes
1 answer

How to add a dense layer after a 2d convolutional layer in a convolutional autoencoder?

I am trying to implement a convolutional autoencoder with a dense layer at the bottleneck to do some dimensionality reduction. I have seen two approaches for this, which aren't particularly scalable. The first was to introduce 2 dense layers (one at…
5
votes
1 answer

How do I combine models trained on different data to increase classification accuracy?

I have two trained models. One uses a LinearSVC algorithm and is trained on numerical data from medical examinations of patients with diabetic retinopathy. The second one is a neural network trained on images of retina scans from patients with…