Questions tagged [hyperparameter-optimization]

For questions related to the concept of hyper-parameter optimization, that is, the task of finding the best hyper-parameters for a particular learning algorithm (e.g. gradient descent) or model (e.g. a multi-layer neural network) using an optimization method (e.g. Bayesian optimization or genetic algorithms).

For more info, see e.g. https://en.wikipedia.org/wiki/Hyperparameter_optimization.
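As a rough illustration of what such an optimization method does, here is a minimal random-search sketch; the `train_and_validate` function and the two hyper-parameters are hypothetical placeholders standing in for a real training and validation pipeline.

```python
import random

# Hypothetical stand-in for "train the model with these hyper-parameters
# and return a validation score"; replace with real training/validation code.
def train_and_validate(learning_rate, num_hidden_units):
    return -(learning_rate - 0.01) ** 2 - 1e-5 * (num_hidden_units - 64) ** 2

best_score, best_params = float("-inf"), None
for _ in range(50):  # 50 random trials
    params = {
        "learning_rate": 10 ** random.uniform(-4, -1),            # log-uniform sample
        "num_hidden_units": random.choice([16, 32, 64, 128, 256]),
    }
    score = train_and_validate(**params)
    if score > best_score:
        best_score, best_params = score, params

print("best hyper-parameters found:", best_params)
```

Bayesian optimization and genetic algorithms follow the same loop but pick the next trial based on the scores seen so far rather than sampling blindly.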

122 questions
65
votes
4 answers

How to select number of hidden layers and number of memory cells in an LSTM?

I am trying to find some existing research on how to select the number of hidden layers and the size of these layers for an LSTM-based RNN. Is there an article that investigates this problem, i.e., how many memory cells should one use? I assume it…
33
votes
4 answers

How to find the optimal number of neurons per layer?

When you're writing your algorithm, how do you know how many neurons you need per layer? Are there any methods for finding the optimal number, or is it just a rule of thumb?
24
votes
3 answers

How to choose an activation function for the hidden layers?

I choose the activation function for the output layer depending on the output that I need and the properties of the activation function that I know. For example, I choose the sigmoid function when I'm dealing with probabilities, a ReLU when I'm…
18
votes
2 answers

How do I decide the optimal number of layers for a neural network?

How do I decide the optimal number of layers for a neural network (feedforward or recurrent)?
16
votes
1 answer

Will parameter sweeping on one split of data followed by cross validation discover the right hyperparameters?

Let's call our dataset splits train/test/evaluate. We're in a situation where we require months of data, so we prefer to use the evaluation dataset as infrequently as possible to avoid polluting our results. Instead, we do 10-fold cross-validation…
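A minimal sketch of the setup this question describes, assuming scikit-learn and an SVC purely for illustration: the hyper-parameter sweep uses 10-fold cross-validation on the training split only, and the evaluation split is touched once at the end.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.svm import SVC

# Synthetic data standing in for the slowly collected real dataset.
X, y = make_classification(n_samples=1000, random_state=0)
X_train, X_eval, y_train, y_eval = train_test_split(X, y, test_size=0.2, random_state=0)

# Sweep hyper-parameters with 10-fold cross-validation on the training split only.
param_grid = {"C": [0.1, 1, 10], "gamma": ["scale", 0.01, 0.001]}
search = GridSearchCV(SVC(), param_grid, cv=10)
search.fit(X_train, y_train)
print("selected hyper-parameters:", search.best_params_)

# The evaluation split is used once, after the sweep, for the final estimate.
print("held-out score:", search.score(X_eval, y_eval))
```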
16
votes
2 answers

How can I automate the choice of the architecture of a neural network for an arbitrary problem?

Assume that I want to solve an issue with a neural network that either I can't fit to existing architectures (perceptron, Kohonen, etc.), or I'm simply not aware of the existence of those, or I'm unable to understand their mechanics and I rely on my…
8
votes
2 answers

Why should the number of neurons in a hidden layer be a power of 2?

I have read somewhere on the web (I lost the reference) that the number of units (or neurons) in a hidden layer should be a power of 2 because it helps the learning algorithm to converge faster. Is this a fact? If it is, why is this true? Does it…
7
votes
2 answers

How do we choose the kernel size depending on the problem?

Obviously, finding suitable hyper-parameters for a neural network is a complex task and problem- or domain-specific. However, there should be at least some "rules" that hold most of the time for the size of the filter (or kernel)! In most cases, intuition…
7
votes
3 answers

How to determine the embedding size?

When we are training a neural network, we need to determine the embedding size used to convert categorical (in NLP, for instance) or continuous (in computer vision or voice) information into hidden vectors (or embeddings), but I wonder if there…
7
votes
1 answer

How do we decide which membership function to use?

In classical set theory, there are two options for an element. It is either a member of a set or not. But in fuzzy set theory, there are membership functions to define the "rate" of an element being a member of a set. In other words, classical logic…
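For a concrete picture of such a "rate" of membership, here is a minimal triangular membership function; the triangular shape and the temperature example are assumptions, and trapezoidal or Gaussian functions are equally common choices.

```python
def triangular_membership(x, a, b, c):
    """Degree of membership in a fuzzy set shaped as a triangle with
    feet at a and c and peak at b; returns a value in [0, 1]."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)

# "Warm" as a fuzzy set over temperatures: 25 °C is fully warm, 18 °C or 32 °C not at all.
print(triangular_membership(22.0, a=18.0, b=25.0, c=32.0))  # partial membership, about 0.57
```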
7
votes
1 answer

An intuitive explanation of Adagrad, its purpose and its formula

It (Adagrad) adapts the learning rate to the parameters, performing smaller updates (i.e. low learning rates) for parameters associated with frequently occurring features, and larger updates (i.e. high learning rates) for parameters associated…
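A minimal NumPy sketch of the per-parameter update described in this excerpt, with illustrative default values for the learning rate and epsilon:

```python
import numpy as np

def adagrad_update(theta, grad, accum, lr=0.01, eps=1e-8):
    """One Adagrad step: accumulate squared gradients per parameter and
    shrink the effective learning rate where gradients have been large."""
    accum += grad ** 2
    theta -= lr * grad / (np.sqrt(accum) + eps)
    return theta, accum

# Toy usage: the first parameter receives frequent large gradients,
# so its effective step size shrinks faster than the others'.
theta, accum = np.zeros(3), np.zeros(3)
for grad in [np.array([1.0, 0.1, 0.0]), np.array([1.0, 0.0, 0.0])]:
    theta, accum = adagrad_update(theta, grad, accum)
print(theta)
```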
6
votes
2 answers

How to shorten the development time of a neural network?

I am developing an LSTM for sequence tagging. During development, I make various changes to the system, for example, adding new features or changing the number of nodes in the hidden layers. After each change, I check the accuracy using…
6
votes
2 answers

When training a CNN, what are the hyperparameters to tune first?

I am training a convolutional neural network for object detection. Apart from the learning rate, what are the other hyperparameters that I should tune? And in what order of importance? Besides, I read that doing a grid search for hyperparameters is…
6
votes
3 answers

What is a "surrogate model"?

In the following paragraph from the book Automated Machine Learning: Methods, Systems, Challenges (by Frank Hutter et al.): "In this section we first give a brief introduction to Bayesian optimization, present alternative surrogate models used in it,…"