Questions tagged [weights]
For questions about the concept of a weight (or parameter) of a machine learning model, such as a neural network or a linear regression model.
84 questions
66
votes
12 answers
In a CNN, does each new filter have different weights for each input channel, or are the same weights of each filter used across input channels?
My understanding is that the convolutional layer of a convolutional neural network has four dimensions: input_channels, filter_height, filter_width, number_of_filters. Furthermore, it is my understanding that each new filter just gets convolved…

Ryan Chase
- 793
- 1
- 6
- 6
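A quick way to see the layout in code is to inspect the weight tensor of a convolution layer. A minimal sketch, assuming PyTorch (not part of the question) and hypothetical layer sizes:

import torch.nn as nn

# Hypothetical sizes: 3 input channels, 8 filters, 5x5 kernels.
conv = nn.Conv2d(in_channels=3, out_channels=8, kernel_size=5)

# In PyTorch the weight shape is (number_of_filters, input_channels, filter_height, filter_width).
print(conv.weight.shape)  # torch.Size([8, 3, 5, 5])
# Each of the 8 filters has its own 3 x 5 x 5 weight block,
# i.e. different weights for each input channel.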
16
votes
5 answers
Why are the initial weights of neural networks randomly initialised?
This might sound silly to someone who has plenty of experience with neural networks but it bothers me...
Random initial weights might give you better results that would be somewhat closer to what a trained neural network should look like, but it…

Matas Vaitkevicius
- 271
- 5
- 12
7
votes
1 answer
Is there a proper initialization technique for the weight matrices in multi-head attention?
Self-attention layers have 4 learnable tensors (in the vanilla formulation):
Query matrix $W_Q$
Key matrix $W_K$
Value matrix $W_V$
Output matrix $W_O$
Nice illustration from https://jalammar.github.io/illustrated-transformer/
However, I do not…

spiridon_the_sun_rotator
- 2,454
- 8
- 16
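For illustration only, a minimal sketch (assuming PyTorch and a hypothetical model dimension) of one common default, Xavier/Glorot initialization applied to the four projection matrices; this is not presented as the "proper" technique the question asks about:

import torch.nn as nn

d_model = 512  # hypothetical model dimension

# The four learnable projections of a self-attention layer.
w_q = nn.Linear(d_model, d_model, bias=False)
w_k = nn.Linear(d_model, d_model, bias=False)
w_v = nn.Linear(d_model, d_model, bias=False)
w_o = nn.Linear(d_model, d_model, bias=False)

# One common choice: Xavier/Glorot uniform initialization for each projection.
for proj in (w_q, w_k, w_v, w_o):
    nn.init.xavier_uniform_(proj.weight)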
7
votes
0 answers
Why is there a Uniform and Normal version of He / Xavier initialization in DL libraries?
Two of the most popular initialization schemes for neural network weights today are Xavier and He. Both methods propose random weight initialization with a variance dependent on the number of input and output units. Xavier proposes
$$W \sim…

Tinu
- 618
- 1
- 4
- 12
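For reference (a standard fact, not taken from the truncated excerpt), the normal and uniform variants of Xavier initialization target the same variance; the uniform bound is chosen so that the variances match:

$$W \sim \mathcal{N}\!\left(0,\ \frac{2}{n_\text{in} + n_\text{out}}\right) \quad\text{or}\quad W \sim \mathcal{U}\!\left(-\sqrt{\frac{6}{n_\text{in} + n_\text{out}}},\ \sqrt{\frac{6}{n_\text{in} + n_\text{out}}}\right),$$

both with $\operatorname{Var}(W) = \frac{2}{n_\text{in} + n_\text{out}}$, since a uniform distribution on $[-a, a]$ has variance $a^2/3$. The He variants are analogous, with $2/n_\text{in}$ in place of $2/(n_\text{in} + n_\text{out})$.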
6
votes
2 answers
What is the goal of weight initialization in neural networks?
This is a simple question. I know the weights in a neural network can be initialized in many different ways like: random uniform distribution, normal distribution, and Xavier initialization. But what is the weight initialization trying to…

S2673
- 560
- 4
- 16
6
votes
2 answers
Can neurons in MLP and filters in CNN be compared?
I know they are not the same in how they work, but an input layer sends the input to $n$ neurons with a set of weights; based on these weights and the activation layer, it produces an output that can be fed to the next layer.
Aren't the filters the same,…

Tibo Geysen
- 193
- 5
5
votes
2 answers
What do the neural network's weights represent conceptually?
I understand how neural networks work and have studied their theory well.
My question is: On the whole, is there a clear understanding of how mutation occurs within a neural network from the input layer to the output layer, for both supervised and…

user248884
- 151
- 3
5
votes
1 answer
Why did the development of neural networks stop between the 50s and the 80s?
In a video lecture on the development of neural networks and the history of deep learning (you can start from minute 13), the lecturer (Yann LeCun) said that the development of neural networks stopped until the 80s because people were using the…

Daviiid
- 563
- 3
- 15
4
votes
1 answer
Do we know what the units of neural networks will do before we train them?
I was learning about back-propagation and, looking at the algorithm, there is no particular 'partiality' given to any unit. What I mean by partiality there is that you have no particular characteristic associated with any unit, and this results in…

Htnamus
- 43
- 6
4
votes
1 answer
In TD(0) with linear function approximation, why is the gradient of $\hat v(S^{\prime}, \mathbf w)$ wrt parameters $\mathbf w$ not considered?
I am reading these slides. On page 38, the update for the parameters for the linear function approximation of TD(0) is given. I have a doubt regarding this.
The cost function (RMSE) is given on page 37.
My doubt is: why is the gradient of $\hat…

A Yoghes
- 43
- 4
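For context (standard form of the update, e.g. Sutton and Barto, not quoted from the slides), TD(0) with function approximation uses a semi-gradient: only $\hat v(S, \mathbf w)$ is differentiated, while the bootstrapped target containing $\hat v(S', \mathbf w)$ is treated as a constant:

$$\mathbf w \leftarrow \mathbf w + \alpha \left[ R + \gamma\, \hat v(S', \mathbf w) - \hat v(S, \mathbf w) \right] \nabla_{\mathbf w}\, \hat v(S, \mathbf w),$$

which, for a linear approximator $\hat v(S, \mathbf w) = \mathbf w^\top \mathbf x(S)$, reduces to $\nabla_{\mathbf w}\, \hat v(S, \mathbf w) = \mathbf x(S)$.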
4
votes
0 answers
Why does sigmoid saturation prevent signal flow through the neuron?
As per these slides on page 35:
Sigmoids saturate and kill gradients.
When the neuron's activation saturates at either tail of 0 or 1, the gradient at these regions is almost zero.
[…] the gradient and almost no signal will flow through the neuron…

EEAH
- 193
- 1
- 5
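As a small numerical sketch (not part of the question): the sigmoid's derivative is $\sigma'(x) = \sigma(x)\,(1 - \sigma(x))$, which peaks at $0.25$ and collapses in the tails, so a saturated neuron multiplies the backpropagated signal by a near-zero factor.

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_grad(x):
    s = sigmoid(x)
    return s * (1.0 - s)

# Local gradient near zero input vs. deep in a saturated tail.
print(sigmoid_grad(0.0))   # 0.25   (maximum)
print(sigmoid_grad(5.0))   # ~0.0066
print(sigmoid_grad(10.0))  # ~4.5e-05 -> almost no signal flows back through the neuron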
4
votes
0 answers
When is using weight regularization bad?
Regularization of weights (e.g. L1 or L2) keeps them small and standardized, which can help reduce data overfitting. From this article, regularization sounds favorable in many cases, but is it always encouraged? Are there scenarios in which it…

mark mark
- 753
- 4
- 23
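For reference, the two penalties the question refers to are typically added to the data loss with a regularization strength $\lambda$ (standard formulation, not quoted from the linked article):

$$L_\text{total} = L_\text{data} + \lambda \sum_i |w_i| \ \ (\text{L1}), \qquad L_\text{total} = L_\text{data} + \lambda \sum_i w_i^2 \ \ (\text{L2}),$$

so a larger $\lambda$ pushes the weights toward zero, with L1 tending to produce exactly-zero (sparse) weights.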
4
votes
1 answer
How are the parameters of the Bernoulli distribution learned?
In the paper Deconstructing Lottery Tickets: Zeros, Signs, and the Supermask, they learn a mask for the network by setting up the mask parameters as $M_i = \text{Bern}(\sigma(v_i))$, where $M$ is the parameter mask ($f(x; \theta, M) = f(x; M \odot \theta)$),…

mshlis
- 2,349
- 7
- 23
3
votes
1 answer
What is the significance of weights in a feedforward neural network?
In a feedforward neural network, the inputs are fed directly to the outputs via a series of weights.
What purpose do the weights serve, and how are they significant in this neural network?

kenorb
- 10,423
- 3
- 43
- 91
3
votes
0 answers
Are there neural networks with (hard) constraints on the weights?
I don't know too much about Deep Learning, so my question might be silly. However, I was wondering whether there are NN architectures with some hard constraints on the weights of some layers. For example, let $(W^k_{ij})_{ij}$ be the weights of the…

Onil90
- 173
- 5