Most Popular

1500 questions
5 votes · 1 answer

How to detect vanishing gradients?

Can vanishing gradients be detected by the change in distribution (or lack thereof) of my convolution's kernel weights throughout the training epochs? And if so how? For example, if only 25% of my kernel's weights ever change throughout the epochs,…
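One way to make the idea in this question concrete is to snapshot a layer's kernel weights at each epoch and measure what fraction of them actually moved. A minimal NumPy sketch (the function name and tolerance are hypothetical, not from the question):

```python
import numpy as np

def fraction_changed(w_before, w_after, tol=1e-8):
    """Fraction of kernel weights whose value moved by more than tol
    between two epochs. A persistently low fraction for a layer can be
    one symptom (though not proof) of vanishing gradients there."""
    delta = np.abs(np.asarray(w_after) - np.asarray(w_before))
    return float(np.mean(delta > tol))

# Toy example: only 1 of 4 kernel weights changes between epochs.
w0 = np.zeros(4)
w1 = np.array([0.5, 0.0, 0.0, 0.0])
print(fraction_changed(w0, w1))  # 0.25
```

In practice one would also inspect the gradient norms per layer directly, since stagnant weights can have other causes (e.g. dead ReLUs or a low learning rate).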
5 votes · 1 answer

How to define an action space when an agent can take multiple sub-actions in a step?

I'm attempting to design an action space in OpenAI's gym and hitting the following roadblock. I've looked at this post which is closely related but subtly different. The environment I'm writing needs to allow an agent to make between $1$ and $n$…
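One common encoding for "between $1$ and $n$ sub-actions per step" is a fixed-length binary mask, which in Gym corresponds to `gym.spaces.MultiBinary(n)`. A sketch using NumPy only (the at-least-one constraint is enforced here by rejection sampling; the function name is hypothetical):

```python
import numpy as np

def sample_mask_action(n, rng):
    """Sample a length-n binary mask with at least one sub-action active,
    mimicking gym.spaces.MultiBinary(n) plus a non-empty constraint."""
    while True:
        mask = rng.integers(0, 2, size=n)
        if mask.any():  # reject the all-zeros mask (zero sub-actions)
            return mask

rng = np.random.default_rng(0)
action = sample_mask_action(5, rng)
print(action.sum() >= 1)  # True: at least one sub-action is chosen
```

Inside a real environment one would declare the space as `MultiBinary(n)` and, if the policy emits all zeros, either reject or penalize it.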
5 votes · 1 answer

Why not more TD(λ) in actor-critic algorithms?

Is there either an empirical or theoretical reason that actor-critic algorithms with eligibility traces have not been more fully explored? I was hoping to find a paper or implementation or both for continuous tasks (not episodic) in continuous…
5 votes · 1 answer

Is there a reason to use TensorFlow over PyTorch for research purposes?

I've been using PyTorch to do research for a while, and it seems quite easy to implement new things with. It is also easy to learn, and I haven't had any problems following other researchers' code so far. However, I wonder whether…
SpiderRico · 960 · 8 · 18
5 votes · 1 answer

Is the LSTM component a neuron or a layer?

Given the standard illustrative feed-forward neural net model, with the dots as neurons and the lines as neuron-to-neuron connections, what part is the (unfolded) LSTM cell (see picture)? Is it a neuron (a dot) or a layer?
MScott · 445 · 4 · 12
5 votes · 1 answer

How powerful is OpenAI's Gym and Universe in board games area?

I'm a big fan of computer board games and would like to make Python chess/go/shogi/mancala programs. Having heard of reinforcement learning, I decided to look at OpenAI Gym. But first of all, I would like to know, is it possible using OpenAI…
Taissa · 63 · 4
5 votes · 2 answers

What are examples of approaches to dimensionality reduction of feature vectors?

Given a pre-trained CNN model, I extract feature vectors of images in the reference and query datasets, each with several thousand elements. I would like to apply some techniques to reduce the feature vector dimension to speed up cosine…
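A standard answer to this question is PCA: project the feature vectors onto their top-k principal components before running cosine-similarity search. A minimal sketch via SVD in NumPy (shapes and k are illustrative assumptions):

```python
import numpy as np

def pca_reduce(X, k):
    """Project rows of X (n_samples, n_features) onto the top-k
    principal components, computed from the centered data via SVD."""
    Xc = X - X.mean(axis=0)                       # center features
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:k].T                          # (n_samples, k)

# Hypothetical CNN features: 100 images, 512-dim vectors -> 64 dims.
X = np.random.default_rng(0).normal(size=(100, 512))
Z = pca_reduce(X, 64)
print(Z.shape)  # (100, 64)
```

In practice the projection matrix `Vt[:k]` would be fit on the reference set once and reused for queries; libraries such as scikit-learn wrap this in `PCA(n_components=k)`.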
5 votes · 2 answers

In deep learning, is it possible to use discontinuous activation functions?

In deep learning, is it possible to use discontinuous activation functions (e.g. one with jump discontinuity)? (My guess: for example, ReLU is non-differentiable at a single point, but it still has a well-defined derivative. If an activation…
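The core issue the question hints at can be seen with the Heaviside step function, a jump-discontinuous activation: its derivative is zero everywhere away from the jump, so backpropagation receives no useful signal. A small illustration (function names are mine, not from the question):

```python
import numpy as np

def heaviside(x):
    """Jump-discontinuous activation: 1 for x >= 0, else 0."""
    return (x >= 0).astype(float)

def heaviside_grad(x):
    """The derivative is 0 everywhere except at the jump (x = 0),
    where it is undefined -- gradient descent gets no signal."""
    return np.zeros_like(x)

x = np.array([-1.0, 0.5, 2.0])
print(heaviside(x))       # [0. 1. 1.]
print(heaviside_grad(x))  # [0. 0. 0.]
```

By contrast, ReLU is continuous and non-differentiable only at a single point, where frameworks simply pick a subgradient (usually 0), so training still works.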
5 votes · 1 answer

Which deep learning models are suitable for image-to-image mapping?

I am working on a problem in which I need to train a neural network to map one or more input images to one or more output images (1 channel per image). Below I report some examples of input & output. In this case I report 1 input and 1 output image,…
5 votes · 1 answer

Autoencoder produces repeated artifacts after convergence

As an experiment, I have tried using an autoencoder to encode height data from the Alps; however, the decoded image is very pixellated after training for several hours, as shown in the image below. This repeating pattern is larger than the final kernel…
5 votes · 1 answer

Why is a softmax used rather than dividing each activation by the sum?

Just wondering why a softmax is typically used in practice on outputs of most neural nets rather than just summing the activations and dividing each activation by the sum. I know it's roughly the same thing but what is the mathematical reasoning…
user8714896 · 717 · 1 · 4 · 21
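Part of the answer to the softmax question can be shown numerically: dividing by the plain sum breaks down when activations can be negative (or sum to zero), while exponentiating first guarantees a valid probability distribution. A quick sketch:

```python
import numpy as np

def softmax(z):
    """Exponentiate, then normalize. The max-shift changes nothing
    mathematically but avoids overflow."""
    z = np.asarray(z, dtype=float)
    e = np.exp(z - z.max())
    return e / e.sum()

def naive_normalize(z):
    """Just divide each activation by the sum -- not safe in general."""
    z = np.asarray(z, dtype=float)
    return z / z.sum()

z = np.array([2.0, -1.0, 0.0])
print(softmax(z))          # all entries positive, sums to 1
print(naive_normalize(z))  # contains a negative "probability"
```

Softmax is also the gradient-friendly choice: paired with cross-entropy loss it yields the simple gradient `p - y`, which plain sum-normalization does not.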
5 votes · 1 answer

Why do we average gradients and not loss in distributed training?

I'm running some distributed trainings in Tensorflow with Horovod. It runs training separately on multiple workers, each of which uses the same weights and does a forward pass on unique data. Computed gradients are averaged within the communicator…
pSoLT · 161 · 2
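The key fact behind this question is that differentiation is linear: the gradient of the averaged loss equals the average of the per-worker gradients, so averaging gradients is the same update one would get from averaging losses first. A toy check with a hypothetical scalar loss `L_i(w) = (w - x_i)^2` per worker:

```python
import numpy as np

def grad(w, x):
    """dL_i/dw for the per-worker loss L_i(w) = (w - x_i)^2."""
    return 2.0 * (w - x)

w = 3.0
xs = np.array([1.0, 2.0, 4.0])                 # one data point per worker
avg_of_grads = np.mean([grad(w, x) for x in xs])
grad_of_avg_loss = 2.0 * (w - xs.mean())       # d/dw of mean_i (w - x_i)^2
print(np.isclose(avg_of_grads, grad_of_avg_loss))  # True
```

Averaging gradients rather than losses is then a systems choice: each worker can backpropagate locally, and only the gradient tensors need to be communicated (e.g. via Horovod's allreduce).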
5 votes · 1 answer

Is running more epochs really a direct cause of overfitting?

I've seen some comments in online articles/tutorials or Stack Overflow questions which suggest that increasing the number of epochs can result in overfitting. But my intuition tells me that there should be no direct relationship at all between the…
Alexander Soare · 1,319 · 2 · 11 · 26
5 votes · 1 answer

What is a "batch" in batch normalization?

I'm working on an example of CNN with the MNIST hand-written numbers dataset. Currently I've got convolution -> pool -> dense -> dense, and for the optimiser I'm using Mini-Batch Gradient Descent with a batch size of 32. Now this concept of batch…
Alexander Soare · 1,319 · 2 · 11 · 26
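For this question, the "batch" in batch normalization is exactly the mini-batch the optimiser is already using (here, 32 examples): the layer normalizes each feature using the mean and variance computed over the batch axis. A minimal NumPy sketch of the training-time computation (without the learned scale/shift parameters):

```python
import numpy as np

def batch_norm(x, eps=1e-5):
    """Normalize each feature using statistics over the batch axis
    (axis 0). x has shape (batch_size, n_features)."""
    mu = x.mean(axis=0)          # per-feature mean over the 32 examples
    var = x.var(axis=0)          # per-feature variance over the batch
    return (x - mu) / np.sqrt(var + eps)

# Hypothetical activations: batch of 32 examples, 4 features each.
x = np.random.default_rng(0).normal(5.0, 3.0, size=(32, 4))
y = batch_norm(x)
print(np.allclose(y.mean(axis=0), 0.0, atol=1e-6))  # True: ~zero mean
```

At inference time frameworks replace the batch statistics with running averages accumulated during training, since a single example has no meaningful batch mean.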
5 votes · 2 answers

How to understand the concept of self-supervised learning in AI?

I am new to self-supervised learning and it all seems a little magical at the moment. The only way I can get an intuitive understanding is to assume that, for real-world problems, features are still embedded at a per-object level. For example, to…