Most Popular
1500 questions
17
votes
1 answer
How can policy gradients be applied in the case of multiple continuous actions?
Trust Region Policy Optimization (TRPO) and Proximal Policy Optimization (PPO) are two cutting-edge policy gradient algorithms.
When using a single continuous action, normally, you would use some probability distribution (for example, Gaussian)…

Evalds Urtans
- 377
- 3
- 9
17
votes
2 answers
Are softmax outputs of classifiers true probabilities?
BACKGROUND: The softmax function is the most common choice for an activation function for the last dense layer of a multiclass neural network classifier. The outputs of the softmax function have mathematical properties of probabilities and are--in…

Snehal Patel
- 912
- 1
- 1
- 25
17
votes
4 answers
How to reinvent jobs replaced by AI?
In general, what possibilities are there for reinventing job descriptions that could be replaced by an automated AI solution?
My initial ideas include:
Monitoring the AI and flagging its incorrect actions.
Possibly taking over the control in very…

tuomastik
- 221
- 2
- 10
17
votes
2 answers
When is deep learning overkill?
For example, for classifying emails as spam, is it worthwhile - from a time/accuracy perspective - to apply deep learning (if possible) instead of another machine learning algorithm? Will deep learning make other machine learning algorithms like…

Alexander
- 293
- 1
- 8
17
votes
1 answer
Are information processing rules from Gestalt psychology still used in computer vision today?
Decades ago there were, and still are, books on machine vision that, by implementing various information-processing rules from Gestalt psychology, got impressive results with little code or special hardware in image identification and visual…

Gottfried William
- 343
- 1
- 11
17
votes
1 answer
What is a fully convolutional network?
I was surveying some literature related to Fully Convolutional Networks and came across the following phrase,
A fully convolutional network is achieved by replacing the parameter-rich fully connected layers in standard CNN architectures by…

r4bb1t
- 305
- 1
- 2
- 8
17
votes
1 answer
What is the intuition behind the dot product attention?
I am watching the video Attention Is All You Need by Yannic Kilcher.
My question is: what is the intuition behind the dot product attention?
$$A(q, K, V) = \sum_i \frac{e^{q \cdot k_i}}{\sum_j e^{q \cdot k_j}} v_i$$
becomes:
$$A(Q,K, V) = \text{softmax}(QK^T)V$$
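One way to read the two formulas above: each query scores every key by a dot product, the scores are turned into weights with a softmax, and the output is the weighted average of the values. The sketch below is only an illustration in plain Python, not code from the video or the paper. The function names (`softmax`, `attend`, `attend_batch`) and the toy data are made up for this example, and it follows the unscaled form written in the question (the paper additionally divides the scores by $\sqrt{d_k}$).

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(q, K, V):
    """Single query: sum_i softmax_i(q . k_i) * v_i (first formula)."""
    scores = [sum(qa * ka for qa, ka in zip(q, k)) for k in K]
    weights = softmax(scores)
    dim = len(V[0])
    return [sum(w * v[d] for w, v in zip(weights, V)) for d in range(dim)]

def attend_batch(Q, K, V):
    """Many queries: softmax(Q K^T) V with the softmax taken per row."""
    return [attend(q, K, V) for q in Q]

# Hypothetical toy data: two keys and two values, each of dimension 2.
K = [[1.0, 0.0], [0.0, 1.0]]
V = [[10.0, 0.0], [0.0, 10.0]]
Q = [[5.0, 0.0],   # strongly aligned with the first key
     [0.0, 0.0]]   # equally (un)aligned with both keys
out = attend_batch(Q, K, V)
print(out)
```

The first query points along the first key, so its output lands close to the first value; the second query scores both keys equally, so its output is the plain average of the two values. In that sense the dot product acts as a similarity measure that decides how much of each value gets mixed in.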

DRV
- 1,573
- 2
- 11
- 18
17
votes
3 answers
How would an AI learn language?
I was thinking about AIs and how they would work, when I realised that I couldn't think of a way that an AI could be taught language. A child tends to learn language through associations of language and pictures to an object (e.g., people saying the…

AvahW
- 275
- 1
- 8
17
votes
1 answer
How does "Monte-Carlo search" work?
I have heard about this concept in a Reddit post about AlphaGo. I have tried to go through the paper and the article, but could not really make sense of the algorithm.
So, can someone give an easy-to-understand explanation of how the Monte-Carlo…

Dawny33
- 1,371
- 13
- 29
17
votes
3 answers
What is geometric deep learning?
What is geometric deep learning (GDL)?
Here are a few sub-questions
How is it different from deep learning?
Why do we need GDL?
What are some applications of GDL?

nbro
- 39,006
- 12
- 98
- 176
16
votes
1 answer
Why do you not see dropout layers on reinforcement learning examples?
I've been looking at reinforcement learning, and specifically playing around with creating my own environments to use with the OpenAI Gym AI. I am using agents from the stable_baselines project to test with it.
One thing I've noticed in virtually…

Matt Hamilton
- 293
- 2
- 5
16
votes
3 answers
How to implement a variable action space in Proximal Policy Optimization?
I'm coding a Proximal Policy Optimization (PPO) agent with the Tensorforce library (which is built on top of TensorFlow).
The first environment was very simple. Now, I'm diving into a more complex environment, where all the actions are not available…

Max
- 163
- 1
- 6
16
votes
5 answers
What is the most general definition of "intelligence"?
When we talk about artificial intelligence, human intelligence, or any other form of intelligence, what do we mean by the term intelligence in a general sense? What would you call intelligent and what not? In other words, how do we define the term…

user79161
- 359
- 1
- 12
16
votes
5 answers
Why are the initial weights of neural networks randomly initialised?
This might sound silly to someone who has plenty of experience with neural networks but it bothers me...
Random initial weights might give you better results that would be somewhat closer to what a trained neural network should look like, but it…

Matas Vaitkevicius
- 271
- 5
- 12
16
votes
3 answers
Are there any applications of reinforcement learning other than games?
Is there a way to teach reinforcement learning in applications other than games?
The only examples I can find on the Internet are of game agents. I understand that VNCs control the input to the games via the reinforcement network. Is it possible…

Mark Markrowave Charlton
- 357
- 2
- 9