Most Popular

1500 questions
17 votes
1 answer

How can policy gradients be applied in the case of multiple continuous actions?

Trust Region Policy Optimization (TRPO) and Proximal Policy Optimization (PPO) are two cutting-edge policy gradient algorithms. When using a single continuous action, you would normally use some probability distribution (for example, Gaussian)…
17 votes
2 answers

Are softmax outputs of classifiers true probabilities?

BACKGROUND: The softmax function is the most common choice of activation function for the last dense layer of a multiclass neural network classifier. The outputs of the softmax function have the mathematical properties of probabilities and are--in…
17 votes
4 answers

How to reinvent jobs replaced by AI?

In general, what possibilities are there for reinventing jobs that could be replaced by an automated AI solution? My initial ideas include: monitoring the AI and flagging its incorrect actions; possibly taking over control in very…
tuomastik
17 votes
2 answers

When is deep learning overkill?

For example, for classifying emails as spam, is it worthwhile - from a time/accuracy perspective - to apply deep learning (if possible) instead of another machine learning algorithm? Will deep learning make other machine learning algorithms like…
17 votes
1 answer

Are information processing rules from Gestalt psychology still used in computer vision today?

Decades ago there were, and still are, books on machine vision which, by implementing various information processing rules from Gestalt psychology, got impressive results with little code or special hardware in image identification and visual…
17 votes
1 answer

What is a fully convolutional network?

I was surveying some literature related to Fully Convolutional Networks and came across the following phrase: A fully convolutional network is achieved by replacing the parameter-rich fully connected layers in standard CNN architectures by…
17 votes
1 answer

What is the intuition behind the dot product attention?

I am watching the video Attention Is All You Need by Yannic Kilcher. My question is: what is the intuition behind the dot product attention? $$A(q, K, V) = \sum_i \frac{e^{q \cdot k_i}}{\sum_j e^{q \cdot k_j}} v_i$$ becomes: $$A(Q, K, V) = \text{softmax}(QK^T)V$$
DRV
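As a rough illustration of the two formulas quoted in the question above, here is a minimal NumPy sketch. It assumes the rows of `K` and `V` are the keys `k_i` and values `v_i`, and it uses the unscaled form exactly as written in the excerpt (no division by the square root of the key dimension). The function names `softmax`, `attention_single`, and `attention_batch` are made up for this example and do not come from the video or any library.

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the max for numerical stability; the result is unchanged.
    x = x - np.max(x, axis=axis, keepdims=True)
    e = np.exp(x)
    return e / np.sum(e, axis=axis, keepdims=True)

def attention_single(q, K, V):
    # Per-query form: weight each value v_i by softmax over the scores q . k_i.
    weights = softmax(K @ q)      # shape (n_keys,)
    return weights @ V            # shape (d_value,)

def attention_batch(Q, K, V):
    # Matrix form: softmax(Q K^T) V, with the softmax taken over each row.
    return softmax(Q @ K.T, axis=-1) @ V

# Hypothetical sizes: 3 queries, 5 key/value pairs, key dim 4, value dim 2.
rng = np.random.default_rng(0)
Q = rng.normal(size=(3, 4))
K = rng.normal(size=(5, 4))
V = rng.normal(size=(5, 2))

# Applying the per-query formula to each row of Q matches the matrix formula.
row_by_row = np.stack([attention_single(q, K, V) for q in Q])
assert np.allclose(row_by_row, attention_batch(Q, K, V))
```

One way to read this: each dot product scores how well a query lines up with a key, the softmax turns those scores into weights that sum to 1, and the output is the corresponding weighted average of the values.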
17 votes
3 answers

How would an AI learn language?

I was thinking about AIs and how they would work, when I realised that I couldn't think of a way that an AI could be taught language. A child tends to learn language through associations of language and pictures with an object (e.g., people saying the…
AvahW
17 votes
1 answer

How does "Monte-Carlo search" work?

I have heard about this concept in a Reddit post about AlphaGo. I have tried to go through the paper and the article, but could not really make sense of the algorithm. So, can someone give an easy-to-understand explanation of how the Monte-Carlo…
Dawny33
17 votes
3 answers

What is geometric deep learning?

What is geometric deep learning (GDL)? Here are a few sub-questions: How is it different from deep learning? Why do we need GDL? What are some applications of GDL?
16 votes
1 answer

Why do you not see dropout layers in reinforcement learning examples?

I've been looking at reinforcement learning, and specifically playing around with creating my own environments to use with the OpenAI Gym AI. I am using agents from the stable_baselines project to test with it. One thing I've noticed in virtually…
16 votes
3 answers

How to implement a variable action space in Proximal Policy Optimization?

I'm coding a Proximal Policy Optimization (PPO) agent with the Tensorforce library (which is built on top of TensorFlow). The first environment was very simple. Now, I'm diving into a more complex environment, where not all the actions are available…
16 votes
5 answers

What is the most general definition of "intelligence"?

When we talk about artificial intelligence, human intelligence, or any other form of intelligence, what do we mean by the term intelligence in a general sense? What would you call intelligent and what not? In other words, how do we define the term…
user79161
16 votes
5 answers

Why are the initial weights of neural networks randomly initialised?

This might sound silly to someone who has plenty of experience with neural networks but it bothers me... Random initial weights might give you better results that would be somewhat closer to what a trained neural network should look like, but it…
16 votes
3 answers

Are there any applications of reinforcement learning other than games?

Is there a way to teach reinforcement learning in applications other than games? The only examples I can find on the Internet are of game agents. I understand that VNCs control the input to the games via the reinforcement network. Is it possible…