Most Popular

1500 questions
19
votes
1 answer

What is the difference between tree search and graph search?

I have read various answers to this question at different places, but I am still missing something. What I have understood is that a graph search holds a closed list, with all expanded nodes, so they don't get explored again. However, if you apply…
xava
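The distinction the excerpt describes can be sketched in a few lines. This is only an illustration under assumptions: the toy graph, the function names `tree_search` and `graph_search`, and the breadth-first expansion order are all hypothetical and not taken from the question. The only difference between the two functions is the closed set.

```python
from collections import deque

def tree_search(graph, start, goal, max_expansions=1000):
    """Breadth-first tree search: no memory of expanded nodes."""
    frontier = deque([[start]])
    expansions = 0
    while frontier and expansions < max_expansions:
        path = frontier.popleft()
        node = path[-1]
        if node == goal:
            return path, expansions
        expansions += 1
        for neighbour in graph[node]:
            # The same state may be re-added any number of times.
            frontier.append(path + [neighbour])
    return None, expansions

def graph_search(graph, start, goal):
    """Breadth-first graph search: a closed set blocks re-expansion."""
    frontier = deque([[start]])
    closed = set()
    expansions = 0
    while frontier:
        path = frontier.popleft()
        node = path[-1]
        if node == goal:
            return path, expansions
        if node in closed:
            continue  # already expanded via another path
        closed.add(node)
        expansions += 1
        for neighbour in graph[node]:
            if neighbour not in closed:
                frontier.append(path + [neighbour])
    return None, expansions

# Hypothetical toy graph with cycles (A-B, A-C, B-D, C-D, D-E).
graph = {
    "A": ["B", "C"],
    "B": ["A", "D"],
    "C": ["A", "D"],
    "D": ["B", "C", "E"],
    "E": [],
}

tree_path, tree_expansions = tree_search(graph, "A", "E")
graph_path, graph_expansions = graph_search(graph, "A", "E")
```

On this graph both variants return the same path, but the tree-search variant expands more nodes because it revisits states reached through cycles.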
19
votes
4 answers

What activation function does the human brain use?

Does the human brain use a specific activation function? I've tried doing some research, and as it's a threshold for whether the signal is sent through a neuron or not, it sounds a lot like ReLU. However, I can't find a single article confirming…
mlman
19
votes
2 answers

How to implement an "unknown" class in multi-class classification with neural networks?

For example, I need to detect classes for MNIST data. However, instead of only the 10 digit classes, I also want an 11th class, "not a digit", so that any letter, any other type of image, or random noise would be classified as "not a digit".…
19
votes
5 answers

How can I design and train a neural network to play a card game (similar to Magic: The Gathering)?

Introduction: I am currently writing an engine to play a card game, as there is no engine yet for this particular game. About the game: The game is similar to Magic: The Gathering. There is a commander, which has health and abilities. Players have an…
pcaston2
19
votes
1 answer

Why has the cross-entropy become the classification standard loss function and not Kullback-Leibler divergence?

The cross-entropy is identical to the KL divergence plus the entropy of the target distribution. The KL divergence equals zero when the two distributions are the same, which seems more intuitive to me than the entropy of the target distribution,…
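The identity stated in the excerpt, H(p, q) = H(p) + KL(p || q), can be checked numerically. The following is a minimal sketch, assuming discrete distributions given as lists of probabilities; the helper names and the example values of `p` and `q` are hypothetical.

```python
import math

def entropy(p):
    """Shannon entropy H(p) in nats."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0)

def cross_entropy(p, q):
    """Cross-entropy H(p, q) of target p and prediction q, in nats."""
    return -sum(pi * math.log(qi) for pi, qi in zip(p, q) if pi > 0)

def kl_divergence(p, q):
    """KL divergence KL(p || q) in nats."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Hypothetical target and predicted distributions.
p = [0.7, 0.2, 0.1]
q = [0.5, 0.3, 0.2]

lhs = cross_entropy(p, q)
rhs = entropy(p) + kl_divergence(p, q)
```

Because H(p) does not depend on q, the two quantities differ only by a constant with respect to the model's predictions when the target distribution is fixed.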
19
votes
1 answer

What is the number of neurons required to approximate a polynomial of degree n?

I learned about the universal approximation theorem from this guide. It states that a network even with a single hidden layer can approximate any function within some bound, given a sufficient number of neurons. Or mathematically, ${|g(x)−f(x)|<…
19
votes
2 answers

What are the main differences between skip-gram and continuous bag of words?

The skip-gram and continuous bag of words (CBOW) are two different types of word2vec models. What are the main differences between them? What are the pros and cons of both methods?
DRV
19
votes
11 answers

What purpose would be served by developing AIs that experience human-like emotions?

In a recent Wall Street Journal article, Yann LeCun makes the following statement: The next step in achieving human-level AI is creating intelligent, but not autonomous, machines. The AI system in your car will get you safely home, but won’t choose…
mindcrime
19
votes
3 answers

Are there any computational models of mirror neurons?

From Wikipedia: A mirror neuron is a neuron that fires both when an animal acts and when the animal observes the same action performed by another. Mirror neurons are related to imitation learning, a very useful feature that is missing in current…
rcpinto
19
votes
2 answers

What limits, if any, does the halting problem put on Artificial Intelligence?

Given the proven halting problem for Turing machines, can we infer limits on the ability of strong Artificial Intelligence?
WilliamKF
18
votes
4 answers

Where can I find the original paper that introduced RNNs?

I was able to find the original paper on LSTM, but I was not able to find the paper that introduced "vanilla" RNNs. Where can I find it?
18
votes
1 answer

How does LSTM in deep reinforcement learning differ from experience replay?

In the paper Deep Recurrent Q-Learning for Partially Observable MDPs, the author processed the Atari game frames with an LSTM layer at the end. My questions are: How does this method differ from the experience replay, as they both use past…
18
votes
4 answers

What is the difference between actor-critic and advantage actor-critic?

I'm struggling to understand the difference between actor-critic and advantage actor-critic. At least, I know they are different from asynchronous advantage actor-critic (A3C), as A3C adds an asynchronous mechanism that uses multiple worker agents…
18
votes
4 answers

Why do we need floats for using neural networks?

Is it possible to make a neural network that uses only integers by scaling the input and output of each function to [-INT_MAX, INT_MAX]? Are there any drawbacks?
elimohl
18
votes
3 answers

How do I choose the best algorithm for a board game like checkers?

How do I choose the best algorithm for a board game like checkers? So far, I have considered only three algorithms, namely, minimax, alpha-beta pruning, and Monte Carlo tree search (MCTS). Apparently, both the alpha-beta pruning and MCTS are…