Most Popular

1500 questions
14
votes
3 answers

How does noise affect generalization?

Does increasing the noise in data help to improve the learning ability of a network? Does it make any difference or does it depend on the problem being solved? How does it affect the generalization process overall?
kenorb
14
votes
2 answers

Should deep residual networks be viewed as an ensemble of networks?

The question is about the architecture of Deep Residual Networks (ResNets), the model that won first place in all five main tracks of the "Large Scale Visual Recognition Challenge 2015" (ILSVRC2015): ImageNet Classification: “Ultra-deep” (quote…
14
votes
3 answers

Has anyone thought about making a neural network ask questions, instead of only answering them?

Most people are trying to answer questions with a neural network. However, has anyone come up with ideas about how to make a neural network ask questions instead of answering them? For example, if a CNN can decide which category an…
cha
14
votes
7 answers

Is consciousness necessary for any AI task?

Consciousness is challenging to define, but for this question let's define it as "actually experiencing sensory input as opposed to just putting a bunch of data through an inanimate machine." Humans, of course, have minds; for normal computers, all…
Ben N
14
votes
1 answer

What are the state-of-the-art results on the generalization ability of deep learning methods?

I've read a few classic papers on different architectures of deep CNNs used to solve varied image-related problems. I'm aware there's some paradox in how deep networks generalize well despite seemingly overfitting training data. A lot of people in…
14
votes
2 answers

How should I encode the structure of a neural network into a genome?

For a deterministic problem space, I need to find a neural network with the optimal node and link structure. I want to use a genetic algorithm to simulate many neural networks to find the best network structure for the problem domain. I've never…
14
votes
3 answers

Is there a way to understand neural networks without using the concept of brain?

Is there a way to understand, for instance, a multi-layered perceptron without hand-waving about it being similar to a brain, etc.? For example, it is obvious that what a perceptron does is approximate a function; there might be many other ways,…
14
votes
1 answer

What are the implications of the "No Free Lunch" theorem for machine learning?

The No Free Lunch (NFL) theorem states (see the paper Coevolutionary Free Lunches by David H. Wolpert and William G. Macready) that any two algorithms are equivalent when their performance is averaged across all possible problems. Is the "No Free Lunch"…
user9947
14
votes
3 answers

What is the relationship between the size of the hidden layer and the size of the cell state layer in an LSTM?

I was following some examples to get familiar with TensorFlow's LSTM API, but noticed that all LSTM initialization functions require only the num_units parameter, which denotes the number of hidden units in a cell. According to what I have learned…
14
votes
4 answers

What is the relevance of AIXI on current artificial intelligence research?

From Wikipedia: AIXI ['ai̯k͡siː] is a theoretical mathematical formalism for artificial general intelligence. It combines Solomonoff induction with sequential decision theory. AIXI was first proposed by Marcus Hutter in 2000[1] and the results…
rcpinto
14
votes
2 answers

What is the difference between graph convolution in the spatial vs spectral domain?

I've been reading different papers regarding graph convolution and it seems that they come in two flavors: spatial and spectral. From what I can see, the main difference between the two approaches is that for spatial you're directly multiplying the…
14
votes
3 answers

What sort of mathematical problems are there in AI that people are working on?

I recently got an 18-month postdoc position in a math department. It's a position with relatively light teaching duties and a lot of freedom about what type of research I want to do. Previously I was mostly doing some research in probability and…
faceclean
14
votes
4 answers

Did Minsky and Papert know that multi-layer perceptrons could solve XOR?

In their famous book entitled Perceptrons: An Introduction to Computational Geometry, Minsky and Papert show that a perceptron can't solve the XOR problem. This contributed to the first AI winter, resulting in funding cuts for neural networks.…
rcpinto
14
votes
2 answers

How large should the replay buffer be?

I'm learning the DDPG algorithm by following the OpenAI Spinning Up documentation on DDPG, which states: In order for the algorithm to have stable behavior, the replay buffer should be large enough to contain a wide range of…
14
votes
3 answers

Why exactly do neural networks require i.i.d. data?

In reinforcement learning, successive states (actions and rewards) can be correlated. An experience replay buffer was used, in the DQN architecture, to avoid training the neural network (NN), which represents the $Q$ function, with correlated (or…
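
For readers unfamiliar with the experience replay buffer mentioned in the last two questions, here is a minimal sketch of the idea. It is not the DQN or DDPG reference implementation; the class name `ReplayBuffer`, its methods, and the capacity and batch size used below are all hypothetical choices made for illustration. It assumes a fixed-capacity buffer that discards the oldest transitions first and samples uniformly at random, so that a minibatch mixes transitions from different time steps rather than consecutive, correlated ones.

```python
import random
from collections import deque


class ReplayBuffer:
    """Fixed-capacity store of transitions, sampled uniformly at random.

    Hypothetical illustration only; names and defaults are assumptions.
    """

    def __init__(self, capacity, seed=None):
        # A deque with maxlen silently drops the oldest item once it is full.
        self.storage = deque(maxlen=capacity)
        self.rng = random.Random(seed)

    def add(self, state, action, reward, next_state, done):
        # Store one transition as a tuple.
        self.storage.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        # Uniform sampling without replacement: the batch is drawn from the
        # whole buffer, not from the most recent consecutive steps.
        return self.rng.sample(list(self.storage), batch_size)

    def __len__(self):
        return len(self.storage)


# Usage: fill the buffer past its capacity, then draw one minibatch.
buffer = ReplayBuffer(capacity=1000, seed=0)
for t in range(1500):
    buffer.add(state=t, action=0, reward=0.0, next_state=t + 1, done=False)
batch = buffer.sample(32)
```

The capacity controls the trade-off raised in the replay buffer question: a larger buffer holds a wider range of past experience, while a smaller one keeps only recent transitions.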