Most Popular
1500 questions
5
votes
3 answers
How widely accepted is the definition of intelligence by Marcus Hutter & Shane Legg?
I came across several papers by M. Hutter & S. Legg.
Especially this one:
Universal Intelligence: A Definition of Machine Intelligence, Shane Legg, Marcus Hutter
Given that it was published back in 2007, how much recognition or agreement has it…

Aether
- 265
- 2
- 7
5
votes
1 answer
What is the intuition behind variational inference for Bayesian neural networks?
I'm trying to understand the concept of variational inference for BNNs. My source is this work. The aim is to minimize the divergence between the approximate distribution and the true posterior:
$$\text{KL}(q_{\theta}(w)||p(w|D)) = \int q_{\theta}(w) \…

f_3464gh
- 99
- 6
5
votes
1 answer
How to deal with losses on different scales in multi-task learning?
Say I'm training a model for multiple tasks by trying to minimize the sum of losses $L_1 + L_2$ via gradient descent.
If these losses are on different scales, the one whose range is greater will dominate the optimization. I'm currently trying to fix…

SpiderRico
- 960
- 8
- 18
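To make the scale problem in the question above concrete, here is a minimal sketch, assuming a toy setup that is not from the question: a single shared parameter `w` and two hypothetical quadratic losses, one 100 times larger than the other. The fixed rescaling weight of 0.01 is only one simple heuristic for illustration, not the asker's fix or a recommended method.

```python
# Hypothetical toy problem (not from the question): one shared parameter w.
#   L1(w) = (w - 1)^2          small-scale loss, minimized at w = +1
#   L2(w) = 100 * (w + 1)^2    same shape, 100x larger scale, minimized at w = -1

def grad_l1(w: float) -> float:
    """Derivative of L1 with respect to w."""
    return 2.0 * (w - 1.0)

def grad_l2(w: float) -> float:
    """Derivative of L2 with respect to w."""
    return 200.0 * (w + 1.0)

def minimize(weight_1: float, weight_2: float,
             steps: int = 5000, lr: float = 0.001) -> float:
    """Plain gradient descent on weight_1 * L1 + weight_2 * L2, starting at w = 0.5."""
    w = 0.5
    for _ in range(steps):
        w -= lr * (weight_1 * grad_l1(w) + weight_2 * grad_l2(w))
    return w

# Unweighted sum: the large-scale loss dominates, so w ends up close to -1.
w_unweighted = minimize(1.0, 1.0)

# Assumed heuristic: scale L2 down by 1/100 so both losses have comparable
# magnitude; w then settles midway between the two task optima.
w_rescaled = minimize(1.0, 0.01)

print(f"unweighted: w = {w_unweighted:.4f}")
print(f"rescaled:   w = {w_rescaled:.4f}")
```

With the plain sum, the solution sits almost at the optimum of the larger loss; after rescaling, both tasks pull equally.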
5
votes
1 answer
Correcting 'bad' translations in a sequence-to-sequence neural machine translation model
In working with basic sequence-to-sequence models for machine translation I have been able to achieve decent results. But inevitably some translations are not optimal or just flat-out incorrect. I am wondering if there is some way of "correcting"…

jrthom18
- 51
- 3
5
votes
2 answers
What is the difference between a language model and a word embedding?
I am self-studying applications of deep learning to NLP and machine translation.
I am confused about the concepts of "Language Model", "Word Embedding", "BLEU Score".
It appears to me that a language model is a way to predict the next word given…

Exploring
- 223
- 6
- 16
5
votes
1 answer
Is the 'direction' considered, when determining the branching factor in bidirectional search?
If I am correct, the branching factor is the maximum number of successors of any node.
When I am applying bidirectional search to a transition graph like this one below
If 11 is the goal state and I start going backwards, is 10 considered as a…

Artery
- 153
- 7
5
votes
1 answer
Why did the development of neural networks stop between the 50s and the 80s?
In a video lecture on the development of neural networks and the history of deep learning (you can start from minute 13), the lecturer (Yann LeCun) said that the development of neural networks stopped until the 80s because people were using the…

Daviiid
- 563
- 3
- 15
5
votes
1 answer
How would I compute the optimal state-action value for a certain state and action?
I am currently trying to learn reinforcement learning and I started with the basic gridworld application. I tried Q-learning with the following parameters:
Learning rate = 0.1
Discount factor = 0.95
Exploration rate = 0.1
Default reward = 0
The…

Rim Sleimi
- 215
- 1
- 6
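For the Q-learning setup quoted in the question above, a single tabular update can be sketched as follows. This is a minimal sketch, not the asker's code: the learning rate, discount factor, and exploration rate are taken from the question, while the state names, action names, and the pre-filled Q-value are hypothetical, since the gridworld itself is not shown in the excerpt.

```python
import random
from collections import defaultdict

ALPHA = 0.1     # learning rate (from the question)
GAMMA = 0.95    # discount factor (from the question)
EPSILON = 0.1   # exploration rate (from the question)

def q_update(q, state, action, reward, next_state, actions, terminal=False):
    """One tabular Q-learning step: Q(s,a) += alpha * (target - Q(s,a))."""
    # The target bootstraps from the best next action (off-policy max).
    best_next = 0.0 if terminal else max(q[(next_state, a)] for a in actions)
    target = reward + GAMMA * best_next
    q[(state, action)] += ALPHA * (target - q[(state, action)])
    return q[(state, action)]

def epsilon_greedy(q, state, actions, rng=random):
    """Pick a random action with probability EPSILON, else a greedy one."""
    if rng.random() < EPSILON:
        return rng.choice(actions)
    return max(actions, key=lambda a: q[(state, a)])

# Hypothetical two-state example: moving right from "s0" reaches "s1",
# where the best known action value is already 1.0.
actions = ["up", "down", "left", "right"]
q = defaultdict(float)          # unseen (state, action) pairs default to 0.0
q[("s1", "right")] = 1.0

# Default reward of 0 (from the question) for the transition s0 -> s1.
new_value = q_update(q, "s0", "right", 0.0, "s1", actions)
print(new_value)  # 0.1 * (0 + 0.95 * 1.0 - 0), approximately 0.095
```

The optimal value satisfies the Bellman optimality equation $Q^*(s,a) = r + \gamma \max_{a'} Q^*(s',a')$ for deterministic transitions, and repeated updates of this form move the table toward it.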
5
votes
1 answer
Why does Q-learning converge under 100% exploration rate?
I am working on this assignment where I made the agent learn state-action values (Q-values) with Q-learning and 100% exploration rate. The environment is the classic gridworld as shown in the following picture.
Here are the values of my…

Rim Sleimi
- 215
- 1
- 6
5
votes
1 answer
Can CNNs be made robust to tricks where small changes cause misclassification?
A while ago I read that you can make subtle changes to an image that will ensure a good CNN will horribly misclassify the image. I believe the changes must exploit details of the CNN that will be used for classification. So we can trick a good CNN…

Ted Ersek
- 153
- 2
5
votes
5 answers
How would AI be able to self-examine?
Looking at cases of machine-learning-based artificial intelligence, I often see them make critical mistakes when they face unfamiliar situations.
In our case, when we encounter totally new problems, we acknowledge that we are not skilled…

A Cat Named Tiger
- 227
- 2
- 5
5
votes
2 answers
Why is tf.abs non-differentiable in TensorFlow?
I understand why tf.abs is non-differentiable in principle (it has a kink at 0), but the same applies to tf.nn.relu, yet for that function the gradient is simply set to 0 at 0. Why is the same logic not applied to tf.abs? Whenever I tried to use…

zedsdead
- 53
- 3
5
votes
1 answer
Clarifying representation of Neural Network input for Chess Alpha Zero
In the Alpha Zero paper (https://arxiv.org/pdf/1712.01815.pdf) page 13, the input for the NN is described. At the beginning of the page, the authors state that:
"The input to the Neural Network is an N x N x (MT + L) image stack [...]"
From this, I…

Andrew
- 63
- 4
5
votes
2 answers
Is it practical to train AlphaZero or MuZero (for indie games) on a personal computer?
Is it practical/affordable to train an AlphaZero/MuZero engine using a residential gaming PC, or would it take thousands of years of training for the AI to learn enough to challenge humans?
I'm having trouble wrapping my head around how much…

Luke W
- 53
- 3
5
votes
1 answer
What happens if 2 genes have the same connection but a different innovation number?
I have read the Evolving Neural Networks through Augmenting Topologies (NEAT) paper, but some doubts are still bugging me, so I have two questions.
When do mutations occur? Between which nodes?
When mating, what happens if 2 genes have the same…

Miemels
- 389
- 2
- 10