Highest Voted Questions - Artificial Intelligence Stack Exchange

5

votes

2 answers

Why am I getting the incorrect value of lambda?

I am trying to solve for $\lambda$ using temporal-difference learning. More specifically, I am trying to figure out what $\lambda$ I need, such that $\text{TD}(\lambda)=\text{TD}(1)$, after one iteration. But I get the incorrect value of…

reinforcement-learning python markov-decision-process temporal-difference-methods td-lambda

asked May 20 '19 at 05:58

Amanda

205
1
5

5

votes

1 answer

How to define a reward function for a humanoid agent whose goal is to stand up from the ground?

I'm trying to teach a humanoid agent how to stand up after falling. The episode starts with the agent lying on the floor with its back touching the ground, and its goal is to stand up in the shortest amount of time. But I'm having trouble in regards…

reinforcement-learning reward-functions reward-design reward-shaping

asked May 17 '19 at 17:05

Tirafesi

151
1

5

votes

1 answer

A3C fails to solve MountainCar-v0 environment (implementation by OpenAI Gym)

While I've been able to solve MountainCar-v0 using deep Q-learning, no matter what I try I can't solve this environment using policy-gradient approaches. From what I have learned searching the web, this is a really hard environment to solve, mainly because…

reinforcement-learning

asked May 06 '19 at 06:32

Scorpio76

61
2

5

votes

1 answer

Is there any paper, article or book that analyzes the feasibility of achieving AGI through brain simulation?

In my understanding, the mind arises from a physical system, the brain. I see that there is a lot of research on simulating physical systems efficiently (especially in quantum computing). Hence, in theory, we could achieve AGI by…

agi human-like brain human-inspired

asked May 05 '19 at 18:08

olinarr

745
6
20

5

votes

2 answers

Reinforcement learning with uniformly random dynamics

Suppose I have an MDP $(S, A, p, R)$ where $p(s_j|s_i,a_i)$ is uniform, i.e., given a state $s_i$ and an action $a_i$, all states $s_j$ are equally probable. Now I want to find an optimal policy for this MDP. Can I just apply the usual methods…

reinforcement-learning markov-decision-process

asked May 04 '19 at 23:40

grok

151
3

5

votes

2 answers

Do self-driving cars use a single frame or multiple frames to make decisions?

This might be a trivial question but I couldn't find any reliable answers on the internet. Almost all the neural network architectures for self-driving cars that I have seen on the internet have a feedforward network, previous frames will not help…

autonomous-vehicles

asked May 02 '19 at 21:15

Tamilarasu Ulaganathan

61
1

5

votes

1 answer

Name of paper for encoding/representing XY coordinates in deep learning

In this podcast with Oriol Vinyals and Lex Fridman: https://youtu.be/Kedt2or9xlo?t=1769, at 29:29, Oriol Vinyals refers to a paper: If you look at research in computer vision where it makes a lot of sense to treat images as two dimensional…

neural-networks deep-learning computer-vision papers

asked May 01 '19 at 16:29

Benjamin Crouzier

311
2
6

5

votes

2 answers

Neural network with varying inputs (for a game AI)

I want to create a simple game which basically consists of 2D circles shooting smaller circles at each other (to make hitbox detection easier to start with). My goal is to create an AI which adapts its own behaviour to the player's. For that, I want…

neural-networks game-ai neurons java

asked Apr 26 '19 at 13:30

Cr3ative

53
4

5

votes

2 answers

How does Lucas's argument work?

In Minds, Machines and Gödel (1959), J. R. Lucas shows that any human mathematician cannot be represented by an algorithmic automaton (a Turing Machine, but any computer is equivalent to it by the Church-Turing thesis), using Gödel's incompleteness…

philosophy agi artificial-consciousness incompleteness-theorems turing-machine

asked Aug 02 '16 at 19:31

wythagoras

1,511
12
27

5

votes

4 answers

How to stop DQN Q function from increasing during learning?

Following the DQN algorithm with experience replay: Store transition $\left(\phi_{t}, a_{t}, r_{t}, \phi_{t+1}\right)$ in $D$; sample random minibatch of transitions $\left(\phi_{j}, a_{j}, r_{j}, \phi_{j+1}\right)$ from $D$…

reinforcement-learning q-learning dqn objective-functions value-functions

asked Apr 24 '19 at 14:15

BestR

183
1
7

5

votes

1 answer

Is it possible to make a 'forked path' neural network?

I want to make a network, specifically a CNN for image recognition, that takes an input, processes it the same way for several layers, and then at some point splits before coming to two different outputs. Is it possible to create a network such as…

neural-networks machine-learning convolutional-neural-networks image-recognition

asked Apr 10 '19 at 17:54

Fred E

155
2

5

votes

1 answer

Understanding the n-step off-policy SARSA update

In Sutton & Barto's book (2nd ed.), page 149, there is equation 7.11. I am having a hard time understanding this equation. I would have thought that we should be moving $Q$ towards $G$, where $G$ would be corrected by importance sampling, but only…

reinforcement-learning sutton-barto off-policy-methods temporal-difference-methods sarsa

asked Apr 05 '19 at 14:23

Antoine Savine

153
4

5

votes

1 answer

Does backpropagation update weights one layer at a time?

I am new to Deep Learning. Suppose that we have a neural network with one input layer, one output layer, and one hidden layer. Let's refer to the weights from input to hidden as $W$ and the weights from hidden to output as $V$. Suppose that we have…

neural-networks deep-learning backpropagation gradient-descent

asked Apr 03 '19 at 23:36

Joshua Jones

53
3

5

votes

1 answer

Cold start collaborative filtering with NLP

I’m looking to match two pieces of text - e.g. IMDb movie descriptions and each person’s description of the type of movies they like. I have an existing set of ~5000 matches between the two. I particularly want to overcome the cold-start problem:…

natural-language-processing recommender-system

asked Apr 01 '19 at 04:53

Derek Hans

71
2

5

votes

2 answers

How to detect fraud in the advertising business using machine learning?

I am a complete beginner in this field. I am still learning the basics of machine learning and AI, but I have a problem at hand and I am not sure which technique or algorithm can be applied to it. I am working on click-fraud detection in advertising. I need…

machine-learning models applications

asked Mar 28 '19 at 16:08

Mirza

61
4
