Questions tagged [deepmind]

For questions about the theoretical, philosophical, social, historical, or academic aspects of AI that involve the company DeepMind. For example, you can ask questions about how AlphaGo works, but not how DeepMind promotes its achievements.

In July 2018, researchers from DeepMind trained one of its systems to play the computer game Quake III Arena.

In 2020, DeepMind made significant advances in the problem of protein folding.

According to the company's website, DeepMind Technologies' goal is to combine "the best techniques from machine learning and systems neuroscience to build powerful general-purpose learning algorithms".

Products and technologies:

Deep reinforcement learning
AlphaGo and successors
AlphaFold
WaveNet and WaveRNN
AlphaStar
Miscellaneous contributions to Google: data centre cooling ("DeepMind AI Reduces Google Data Centre Cooling Bill by 40%") and Android Adaptive Battery and Adaptive Brightness, which use machine learning to conserve energy
DeepMind Health

26 questions

votes

2 answers

How much of DeepMind's work is actually reproducible?

DeepMind has published many deep learning papers in recent years, most of them state-of-the-art on their respective tasks. But how much of this work has actually been reproduced by the AI community? For instance, the Neural Turing…

asked Aug 04 '16 at 07:43

rcpinto

votes

2 answers

Was DeepMind's DQN learning simultaneously all the Atari games?

DeepMind states that its deep Q-network (DQN) was able to continually adapt its behavior while learning to play 49 Atari games. After learning all games with the same neural net, was the agent able to play them all at 'superhuman' levels…

reinforcement-learning deep-rl dqn deepmind atari-games

asked Oct 20 '16 at 01:42

Dion

votes

2 answers

What's the difference between Starcraft and Dota from an AI perspective?

So, DeepMind is pushing for a human-level StarCraft bot and OpenAI just created a human-level 1v1 Dota bot. Unfortunately, I have no clue what that signifies because I've never played StarCraft or Dota, nor do I have more than a fleeting…

game-ai deepmind

asked Aug 14 '17 at 20:59

BlindKungFuMaster

votes

0 answers

Why can Pixel RNN (Row LSTM) capture triangular contexts?

I'm reading the paper Pixel Recurrent Neural Networks. I have a question about Row LSTM. Why can Row LSTM capture triangular contexts? In this paper, the kernel of the one-dimensional convolution has size $k \times 1$ where $k \geq 3$; the larger…

deep-learning recurrent-neural-networks long-short-term-memory deep-neural-networks deepmind

asked Apr 18 '20 at 15:33

musako

votes

1 answer

Why did AlphaGo lose its Go game?

We can read on the wiki page that in March 2016 AlphaGo lost one game (1 of 5) to Lee Sedol, a professional Go player. One cited article says: AlphaGo lost a game and we as researchers want to explore that and find out what went wrong. We need to…

game-ai deepmind alphago

asked Aug 09 '16 at 12:00

kenorb

votes

2 answers

Is AlphaFold just making a good estimate of the protein structure?

In the news, DeepMind's AlphaFold is said to have solved the protein folding problem using neural networks, but isn't this a problem only optimised quantum computers can solve? To my limited understanding, the issue is that there are too many…

neural-networks deep-learning deepmind quantum-computing alpha-fold

asked Dec 05 '20 at 00:58

Aaron

votes

1 answer

How does DeepMind perform reinforcement learning on a TPU?

I've watched this video of the recent contest between AlphaStar and professional StarCraft II players, and during the discussion David Silver of DeepMind said that they train AlphaStar on TPUs. My question is, how is it possible to utilise a GPU or TPU for…

machine-learning reinforcement-learning deepmind

asked Feb 01 '19 at 10:20

BigBadMe

votes

2 answers

Each training run for my DDQN agent takes 2 days and still ends up with a -13 average score, but the OpenAI Baselines DQN needs only an hour to converge to +18?

Status: For a few weeks now, I have been working on a Double DQN agent for the PongDeterministic-v4 environment, which you can find here. A single training run lasts for about 7-8 million timesteps (about 7000 episodes) and takes me about 2 days, on…

deep-learning reinforcement-learning dqn open-ai deepmind

asked Jan 30 '19 at 11:39

hridayns

votes

1 answer

What is regression layer in a spatial transformer?

I came across this line while reading the original paper on Spatial Transformers by DeepMind, in the last paragraph of Sec. 3.1: The localisation network function $f_{loc}()$ can take any form, such as a fully-connected network or a convolutional network,…

neural-networks deep-learning deepmind transformer

asked Mar 11 '18 at 13:13

nivter

votes

1 answer

How does the policy network learn in AlphaZero?

I'm currently trying to understand how AlphaZero works. One thing about the training of AlphaZero's policy head confuses me. Basically, in the AlphaGo Zero paper (where the major part of the AlphaZero algorithm is explained) a combined…

reinforcement-learning alphazero alphago-zero deepmind

asked May 25 '21 at 09:31

Alberto M

votes

0 answers

Where does reinforcement learning actually show up in DeepMind's game engines?

From the brief research I've done on the topic, it appears that DeepMind's AlphaZero or MuZero makes decisions through Monte Carlo tree search, where the randomized simulations allow for a more rapid way to make calculations than…

deep-learning monte-carlo-tree-search chess alphazero deepmind

asked May 17 '20 at 17:44

Amar Srivastava

votes

2 answers

How does the AlphaGo Zero policy decide what move to execute?

I was going through the AlphaGo Zero paper and I was trying to understand everything, but I just can't figure out this one formula: $$ \pi(a \mid s_0) = \frac{N(s_0, a)^{\frac{1}{\tau}}}{\sum_b N(s_0, b)^{\frac{1}{\tau}}} $$ Could someone decode how…

reinforcement-learning policies deepmind alphago-zero alphago

asked May 07 '20 at 11:09

Eloi M.
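
The formula quoted in the question above turns the visit counts $N(s_0, a)$ at the search root into move probabilities, with the temperature $\tau$ controlling how sharply the most-visited move is preferred. A minimal sketch of that computation follows; the function name `visit_count_policy` and the example counts are illustrative assumptions, not taken from the paper.

```python
def visit_count_policy(visit_counts, tau=1.0):
    """Map root visit counts N(s_0, a) to probabilities pi(a | s_0).

    Implements pi(a | s_0) = N(s_0, a)^(1/tau) / sum_b N(s_0, b)^(1/tau).
    """
    if tau <= 0:
        raise ValueError("tau must be positive")
    max_count = max(visit_counts)
    if max_count <= 0:
        raise ValueError("at least one move must have been visited")
    # Dividing by the largest count first avoids overflow for small tau;
    # the common factor cancels in the normalisation below.
    scaled = [(n / max_count) ** (1.0 / tau) for n in visit_counts]
    total = sum(scaled)
    return [s / total for s in scaled]


# Hypothetical visit counts for three candidate moves.
counts = [10, 30, 60]
print(visit_count_policy(counts, tau=1.0))  # proportional to the counts
print(visit_count_policy(counts, tau=0.1))  # almost all mass on the last move
```

With $\tau = 1$ the probabilities are simply proportional to the visit counts; as $\tau$ shrinks towards 0, the distribution concentrates on the most-visited move.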

votes

1 answer

Is it common in RL research with Atari/ALE to automatically press FIRE to start games?

In some Atari games in the Arcade Learning Environment (ALE), it is necessary to press FIRE once to start a game. Because it may be difficult for a Reinforcement Learning (RL) agent to learn this, it may often waste a lot of time executing actions…

reinforcement-learning research deepmind

asked Feb 22 '18 at 16:39

Dennis Soemers
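
The idea raised in the question above, pressing FIRE automatically so the agent does not have to learn it, can be sketched as a thin wrapper around an environment. Everything below is a hypothetical illustration: `ToyEnv` is a stand-in rather than a real ALE game, the wrapper name `FireOnReset` is made up, the FIRE action index of 1 is an assumption, and the older four-value `step` return convention is used for brevity.

```python
class ToyEnv:
    """Hypothetical stand-in for an Atari game that idles until FIRE is pressed."""

    FIRE = 1  # assumed action index for FIRE

    def __init__(self):
        self.started = False

    def reset(self):
        self.started = False
        return "title-screen"

    def step(self, action):
        if action == self.FIRE:
            self.started = True
        obs = "playing" if self.started else "title-screen"
        # (observation, reward, done, info)
        return obs, 0.0, False, {}


class FireOnReset:
    """Wrapper that presses FIRE once after every reset."""

    def __init__(self, env, fire_action=1):
        self.env = env
        self.fire_action = fire_action

    def reset(self):
        self.env.reset()
        obs, _, done, _ = self.env.step(self.fire_action)
        if done:
            # If pressing FIRE somehow ended the episode, start over.
            obs = self.env.reset()
        return obs

    def step(self, action):
        # All other actions pass through unchanged.
        return self.env.step(action)


env = FireOnReset(ToyEnv())
print(env.reset())  # the agent's first observation is already in-game
```

The design choice here is that the agent never sees the pre-start screen, so no exploration is spent discovering the FIRE action at the start of an episode.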

votes

0 answers

AlphaGo Zero: Does the policy head give a probability for every possible move?

If I understood correctly, the AlphaGo Zero network returns two values: a vector of logit probabilities p and a value v. My question is: in the vector that is output, do we have a probability for every possible action in the game? If so: does…

deep-learning policies deepmind alphago-zero

asked Nov 11 '19 at 01:33

ihavenoidea

vote

0 answers

How does AlphaTensor use self-play to discover efficient matrix multiplication algorithms?

Prior to the development of AlphaTensor, one of the main challenges in discovering new algorithms was the vast number of possibilities to consider: there are often an enormous number of potential algorithms that could be developed to solve a given…

reinforcement-learning deep-rl deepmind

asked Dec 19 '22 at 09:10

Faizy
