Questions tagged [deepmind]

For questions about the theoretical, philosophical, social, historical, or academic aspects of AI that involve the company DeepMind. For example, you can ask questions about how AlphaGo works, but not how DeepMind promotes its achievements.

In July 2018, researchers from DeepMind trained one of its systems to play the computer game Quake III Arena.

In 2020, DeepMind made significant advances in the problem of protein folding.

According to the company's website, DeepMind Technologies' goal is to combine "the best techniques from machine learning and systems neuroscience to build powerful general-purpose learning algorithms".

Products and technologies:

  • Deep reinforcement learning
  • AlphaGo and successors
  • AlphaFold
  • WaveNet and WaveRNN
  • AlphaStar
  • Miscellaneous contributions to Google: data-centre cooling ("DeepMind AI Reduces Google Data Centre Cooling Bill by 40%") and Android Adaptive Battery and Adaptive Brightness, which use machine learning to conserve energy
  • DeepMind Health
26 questions
12
votes
2 answers

How much of DeepMind's work is actually reproducible?

DeepMind has published a lot of work on deep learning in recent years, most of it state-of-the-art on its respective tasks. But how much of this work has actually been reproduced by the AI community? For instance, the Neural Turing…
10
votes
2 answers

Was DeepMind's DQN learning all the Atari games simultaneously?

DeepMind states that its deep Q-network (DQN) was able to continually adapt its behavior while learning to play 49 Atari games. After learning all games with the same neural net, was the agent able to play them all at 'superhuman' levels…
Dion
  • 203
  • 2
  • 6
9
votes
2 answers

What's the difference between StarCraft and Dota from an AI perspective?

So, DeepMind is pushing for a human-level StarCraft bot and OpenAI just created a human-level 1v1 Dota bot. Unfortunately, I've no clue what that signifies because I've never played StarCraft or Dota, nor do I have more than a fleeting…
BlindKungFuMaster
  • 4,185
  • 11
  • 23
6
votes
0 answers

Why can Pixel RNN (Row LSTM) capture triangular contexts?

I'm reading the paper Pixel Recurrent Neural Networks. I have a question about Row LSTM. Why can Row LSTM capture triangular contexts? In this paper, the kernel of the one-dimensional convolution has size $k \times 1$ where $k \geq 3$; the larger…
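One way to see where a triangle could come from is to count which pixels in earlier rows can reach a given pixel. The sketch below assumes that each row's state at column `col` depends on the previous row's state through a centred $k \times 1$ convolution with odd $k$, so the reachable span widens by `(k - 1) // 2` on each side per row. The function name `row_lstm_context` and the toy image width are illustrative choices, not taken from the paper.

```python
def row_lstm_context(row, col, width, k=3):
    """Return {earlier_row: (first_col, last_col)} of pixels that can influence (row, col).

    Assumption: the state at (row, col) reads the previous row's state at
    columns col - reach .. col + reach, where reach = (k - 1) // 2 and k is odd.
    """
    reach = (k - 1) // 2
    context = {}
    for depth in range(1, row + 1):
        # Each step up one row widens the span by `reach` on both sides,
        # clipped to the image borders.
        first = max(0, col - depth * reach)
        last = min(width - 1, col + depth * reach)
        context[row - depth] = (first, last)
    return context


# Pixel in row 3, column 4 of a 9-pixel-wide image, kernel size k = 3.
ctx = row_lstm_context(row=3, col=4, width=9, k=3)
for r in sorted(ctx):
    print(r, ctx[r])
```

The spans grow linearly with distance from the current row, which traces out a triangle above the pixel; a larger `k` gives a wider triangle.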
5
votes
1 answer

Why did AlphaGo lose its Go game?

According to the Wikipedia page, in March 2016 AlphaGo lost one game (out of five) to Lee Sedol, a professional Go player. One cited article says: AlphaGo lost a game and we as researchers want to explore that and find out what went wrong. We need to…
kenorb
  • 10,423
  • 3
  • 43
  • 91
4
votes
2 answers

Is AlphaFold just making a good estimate of the protein structure?

In the news, DeepMind's AlphaFold is said to have solved the protein folding problem using neural networks, but isn't this a problem only optimised quantum computers can solve? To my limited understanding, the issue is that there are too many…
4
votes
1 answer

How does DeepMind perform reinforcement learning on a TPU?

I've watched this video of the recent contest of AlphaStar vs. pro players of StarCraft II, and during the discussion David Silver of DeepMind said that they train AlphaStar on TPUs. My question is, how is it possible to utilise a GPU or TPU for…
BigBadMe
  • 403
  • 2
  • 9
4
votes
2 answers

Each training run for a DDQN agent takes 2 days and still ends up with a -13 average score, but the OpenAI baseline DQN needs only an hour to converge to +18?

Status: For a few weeks now, I have been working on a Double DQN agent for the PongDeterministic-v4 environment, which you can find here. A single training run lasts for about 7-8 million timesteps (about 7000 episodes) and takes me about 2 days, on…
3
votes
1 answer

What is the regression layer in a spatial transformer?

I came across this line while reading the original paper on Spatial Transformers by DeepMind, in the last paragraph of Sec. 3.1: The localisation network function $f_{\text{loc}}()$ can take any form, such as a fully-connected network or a convolutional network,…
nivter
  • 73
  • 3
3
votes
1 answer

How does the policy network learn in AlphaZero?

I'm currently trying to understand how AlphaZero works. There is one thing about the training of AlphaZero's policy head that confuses me. Basically, in the AlphaGo Zero paper (where the major part of the AlphaZero algorithm is explained) a combined…
3
votes
0 answers

Where does reinforcement learning actually show up in DeepMind's game engines?

From the brief research I've done on the topic, it appears that DeepMind's AlphaZero or MuZero makes decisions through Monte Carlo tree search, wherein the randomized simulations allow for a more rapid way to make calculations than…
3
votes
2 answers

How does the AlphaGo Zero policy decide what move to execute?

I was going through the AlphaGo Zero paper and I was trying to understand everything, but I just can't figure out this one formula: $$ \pi(a \mid s_0) = \frac{N(s_0, a)^{\frac{1}{\tau}}}{\sum_b N(s_0, b)^{\frac{1}{\tau}}} $$ Could someone decode how…
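The formula can be read as "raise each root visit count to the power $1/\tau$, then normalise". A minimal sketch of that computation follows; the function name `visit_count_policy`, the move labels, and the example counts are made up for illustration, and dividing by the largest count first is a numerical-stability choice of this sketch (it cancels in the ratio), not something stated in the formula.

```python
def visit_count_policy(visit_counts, tau=1.0):
    """Map root visit counts N(s_0, a) to probabilities pi(a | s_0).

    visit_counts: dict of move -> visit count (illustrative input format).
    tau: temperature; must be positive. Small tau sharpens the distribution.
    """
    if tau <= 0:
        raise ValueError("tau must be positive; take the argmax for the tau -> 0 limit")
    max_count = max(visit_counts.values())
    if max_count == 0:
        raise ValueError("at least one move needs a non-zero visit count")
    # Scale by the largest count so that large exponents (small tau) cannot overflow.
    powered = {a: (n / max_count) ** (1.0 / tau) for a, n in visit_counts.items()}
    total = sum(powered.values())
    return {a: p / total for a, p in powered.items()}


counts = {"A": 60, "B": 30, "C": 10}                # hypothetical visit counts
pi_1 = visit_count_policy(counts, tau=1.0)          # proportional to the counts
pi_sharp = visit_count_policy(counts, tau=0.25)     # concentrates on the most-visited move
print(pi_1)
print(pi_sharp)
```

With $\tau = 1$ the probabilities are simply the visit fractions; as $\tau$ shrinks, almost all probability moves to the most-visited move.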
2
votes
1 answer

Is it common in RL research with Atari/ALE to automatically press FIRE to start games?

In some Atari games in the Arcade Learning Environment (ALE), it is necessary to press FIRE once to start a game. Because it may be difficult for a Reinforcement Learning (RL) agent to learn this, it may often waste a lot of time executing actions…
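For concreteness, pressing FIRE automatically can be done with a thin wrapper around the environment's reset. The sketch below uses a hypothetical toy environment (`ToyFireEnv`) rather than the real ALE API, and the wrapper name `FireOnReset` and the `(observation, reward, done)` step format are assumptions made for this illustration.

```python
class ToyFireEnv:
    """Hypothetical stand-in for a game that stays idle until FIRE is pressed."""

    NOOP, FIRE = 0, 1

    def __init__(self):
        self.started = False

    def reset(self):
        self.started = False
        return "idle"

    def step(self, action):
        if action == self.FIRE:
            self.started = True
        observation = "playing" if self.started else "idle"
        return observation, 0.0, False  # observation, reward, done


class FireOnReset:
    """Wrapper that presses FIRE once after every reset, so the agent never has to."""

    def __init__(self, env, fire_action):
        self.env = env
        self.fire_action = fire_action

    def reset(self):
        self.env.reset()
        # Take the FIRE action on the agent's behalf and hand back the resulting observation.
        observation, _, _ = self.env.step(self.fire_action)
        return observation

    def step(self, action):
        return self.env.step(action)


env = FireOnReset(ToyFireEnv(), fire_action=ToyFireEnv.FIRE)
first_obs = env.reset()
print(first_obs)
```

The agent's first observation already comes from a running game, so no exploration time is spent discovering the start button.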
Dennis Soemers
  • 9,894
  • 2
  • 25
  • 66
2
votes
0 answers

AlphaGo Zero: Does the policy head give a probability for every possible move?

If I understood correctly, the AlphaGo Zero network returns two values: a vector of logit probabilities p and a value v. My question is: in the vector that is output, do we have a probability for every possible action in the game? If so: does…
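As a rough illustration of what "one entry per action" looks like, the sketch below builds a vector with one logit per board point plus one for pass (19 × 19 + 1 = 362 for a full-size Go board) and turns it into probabilities. Zeroing out illegal moves and renormalising is a common convention assumed here, not something the excerpt confirms; `masked_softmax` and the uniform logits are illustrative.

```python
import math

BOARD_SIZE = 19
NUM_MOVES = BOARD_SIZE * BOARD_SIZE + 1  # 361 board points plus one "pass" move


def masked_softmax(logits, legal):
    """Softmax over the legal moves only; illegal moves get probability 0.

    Assumes at least one move is legal.
    """
    # Subtract the largest legal logit for numerical stability.
    top = max(l for l, ok in zip(logits, legal) if ok)
    exps = [math.exp(l - top) if ok else 0.0 for l, ok in zip(logits, legal)]
    total = sum(exps)
    return [e / total for e in exps]


logits = [0.0] * NUM_MOVES   # placeholder network output, one logit per move
legal = [True] * NUM_MOVES
legal[0] = False             # pretend the first board point is occupied
probs = masked_softmax(logits, legal)
print(len(probs), probs[0], probs[1])
```

The output vector always has the same length; legality only changes which entries end up with non-zero probability.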
ihavenoidea
  • 255
  • 2
  • 11
1
vote
0 answers

How does AlphaTensor use self-play to discover efficient matrix multiplication algorithms?

Prior to the development of AlphaTensor, one of the main challenges in discovering new algorithms was the vast number of possibilities to consider: there are often an enormous number of potential algorithms that could be developed to solve a given…
Faizy
  • 1,074
  • 1
  • 6
  • 30