DeepMind has published many works on deep learning in recent years, most of them state-of-the-art on their respective tasks. But how much of this work has actually been reproduced by the AI community? For instance, the Neural Turing Machine paper seems to be very hard to reproduce, according to other researchers.
-
I'm not sure about reproducing the original paper's results, but I've run into around a half-dozen papers that follow up on Graves et al.'s work and have produced results of the same caliber. Most are on variants of the NTM theme. I can post some links if that would help. – SQLServerSteve Oct 27 '16 at 20:19
-
This comment + links would be a good actual answer. – rcpinto Oct 28 '16 at 01:01
-
I'll convert it into an answer shortly, as soon as I can hunt up the web addresses again. – SQLServerSteve Oct 28 '16 at 08:08
2 Answers
At the suggestion of the O.P. rcpinto, I'm converting my comment about seeing "around a half-dozen papers that follow up on Graves et al.'s work which have produced results of the same caliber" into an answer and will provide a few links. Keep in mind that this only answers the part of the question pertaining to NTMs, not Google DeepMind's work as a whole. Also, I'm still learning the ropes in machine learning, so some of the material in these papers is over my head; I did manage to grasp much of the material in Graves et al.'s original paper [1], though, and am close to having homegrown NTM code to test. I also at least skimmed the following papers over the last few months. They do not replicate the NTM study in a strict scientific manner, but many of their experimental results do tend to support the original, at least tangentially:
• In this paper on a variant of NTM addressing, Gulcehre et al. do not try to precisely replicate Graves et al.'s tests, but like the DeepMind team, they demonstrate markedly better results for the original NTM and several variants than for an ordinary recurrent LSTM. They use 10,000 training samples from a Facebook Q&A dataset, rather than the N-grams Graves et al. operated on in their paper, so it's not replication in the strictest sense. They did, however, manage to get a version of the original NTM and several variants up and running, and recorded the same magnitude of performance improvement. [2]
• Unlike the original NTM, this study tested a version of reinforcement learning which was not differentiable; that may be why they were unable to solve several of the programming-like tasks, such as Repeat-Copy, except when the controller was not confined to moving forwards. Their results were nevertheless good enough to lend support to the idea of NTMs. A more recent revision of their paper is apparently available, which I have yet to read, so perhaps some of their variant's problems have been solved. [3]
• Instead of testing the original flavor of NTM against ordinary neural nets like LSTMs, this paper pitted it against several more advanced NTM memory structures. They got good results on the same type of programming-like tasks that Graves et al. tested, but I don't think they were using the same dataset (it's hard to tell from the way their study is written exactly which datasets they were operating on). [4]
• On p. 8 of this study, an NTM clearly outperforms several LSTM, feed-forward, and nearest-neighbor-based schemes on an Omniglot character recognition dataset. An alternative approach to external memory cooked up by the authors clearly beats it, but the NTM still obviously performs well. The authors seem to belong to a rival team at Google, so that might be an issue when assessing replicability. [5]
• On p. 2, these authors report better generalization on "very large sequences" in a test of copy tasks, using a much smaller NTM network which they evolved with the genetic NEAT algorithm, which dynamically grows topologies. [6]
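For readers who, like me, are trying to build homegrown NTM code, the mechanism these papers all build on is content-based addressing from Graves et al. [1]: read/write weights come from a softmax over the cosine similarity between a key vector emitted by the controller and each memory row, sharpened by a strength parameter β. This is a minimal NumPy sketch of just that step (function and variable names are my own, not from any of the papers):

```python
import numpy as np

def content_addressing(memory, key, beta):
    """Content-based weighting as in Graves et al. (2014):
    softmax over cosine similarity between `key` and each memory row,
    sharpened by the strength parameter `beta`."""
    eps = 1e-8  # guard against division by zero for all-zero rows
    # Cosine similarity between the key and each of the N memory rows.
    sims = memory @ key / (
        np.linalg.norm(memory, axis=1) * np.linalg.norm(key) + eps
    )
    # Numerically stable softmax with sharpening.
    e = np.exp(beta * (sims - sims.max()))
    return e / e.sum()

# Toy memory with three rows; the key matches the first row exactly.
memory = np.array([[1.0, 0.0],
                   [0.0, 1.0],
                   [0.7, 0.7]])
key = np.array([1.0, 0.0])
w = content_addressing(memory, key, beta=5.0)
# w is a probability distribution concentrated on the most similar row.
```

The full NTM also interpolates with the previous weighting, applies a shift convolution, and re-sharpens, but in my experience the content-addressing step above is the part that matters for the Q&A-style and recall tasks in the papers listed.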
NTMs are fairly new, so there hasn't been much time to stringently replicate the original research yet, I suppose. The handful of papers I skimmed over the summer, however, seem to lend support to their experimental results; I have yet to see any that report anything but excellent performance. Of course I have an availability bias, since I only read the PDFs I could easily find in a casual Internet search. From that small sample, it seems that most of the follow-up research has been focused on extending the concept rather than replication, which would explain the lack of replicability data. I hope that helps.
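In case anyone wants to attempt a closer replication themselves, the copy task used in the original paper and in several of the follow-ups above is easy to generate: feed in a random binary sequence followed by a delimiter flag, then ask the network to emit the sequence back from memory. A sketch under those assumptions (the layout details, such as putting the delimiter on an extra input channel, follow my reading of Graves et al. [1] and may differ from any given codebase):

```python
import numpy as np

def copy_task_example(seq_len, width, rng=None):
    """One copy-task example: a random binary sequence of shape
    (seq_len, width), a delimiter flag on an extra input channel,
    then blank inputs while the target sequence must be reproduced."""
    rng = rng or np.random.default_rng(0)
    seq = rng.integers(0, 2, size=(seq_len, width)).astype(float)
    total = 2 * seq_len + 1  # presentation + delimiter + recall phases
    inputs = np.zeros((total, width + 1))
    inputs[:seq_len, :width] = seq
    inputs[seq_len, width] = 1.0          # delimiter flag channel
    targets = np.zeros((total, width))
    targets[seq_len + 1:, :] = seq        # reproduce seq after the delimiter
    return inputs, targets

inputs, targets = copy_task_example(seq_len=5, width=8)
```

Graves et al. trained on short sequences and tested on much longer ones, which is exactly the generalization the NEAT-evolved NTM in [6] probes, so varying `seq_len` between training and evaluation is the interesting part.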
[1] Graves, Alex; Wayne, Greg and Danihelka, Ivo, 2014, "Neural Turing Machines," published Dec. 10, 2014.
[2] Gulcehre, Caglar; Chandar, Sarath; Cho, Kyunghyun and Bengio, Yoshua, 2016, "Dynamic Neural Turing Machine with Soft and Hard Addressing Schemes," published June 30, 2016.
[3] Zaremba, Wojciech and Sutskever, Ilya, 2015, "Reinforcement Learning Neural Turing Machines," published May 4, 2015.
[4] Zhang, Wei; Yu, Yang and Zhou, Bowen, 2015, "Structured Memory for Neural Turing Machines," published Oct. 25, 2015.
[5] Santoro, Adam; Bartunov, Sergey; Botvinick, Matthew; Wierstra, Daan and Lillicrap, Timothy, 2016, "One-Shot Learning with Memory-Augmented Neural Networks," published May 19, 2016.
[6] Boll Greve, Rasmus; Jacobsen, Emil Juul and Risi, Sebastian, date unknown, "Evolving Neural Turing Machines." No publisher listed.
All except (perhaps) Boll Greve et al. were published at the Cornell University Library arXiv.org repository: Ithaca, New York.

I tend to think this question is borderline and may get closed. A few comments for now, though.
There are (at least) two issues with reproducing the work of a company like DeepMind:
- Technicalities missing from publications.
- Access to the same level of data.
Technicalities should be workable. Some people have reproduced some of the Atari gaming feats. AlphaGo is seemingly more complex and will require more effort, yet it should be feasible at some point in the future (individuals may lack the computing resources today).
Data can be trickier. Several companies open up their data sets, but data is also the lifeblood of the competition...

-
I'm actually trying to find those borders... would you say it is off-topic? Too broad? Or what? – rcpinto Aug 04 '16 at 08:02
-
I am not decided yet. I wonder what it matters w.r.t. AI whether we can or cannot reproduce some company's claims. I can see people asking themselves about it and coming here to get some answers, but we are not really talking about AI. Your question is still young. Let's see the community's decision. I find it "border-line acceptable". – Eric Platon Aug 04 '16 at 08:09