Consider the following statement from the Deep Learning book (p. 333, Chapter 9: Convolutional Networks) by Ian Goodfellow et al.:
Convolution is thus dramatically more efficient than dense matrix multiplication in terms of the memory requirements and statistical efficiency.
The book says that this statistical efficiency comes from the reduction in the number of parameters that convolution (with a small kernel) achieves compared with a fully connected feedforward neural network.
What is meant by statistical efficiency in this context? And how does a decrease in the number of parameters increase statistical efficiency?
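
To make the parameter reduction concrete, here is a minimal sketch of the two counts being compared. The image size (280 x 320) and the 1 x 2 kernel are assumptions chosen for illustration, and the helper names `dense_param_count` and `conv_param_count` are hypothetical. The sketch only counts weights (no biases, single channel); it does not measure statistical efficiency itself.

```python
# Illustrative sizes only (assumed, not taken from the quoted sentence).
HEIGHT, WIDTH = 280, 320      # input image size in pixels
KERNEL_H, KERNEL_W = 1, 2     # a small kernel, e.g. a horizontal difference


def dense_param_count(in_h, in_w, out_h, out_w):
    """Weights in a dense matrix mapping every input pixel to every output pixel."""
    return (in_h * in_w) * (out_h * out_w)


def conv_param_count(k_h, k_w):
    """Weights in a single convolution kernel, shared across all positions."""
    return k_h * k_w


# Output size of a "valid" convolution (no padding, stride 1).
out_h = HEIGHT - KERNEL_H + 1
out_w = WIDTH - KERNEL_W + 1

dense = dense_param_count(HEIGHT, WIDTH, out_h, out_w)
conv = conv_param_count(KERNEL_H, KERNEL_W)

print(f"dense matrix weights: {dense:,}")
print(f"convolution weights:  {conv}")
```

Under these assumptions the dense mapping needs about eight billion weights while the convolution needs two, so the question is why having so many fewer parameters to estimate from the same data counts as a gain in statistical efficiency.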