I am trying to understand why the sample mean is used to approximate the expectation when training Generative Adversarial Networks (GANs).
The answer I found says this is due to the law of large numbers, which rests on the assumption that the random variables involved are independent and identically distributed (iid).
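To make my understanding concrete: in the standard GAN value function $$V(D, G) = \mathbb{E}_{x \sim p_{\text{data}}}[\log D(x)] + \mathbb{E}_{z \sim p_z}[\log(1 - D(G(z)))],$$ each expectation is, as I understand it, replaced in practice by a minibatch mean. Here is a minimal NumPy sketch of that substitution; `D` and `G` below are toy stand-ins I made up for this illustration, not real networks:

```python
import numpy as np

# Toy stand-ins for the discriminator and generator (placeholders for
# illustration only, not real networks), just to show where the mean enters.
def D(x):
    # Maps each image in the batch to a score in (0, 1).
    return 1.0 / (1.0 + np.exp(-x.mean(axis=(1, 2))))

def G(z):
    # Maps each noise vector to a fake 32x32 "image".
    return z.reshape(-1, 32, 32)

rng = np.random.default_rng(0)
real_batch = rng.random((64, 32, 32))        # minibatch of m = 64 "real" images
noise = rng.standard_normal((64, 32 * 32))   # minibatch of m = 64 noise vectors

# Each expectation in V(D, G) is estimated by a sample mean over the minibatch.
v_hat = np.mean(np.log(D(real_batch))) + np.mean(np.log(1.0 - D(G(noise))))
print(v_hat)
```

This substitution of a sample mean for an expectation is exactly the step that, per the answer I read, the law of large numbers is supposed to justify.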
Suppose I have a dataset of all possible $32 \times 32$ grayscale images, so my sample space consists of $256^{32 \times 32} = 256^{1024}$ elements, and I define $1024$ random variables as
$$X_i = \text{ intensity of } i^{th} \text{ pixel for } 1 \le i \le 1024$$
If every image in this sample space is equally likely, then it is clear that all the random variables are iid, since
- $X_i \perp X_j$ for all $i, j$ such that $i \ne j$, and
- $p(X_i = k) = \dfrac{1}{256}$ for all $i$ and every intensity $k \in \{0, \dots, 255\}$.
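Under this uniform model the law of large numbers is easy to check numerically. Here is a quick simulation (my own illustration): each $X_i$ is uniform on $\{0, \dots, 255\}$ with $\mathbb{E}[X_i] = 127.5$, and the sample mean approaches that value as the number of draws grows:

```python
import numpy as np

rng = np.random.default_rng(0)

# Each draw is a pixel intensity X_i ~ Uniform{0, ..., 255}, so E[X_i] = 127.5.
# The sample mean should approach 127.5 as n grows (law of large numbers).
for n in (10, 1_000, 100_000):
    draws = rng.integers(0, 256, size=n)   # n iid draws of a pixel intensity
    print(n, draws.mean())                 # -> tends to 127.5 as n grows
```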
But neither property holds if I instead take a dataset of, say, flower images: pixel intensities are not independent of each other, and the intensity values are not uniformly distributed either.
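To convince myself of the dependence, I ran a small self-contained check. Since I cannot include actual flower images here, I use blurred noise as a crude proxy for the spatial smoothness of natural photos (an assumption of this sketch, not real data), and the correlation between horizontally adjacent pixels comes out far from zero:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic proxy for natural images: white noise blurred row-wise with a
# length-5 box filter, mimicking the smoothness of real photos.
noise = rng.random((100, 32, 32))
kernel = np.ones(5) / 5
smooth = np.apply_along_axis(
    lambda r: np.convolve(r, kernel, mode="same"), 2, noise
)

# Correlation between horizontally adjacent pixels: clearly nonzero, so the
# independence assumption X_i ⊥ X_j fails for images with spatial structure.
left = smooth[:, :, :-1].ravel()
right = smooth[:, :, 1:].ravel()
print(np.corrcoef(left, right)[0, 1])   # noticeably positive (around 0.8 here)
```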
Then how can the law of large numbers apply to GANs when the dataset (sample space) does not cover all possible elements? If I am wrong, what sample space are they considering, and which random variables are they implicitly using, such that the iid condition is satisfied and the law of large numbers applies?