In deep learning, models may learn the probability distribution that generated the dataset. Consider the following paragraph from Chapter 5, "Machine Learning Basics", of the book Deep Learning (by Ian Goodfellow, Yoshua Bengio, and Aaron Courville):
Unsupervised learning algorithms experience a dataset containing many features, then learn useful properties of the structure of this dataset. In the context of deep learning, we usually want to learn the entire probability distribution that generated a dataset, whether explicitly, as in density estimation, or implicitly, for tasks like synthesis or denoising. Some other unsupervised learning algorithms perform other roles, like clustering, which consists of dividing the dataset into clusters of similar examples.
I also read about density estimation in the same chapter, as quoted below:
In the density estimation problem, the machine learning algorithm is asked to learn a function $p_{model} : R^n \rightarrow R$, where $p_{model}(x)$ can be interpreted as a probability density function (if $x$ is continuous) or a probability mass function (if $x$ is discrete) on the space that the examples were drawn from.
This question focuses on explicit probability density estimation in the continuous case, i.e., learning the density function $p_{model}$ directly.
Suppose I have a dataset $D$ with $n$ continuous random variables (features) $X_1, X_2, X_3, \cdots, X_n$, and I know nothing about the probability density function of the individual random variables. That is, I have no information about any $X_i$, such as whether it follows a normal distribution or some other distribution. In that case, is it possible to learn the density function explicitly? Or do I need to provide some information up front, such as the class of probability distributions to be learned?
I am thinking as follows:
If I have some information about $X_i$, for example that $X_i$ follows a well-known distribution, then I can learn the parameters of the underlying density function from $D$. So, is it mandatory to know something about the underlying probability density function?
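To make the case I have in mind concrete, here is a minimal sketch of learning the parameters when the family is assumed known. It assumes, purely for illustration, that the features are jointly Gaussian; the synthetic dataset `D` and the helper names `fit_gaussian_mle` and `gaussian_density` are my own and not from the book.

```python
import numpy as np

def fit_gaussian_mle(data):
    """Maximum-likelihood estimates of the mean and covariance of a
    multivariate normal, given data of shape (num_examples, num_features)."""
    data = np.asarray(data, dtype=float)
    mean = data.mean(axis=0)
    centered = data - mean
    # The MLE of the covariance divides by N (not N - 1).
    cov = centered.T @ centered / data.shape[0]
    return mean, cov

def gaussian_density(x, mean, cov):
    """Evaluate the fitted density p_model(x) at one or more points."""
    x = np.atleast_2d(x)
    d = mean.shape[0]
    diff = x - mean
    # Quadratic form (x - mean)^T cov^{-1} (x - mean) for each row of x.
    quad = np.sum(diff * np.linalg.solve(cov, diff.T).T, axis=1)
    _, logdet = np.linalg.slogdet(cov)
    log_p = -0.5 * (d * np.log(2.0 * np.pi) + logdet + quad)
    return np.exp(log_p)

# Synthetic dataset (an assumption for illustration): 5000 examples,
# 2 features, drawn from a known Gaussian so the fit can be checked.
rng = np.random.default_rng(0)
true_mean = np.array([1.0, -2.0])
true_cov = np.array([[2.0, 0.5], [0.5, 1.0]])
D = rng.multivariate_normal(true_mean, true_cov, size=5000)

mean_hat, cov_hat = fit_gaussian_mle(D)
p_at_mean = gaussian_density(mean_hat, mean_hat, cov_hat)
print("estimated mean:", mean_hat)
print("estimated covariance:\n", cov_hat)
print("p_model at the estimated mean:", p_at_mean[0])
```

Here the only things learned from $D$ are the mean vector and the covariance matrix; the shape of $p_{model}$ is fixed in advance by the Gaussian assumption, which is exactly the prior information I am asking about.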