
The basic idea of MFA is to perform subspace clustering by assuming a covariance structure for each component of the form $\Sigma_i = \Lambda_i \Lambda_i^T + \Psi_i$, where $\Lambda_i \in \mathbb{R}^{D\times d}$ is the factor loadings matrix with $d < D$ for a parsimonious representation of the data, and $\Psi_i$ is a diagonal noise matrix. Note that the mixture of probabilistic principal component analyzers (MPPCA) model is a special case of MFA in which the error distribution is assumed to be isotropic, $\Psi_i = \sigma_i^2 I$.
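To make sure I'm parsing the notation correctly, here is a small sketch I wrote (NumPy; the values of $D$, $d$, and the parameters are arbitrary choices of mine, not from the paper) of one component's low-rank-plus-diagonal covariance, plus the MPPCA special case:

```python
# My own sketch of one MFA component's covariance; D, d, and the
# random parameter values are arbitrary, not taken from the paper.
import numpy as np

rng = np.random.default_rng(0)
D, d = 10, 3                       # observed dimension D, latent dimension d < D

Lambda = rng.normal(size=(D, d))   # factor loadings  Lambda_i in R^{D x d}
Psi = np.diag(rng.uniform(0.1, 0.5, size=D))  # diagonal noise matrix  Psi_i

Sigma = Lambda @ Lambda.T + Psi    # component covariance  Sigma_i

# Lambda Lambda^T alone has rank at most d, so the "signal" part of the
# covariance is confined to a d-dimensional subspace of R^D:
print(np.linalg.matrix_rank(Lambda @ Lambda.T))  # -> 3

# MPPCA special case: isotropic noise  Psi_i = sigma_i^2 I
sigma2 = 0.2
Sigma_ppca = Lambda @ Lambda.T + sigma2 * np.eye(D)
```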

What is meant by subspace clustering here, and how does the covariance structure $\Sigma_i = \Lambda_i \Lambda_i^T + \Psi_i$ accomplish it? I understand that this is a dimensionality reduction technique, since $\text{rank}(\Lambda_i) \leq d < D$. It'd be great if someone could help me understand this better, and/or suggest resources for learning about it as an absolute beginner.
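The one part I can verify numerically is the rank intuition: sampling from a single factor analyzer and inspecting the eigenvalues of the sample covariance shows $d$ dominant directions plus a noise floor, i.e. the data effectively lies near a $d$-dimensional subspace (again my own sketch, with arbitrary parameters):

```python
# Sampling from one factor analyzer  x = Lambda z + u  (my own check,
# arbitrary parameters) to see the d-dimensional structure in the data.
import numpy as np

rng = np.random.default_rng(1)
D, d, n = 10, 3, 5000
Lambda = rng.normal(size=(D, d))
psi = rng.uniform(0.01, 0.05, size=D)        # small diagonal noise variances

z = rng.normal(size=(n, d))                  # z ~ N(0, I)
u = rng.normal(size=(n, D)) * np.sqrt(psi)   # u ~ N(0, Psi), Psi = diag(psi)
x = z @ Lambda.T + u                         # x = Lambda z + u

# Eigenvalues of the sample covariance: ~d large "signal" eigenvalues,
# then a small noise floor set by Psi.
eigvals = np.linalg.eigvalsh(np.cov(x, rowvar=False))[::-1]
print(np.round(eigvals, 3))
```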

From what I understand, $x = \Lambda z + u$ is a single factor analyzer (right?), i.e. the generative model in maximum-likelihood factor analysis. The paper goes on to define a mixture of factor analyzers indexed by $\omega_j$, $j = 1,\ldots,m$. The generative model now obeys the distribution $$P(x) = \sum_{j=1}^m \int P(x|z,\omega_j)P(z|\omega_j)P(\omega_j)\,dz$$ where $P(z|\omega_j) = P(z) = \mathcal{N}(0,I)$. How does this help achieve the desired objective? Why take the sum from $1$ to $m$? Where is the subspace clustering happening, and what is going on at a high level when we use this mixture of factor analyzers?
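My current (possibly wrong) reading is that integrating out $z$ turns this into a Gaussian mixture, $P(x) = \sum_j P(\omega_j)\,\mathcal{N}(x;\, \mu_j,\, \Lambda_j\Lambda_j^T + \Psi_j)$, so "clustering" would mean assigning $x$ to the component with the highest responsibility $P(\omega_j|x)$. Here is a sketch of that assignment (my own code; the means $\mu_j$, mixing weights, and parameter values are made up for illustration):

```python
# My own sketch, not the paper's code: after integrating out z, each
# component is a Gaussian with a low-rank-plus-diagonal covariance, and a
# point x is assigned via the posterior responsibility P(omega_j | x).
import numpy as np
from scipy.stats import multivariate_normal

rng = np.random.default_rng(2)
D, d, m = 10, 3, 2
pi = np.array([0.5, 0.5])                         # mixing weights P(omega_j)
mu = rng.normal(size=(m, D))                      # component means (my addition)
Lambdas = rng.normal(size=(m, D, d))
Psis = [np.diag(rng.uniform(0.05, 0.1, D)) for _ in range(m)]
Sigmas = [Lambdas[j] @ Lambdas[j].T + Psis[j] for j in range(m)]

x = rng.normal(size=D)                            # a test point

# responsibilities  r_j = pi_j N(x; mu_j, Sigma_j) / P(x)
lik = np.array([multivariate_normal.pdf(x, mu[j], Sigmas[j]) for j in range(m)])
r = pi * lik / np.sum(pi * lik)
print(r, "-> assign x to component", np.argmax(r))
```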

stoic-santiago

0 Answers