I ran PCA on data points placed at the corners of a hexagon and got the following principal components:

[Figure: the hexagon's vertices with the two principal components drawn as vectors]

The PCA variance is $0.6$ and is the same for each component. Why is that? Shouldn't it be greater in the horizontal direction than in the vertical direction? The data lie between $-1$ and $1$ in the $x$-direction but only between $-\sqrt{3}/2$ and $\sqrt{3}/2$ in the $y$-direction. Why does PCA produce components of equal length?

The length of each vector in the picture is twice the square root of the variance.

UPDATE: I added more points. The variances changed to $0.477$, but they are still equal.

[Figure: the enlarged point set with its principal components]

UPDATE 2: I added even more points. The variances changed to $0.44$, but they are still equal.

[Figure: the further enlarged point set with its principal components]

1 Answer

Assuming that the $6$ vertices of the hexagon are on the unit circle,

>>> from sympy import *
>>> A = Matrix([[ 1, Rational(1,2),-Rational(1,2), -1, -Rational(1,2), Rational(1,2)], 
                [ 0,     sqrt(3)/2,     sqrt(3)/2,  0,     -sqrt(3)/2,    -sqrt(3)/2]])
>>> A * A.T
Matrix([[3, 0],
        [0, 3]])

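Since ${\bf A} {\bf A}^\top - 3 \, {\bf I}_2 = {\bf O}_2$, every nonzero vector is an eigenvector of ${\bf A} {\bf A}^\top$ with the same eigenvalue $3$. No direction is preferred, so any two orthogonal directions could be the principal components.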
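As a numerical cross-check, here is a minimal sketch with NumPy. It assumes the variance in the question was computed with the usual $n - 1$ normalization (the default of `np.cov`); under that assumption ${\bf A} {\bf A}^\top / 5 = 0.6 \, {\bf I}_2$, which is consistent with the reported value of $0.6$.

```python
import numpy as np

# Vertices of a regular hexagon on the unit circle (one point per column).
angles = np.arange(6) * np.pi / 3
X = np.vstack([np.cos(angles), np.sin(angles)])

# The vertices already have zero mean, so no centering is required.
assert np.allclose(X.mean(axis=1), 0)

# Sample covariance, assuming the n - 1 normalization (np.cov's default).
C = np.cov(X)

# Both eigenvalues coincide, so no principal direction is singled out.
eigenvalues = np.linalg.eigvalsh(C)
print(C)
print(eigenvalues)
```

Because the two eigenvalues are equal, the orthogonal pair that a PCA routine returns depends on the numerical details of the eigensolver and carries no information about the data.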

  • The 6 vertices of the hexagon are on the unit circle, but the points in the second picture are not. Yet for them $AA^T$ is also diagonal: it is `Matrix([[5.25, 0], [0, 5.25]])`, with equal diagonal elements. I wanted to use PCA to find the direction of the largest variation in the data. Does this mean that PCA does not always find such a direction, even if the data are not spherically symmetric? Also, thanks for editing my question. – Vladislav Gladkikh Mar 30 '21 at 00:30
  • @VladislavGladkikh Rotate the point cloud by certain angles, and you obtain a point cloud that looks just like the one you started with, right? In my opinion, in these two cases, there's too much symmetry for a single "direction of the largest variation" to exist. At least in classical PCA. What if you use other norms? – Rodrigo de Azevedo Mar 30 '21 at 09:05
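The symmetry argument in the last comment can be made concrete with a short SymPy sketch. It assumes the centered point cloud maps onto itself under a rotation ${\bf R}$ by $60^\circ$, as the hexagon does; then its covariance ${\bf C}$ must satisfy ${\bf R} {\bf C} {\bf R}^\top = {\bf C}$. The symbols `a`, `b`, `c` are placeholders for the entries of a generic symmetric matrix and do not come from the original posts.

```python
from sympy import Matrix, Rational, solve, sqrt, symbols

# Generic symmetric 2x2 matrix standing in for the covariance.
a, b, c = symbols("a b c", real=True)
C = Matrix([[a, b], [b, c]])

# Rotation by 60 degrees, a symmetry of the regular hexagon.
R = Matrix([[Rational(1, 2), -sqrt(3) / 2],
            [sqrt(3) / 2, Rational(1, 2)]])

# Invariance of the cloud under R forces R C R^T = C.
constraints = list(R * C * R.T - C)

# Solve the resulting linear system for the entries of C.
solution = solve(constraints, [a, b, c], dict=True)
print(solution)
print(C.subs(solution[0]))
```

The only solutions have a zero off-diagonal entry and equal diagonal entries, so the covariance is a multiple of the identity. The same reasoning applies to a rotation by any angle that is not a multiple of $180^\circ$, which is why a cloud with this kind of rotational symmetry has equal variance in every direction even though it is not circular.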