I'm following Andrew Ng's course for Machine Learning and I just don't quite understand the following.

Using PCA to speed up learning

slide1

Using PCA to reduce the number of features, thus lowering the chances for overfitting

slide2

Looking at these two separately, they make perfect sense. But practically speaking, how am I going to know that, when my intention is to speed up learning, I'm not letting the model overfit?

Do I have to find a middle ground between these two scenarios when applying PCA? If so, how exactly can I do that?

nbro
AfiJaabb
    After reading your post more than once, I think I got your question, so I edited your post to clarify what your question seems to be and I put it in the title. Please, make sure that the question in the title is your question. – nbro Feb 24 '21 at 11:58
  • Yes @nbro... that's exactly what I wanted to ask. Thanks for the edit :) – AfiJaabb Feb 25 '21 at 12:27

1 Answer

I'm not sure if I understood your question correctly, but here's my take anyway.

So, PCA is a technique that you can apply to your data to reduce the number of features. In return, (i) it can speed up training, since there are fewer features to do computations with, and (ii) it can help prevent overfitting, since you discard some of the information in your data.
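As a minimal sketch of that dimensionality reduction (using scikit-learn and a toy dataset; the variance threshold is just illustrative):

```python
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA

# Toy dataset: 1797 samples, 64 pixel features each
X, y = load_digits(return_X_y=True)

# Keep enough principal components to explain ~95% of the variance;
# everything beyond that is thrown away
pca = PCA(n_components=0.95)
X_reduced = pca.fit_transform(X)

print(X.shape, "->", X_reduced.shape)  # same rows, fewer columns
```

Training on `X_reduced` instead of `X` is what yields both effects: fewer features to compute with, and less information available to memorize.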

To detect overfitting, you usually monitor the validation and training losses during the training. If your training loss decreases, but your validation loss stays constant or increases, it's likely that your model is overfitting on the training data. In practice, this means your model generalizes worse and you can observe this by measuring the test accuracy.
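A rough sketch of that monitoring loop (the model, dataset, and epoch count are arbitrary choices for illustration):

```python
from sklearn.datasets import load_digits
from sklearn.metrics import log_loss
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

X, y = load_digits(return_X_y=True)
X_train, X_val, y_train, y_val = train_test_split(X, y, random_state=0)

# warm_start + max_iter=1 lets us train one epoch at a time
model = MLPClassifier(hidden_layer_sizes=(64,), max_iter=1,
                      warm_start=True, random_state=0)

train_losses, val_losses = [], []
for epoch in range(30):
    model.fit(X_train, y_train)  # one more epoch on the same weights
    train_losses.append(log_loss(y_train, model.predict_proba(X_train),
                                 labels=model.classes_))
    val_losses.append(log_loss(y_val, model.predict_proba(X_val),
                               labels=model.classes_))

# Symptom of overfitting: train_losses keeps falling while
# val_losses flattens out or starts rising
```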

All in all, you can apply PCA, train a new model, and measure its test accuracy to see whether PCA has successfully prevented overfitting. If it hasn't, you can retrain with other regularization techniques, such as weight decay.

After your edit adding the slides:

Basically, what the slides claim is that PCA could be a bad way to prevent overfitting compared to standard regularization methods. To see whether this is the case, the standard approach is to measure your model's performance on a validation dataset. If PCA throws away too much information and thereby causes your model to underfit, your validation accuracy will be noticeably worse than with standard regularization techniques.
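The comparison can be run directly, e.g. with scikit-learn pipelines (the variance threshold and regularization strength `C` here are arbitrary; you would normally tune them on the validation set):

```python
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_digits(return_X_y=True)
X_train, X_val, y_train, y_val = train_test_split(X, y, random_state=0)

# Option A: reduce features with PCA, then fit a classifier
pca_model = make_pipeline(StandardScaler(), PCA(n_components=0.8),
                          LogisticRegression(max_iter=1000))
# Option B: keep all features, rely on L2 regularization instead
reg_model = make_pipeline(StandardScaler(),
                          LogisticRegression(C=0.1, max_iter=1000))

scores = {}
for name, m in [("PCA", pca_model), ("L2 regularization", reg_model)]:
    m.fit(X_train, y_train)
    scores[name] = m.score(X_val, y_val)  # validation accuracy
    print(name, scores[name])
```

If the PCA pipeline's validation accuracy is markedly lower, that is the underfitting the slides warn about.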

SpiderRico