Is the VC dimension of a MLP regressor a valid upper bound on how many points it can exactly fit?

Question

I want to calculate an upper bound on how many training points an MLP regressor can fit with ~0 error. I don't care about the test error, I want to overfit as much as possible the (few) training points.

For example, for linear regression, it's impossible to achieve 0 MSE if the points don't lie on a line. An MLP, however, can overfit and include these points in the prediction.

So, my question is: given an MLP and its parameter, how can I calculate an upper bound on how many points it can exactly fit?

I was thinking to use the VC dimension to estimate this upper bound.
The VCdim is a metric for binary classification models but a pseudo dimension can be adapted to real-value regression models by thresholding the output:

$$Pdim(\mathcal{G}) = {VCdim}(\{(x,t) \mapsto 1_{g(x)-t>0}:g \in \mathcal{G}\})$$ (from the book Foundation of Machine Learning, 2nd edition, definition 11.5)

where $\mathcal{G}$ is the concept class of the regressor, $g(x)$ is the regressor out and $t$ is a threshold.

The model is an MLP with RELU activations. As far as I understood on Wikipedia, it should have a VCdim equal to the number of the weights (correct me if I'm wrong).

So, the question is: how to practically calculate the pseudo-dim for the regressor given the VCdim? Does it make sense for the purpose that I want to achieve?

This question was also asked [here](https://stats.stackexchange.com/q/553200). — nbro, Nov 24 '21 at 12:14

Is the VC dimension of a MLP regressor a valid upper bound on how many points it can exactly fit?

0 Answers0