We can say that matrix factorization of a matrix $R$ is, in general, finding two matrices $P$ and $Q$ such that $R \approx PQ^{T}$, with some constraints on $P$ and $Q$. Looking at matrix factorization algorithms such as Scikit-Learn's Non-Negative Matrix Factorization, I started to wonder how this works for recommender systems. With recommender systems we generally have a user-item ratings matrix, let's denote it $R$, which is very sparse, so in real datasets we find many missing values ($NaN$). In the examples of matrix factorization for recommender systems that I have seen, the missing values are replaced with $0$. My question is: how do we get actual predictions for the items not rated by users, when the product $PQ^{T}$ is supposed to converge to $R$, zeros included?
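To make the objective concrete, my understanding is that (for NMF, say) the fit solves something like

$$\min_{P \ge 0,\; Q \ge 0} \; \|R - PQ^{T}\|_{F}^{2},$$

where $\|\cdot\|_{F}$ is the Frobenius norm, so every entry of $R$, including the filled-in $0$s, contributes to the loss.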
I have tried this with a simple matrix that I found here:
import numpy as np

# Toy user-item ratings matrix; 0 marks a missing rating
R = np.array([
    [5, 3, 0, 1],
    [4, 0, 0, 1],
    [1, 1, 0, 5],
    [1, 0, 0, 4],
    [0, 1, 5, 4],
])
The algorithm I used is Scikit-Learn's, and no matter how I change the parameters, I can't seem to get actual values in place of the $0$s; it always finds a really good approximation of $R$, zeros included. Maybe all the hyperparameter tuning I'm doing is leading to overfitting. But suppose there is a combination of parameters for which the reconstruction doesn't have $0$s and still minimizes $\|R - PQ^{T}\|$ with respect to some norm to a decent level: how can we then be sure that the predictions are accurate? There must be many different combinations of parameters that both predict different values for the $0$s and minimize $\|R - PQ^{T}\|$ to a decent level.
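For reference, here is a minimal sketch of the kind of code I'm running (the choice of n_components=2 is just one of the settings I tried, not a claim about the right rank):

import numpy as np
from sklearn.decomposition import NMF

# Same toy ratings matrix as above; 0 marks a missing rating
R = np.array([
    [5, 3, 0, 1],
    [4, 0, 0, 1],
    [1, 1, 0, 5],
    [1, 0, 0, 4],
    [0, 1, 5, 4],
])

# Factor R into two non-negative matrices; rank k=2 is just one guess
model = NMF(n_components=2, init='random', random_state=0, max_iter=500)
P = model.fit_transform(R)   # corresponds to P in the notation above
Qt = model.components_       # corresponds to Q^T

R_hat = P @ Qt               # reconstruction that approximates R
print(np.round(R_hat, 2))    # entries at the original 0s stay close to 0

Because the $0$s are treated as observed ratings, the reconstruction drives those entries back toward $0$ rather than toward a plausible rating, which is exactly the behavior I'm asking about.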
Thank you!