Assume that I have a candidate selection system to generate product/user pairs for recommendation. Currently, in order to hold a quality bar for the recommended product, we trained a model to optimize for the click of the link, denoting as pClick(product, user) model, the output of the model is a score of (0,1) representing how likely the user will click on the recommended product.
For our initial launch product, we set a manually selected threshold, say T for all users. For all users, only when the threshold pass T, we will send user the recommendation.
Now we realize this is not optimal: Some users care less about recommendation quality while some other users have a high bar of recommendation quality. And a personalized threshold, instead of the global T can help us improve the overall relevance.
The goal is to output the threshold for each user, assume we have training data for each user's activity and user/product attributes.
The question is: How should we model this problem with machine learning? Any reference or papers is highly appreciated.