What loss function is most appropriate when training a model with target values that are probabilities? For example, I have a 3-output model. I want to train it with a feature vector $x=[x_1, x_2, \dots, x_N]$ and a target $y=[0.2, 0.3, 0.5]$.
Cross-entropy doesn't seem to make sense here, since (as I understand it) it assumes that a single class is the correct label.
Would something like MSE (after applying softmax) make sense, or is there a better loss function?
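
To make the two options concrete, here is a minimal sketch in NumPy that evaluates both candidates against the soft target above: MSE on the softmax output, and the general cross-entropy formula $-\sum_i y_i \log p_i$ applied directly to the probability target. The logits are made-up values for illustration, not the output of a real model, and the function names are my own.

```python
import numpy as np

def softmax(z):
    # Subtract the max for numerical stability before exponentiating.
    z = z - np.max(z)
    e = np.exp(z)
    return e / e.sum()

def mse_loss(p, y):
    # Mean squared error between predicted and target probabilities.
    return float(np.mean((p - y) ** 2))

def soft_cross_entropy(p, y, eps=1e-12):
    # Cross-entropy with a full target distribution y instead of a one-hot label.
    return float(-np.sum(y * np.log(p + eps)))

y = np.array([0.2, 0.3, 0.5])        # target probabilities from the question
logits = np.array([0.1, 0.4, 1.0])   # hypothetical raw model outputs
p = softmax(logits)

print("softmax output:", p)
print("MSE:", mse_loss(p, y))
print("cross-entropy:", soft_cross_entropy(p, y))
```

Note that with a soft target the cross-entropy does not go to zero even for a perfect prediction; its minimum is the entropy of $y$, whereas the MSE is zero when $p = y$.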