I have $n$ classes and an input. My data is organised as follows: each input has a label consisting of 2 classes, which can be the same class twice or two different classes.
So I want to output a vector containing only 0s, 1s and 2s whose sum is 2, similar to what we would do in a regular classification model where the label is a single class. I'm just a bit confused about how to handle the standard rule that the prediction is the argmax, because here we would have to make arbitrary rules about when to predict the same class twice and when to predict two distinct classes. For example, with `[0, 0.5, 0.4, 1.1, 0, 0]` it's a bit hard to determine whether class 4 is predicted twice, or class 2 once and class 4 once, etc.
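To make the ambiguity concrete, here is a minimal sketch of one possible decoding rule: enumerate every valid target (an unordered pair of classes, repeats allowed) and pick the one closest to the score vector in squared Euclidean distance. This is only an assumption about how one could replace the argmax, not an established standard, and the function names `pair_to_counts` and `decode_scores` are hypothetical. Indices are 0-based in the code, so "class 4" above is index 3.

```python
from itertools import combinations_with_replacement

import numpy as np


def pair_to_counts(pair, n_classes):
    """Encode an unordered pair of 0-based class indices as a count vector summing to 2."""
    counts = np.zeros(n_classes, dtype=int)
    for c in pair:
        counts[c] += 1
    return counts


def decode_scores(scores):
    """Return the valid count vector (entries 0/1/2, sum 2) nearest to `scores`.

    Assumption: "nearest" means smallest squared Euclidean distance.
    Other distances or a likelihood-based rule would be equally defensible.
    """
    scores = np.asarray(scores, dtype=float)
    n = len(scores)
    # All n*(n+1)/2 unordered pairs, including the same class twice.
    candidates = [
        pair_to_counts(p, n) for p in combinations_with_replacement(range(n), 2)
    ]
    distances = [np.sum((scores - c) ** 2) for c in candidates]
    return candidates[int(np.argmin(distances))]


scores = [0, 0.5, 0.4, 1.1, 0, 0]
print(decode_scores(scores))  # [0 1 0 1 0 0]
```

Under this particular rule the example decodes to class 2 once and class 4 once, because that vector is at squared distance 0.42 from the scores while "class 4 twice" is at 1.22. A different rule could give a different answer, which is exactly the arbitrariness I'm worried about.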
Is this a thing, and is there a scientific term for it? Am I not supposed to do this because it breaks something fundamental about classifiers? I'm planning on training Neural Networks, XGBoost, Random Forest and Extremely Randomized Trees and comparing the different methods. Are there other methods where this would be possible, i.e. predicting a vector whose sum is 2?