I have a model with 3 outputs (it is a regression task: the steering wheel angle, brake, and acceleration). I can divide the target values into smaller bins and in this way turn the task into a classification problem. Then I can balance the data so that each bin has the same number of data points.
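To make the binning step concrete, here is a minimal sketch of what I mean. The column names (`steering`, `brake`, `accel`), the synthetic data and the number of bins are assumptions for illustration only, not my real dataset.

```python
import numpy as np
import pandas as pd

# Hypothetical targets: names, distributions and ranges are made up for illustration.
rng = np.random.default_rng(0)
n = 1000
df = pd.DataFrame({
    "steering": rng.normal(0.0, 10.0, n),  # angle, mostly close to zero
    "brake": rng.beta(1, 8, n),            # in [0, 1], mostly small
    "accel": rng.beta(2, 5, n),            # in [0, 1]
})

N_BINS = 5  # assumed number of equal-width bins per target

# pd.cut with labels=False returns the integer bin index (0 .. N_BINS - 1).
binned = pd.DataFrame({
    col: pd.cut(df[col], bins=N_BINS, labels=False) for col in df.columns
})

# Samples per bin for each target; the counts show how imbalanced each one is.
for col in binned.columns:
    print(col, binned[col].value_counts().sort_index().to_dict())
```

Equal-width bins (`pd.cut`) keep the imbalance visible. Quantile bins (`pd.qcut`) would give roughly equal counts per bin by construction, but bins of very different widths.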
But now I wonder how to balance this data correctly.
I found some good resources and libraries:
imbalanced-learn | Python official documentation
multi-imbalance | Python official documentation
Multi-imbalance | Poznan University of Technology
But to my understanding, these algorithms can deal with imbalanced data (in binary and multi-class classification) only if you have one output. I have 3 outputs, and these outputs may be correlated. How do I balance them correctly?
I thought of 2 ideas:
1. Create tuples of the 3 (binned) target values and balance the data so that every distinct tuple occurs the same number of times. But you can end up in this situation: (A, X, 1), (A, Y, 2), (A, Y, 3), (B, Z, 3). These tuples are all different, but most of them have the value A in the first position, so the data is still quite imbalanced.
2. Balance the data iteratively, considering only one column at a time: balance the first column, then the second column, and so on.
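Both ideas can be sketched with plain random oversampling in pandas. This is only a toy illustration: the column names `t1`, `t2`, `t3`, the counts per tuple and the helper `oversample_to_max` are hypothetical, and the data reuses the four tuples from the example in idea 1.

```python
import pandas as pd

# Toy binned targets built from the four example tuples; the counts are made up.
rows = (
    [("A", "X", 1)] * 40
    + [("A", "Y", 2)] * 15
    + [("A", "Y", 3)] * 20
    + [("B", "Z", 3)] * 10
)
df = pd.DataFrame(rows, columns=["t1", "t2", "t3"])
targets = ["t1", "t2", "t3"]

def oversample_to_max(frame, by, seed=0):
    """Randomly oversample each group defined by `by` up to the largest group size."""
    n_max = int(frame.groupby(by).size().max())
    parts = [
        group.sample(n=n_max, replace=True, random_state=seed)
        for _, group in frame.groupby(by)
    ]
    return pd.concat(parts, ignore_index=True)

# Idea 1: balance on the joint tuple (t1, t2, t3).
joint = oversample_to_max(df, targets)
print("joint tuples:", joint.groupby(targets).size().to_dict())
print("t1 after joint balancing:", joint["t1"].value_counts().to_dict())

# Idea 2: balance one column at a time, in order.
iterative = df
for col in targets:
    iterative = oversample_to_max(iterative, col)
print("t1 after iterative balancing:", iterative["t1"].value_counts().to_dict())
print("t3 after iterative balancing:", iterative["t3"].value_counts().to_dict())
```

In this toy data, joint balancing makes every tuple equally frequent, but A still appears three times as often as B in the first column. Iterative balancing leaves only the last column balanced: each later pass resamples the rows and disturbs the balance reached for the earlier columns, so `t1` ends up imbalanced again.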
Are these ideas good or not? Are there other options for balancing data when you have multiple targets?