I am trying to generate a model that uses several physicochemical properties of a molecule (including number of atoms, number of rings, volume, etc.) to predict a numeric value $Y$. I would like to use PLS Regression, and I understand that standardization is very important here. I am programming in Python, using scikit-learn.
The types and ranges of the features vary. Some are int64, while others are floating-point numbers. Some features generally take small (positive or negative) values, while others take very large values. I have tried various scalers (e.g. standard scaler, normalize, min-max scaler), yet $R^2$ and $Q^2$ are still low.
I have a few questions:
Is it possible that by scaling, some of the very important features lose their significance, and thus contribute less to explaining the variance of the response variable?
If so, and I identify some important features (by expert knowledge), is it OK to scale all features except those? Or to scale only the important features?
Some of the features, although not always correlated, have values in a similar range (e.g. 100 to 400), unlike others (e.g. -1 to 10). Is it possible to scale only a specific group of features that share the same range?