I'm reading this paper and it says:
In this paper, we present a multi-class embedded feature selection method called sparse optimal scoring with adjustment (SOSA), which is capable of addressing the data heterogeneity issue. We propose to perform feature selection on the adjusted data obtained by estimating and removing the unknown data heterogeneity from the original data. Our feature selection is formulated as a sparse optimal scoring problem by imposing $\ell_{2,1}$-norm regularization on the coefficient matrix, which can hence be solved effectively by a proximal gradient algorithm. This allows our method to handle multi-class feature selection and classification simultaneously for heterogeneous data.
What is $\ell_{2,1}$-norm regularization? Is it L1 regularization or L2 regularization?
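
To make the question concrete, here is a minimal sketch of how the $\ell_{2,1}$ norm is commonly defined. It assumes the usual convention (which I have not confirmed from the paper) that each row of the coefficient matrix corresponds to one feature: take the L2 norm of every row, then sum those row norms, which is an L1 norm across rows. The function names `l21_norm` and `prox_l21` are my own, and the proximal step shown is the standard row-wise soft-thresholding, not necessarily the paper's exact algorithm.

```python
import numpy as np

def l21_norm(W):
    """Sum of the Euclidean (L2) norms of the rows of W."""
    return float(np.sum(np.linalg.norm(W, axis=1)))

def prox_l21(W, t):
    """Proximal operator of t * ||W||_{2,1} (row-wise soft-thresholding)."""
    row_norms = np.linalg.norm(W, axis=1, keepdims=True)
    # Each row is scaled by max(0, 1 - t / ||row||); rows with norm <= t become zero.
    # The small floor only guards against division by zero for all-zero rows.
    scale = np.maximum(0.0, 1.0 - t / np.maximum(row_norms, 1e-12))
    return W * scale

# Assumed toy coefficient matrix: 3 features (rows) by 2 outputs (columns).
W = np.array([[3.0, 4.0],
              [0.3, 0.4],
              [0.0, 0.0]])

print(l21_norm(W))       # row norms are 5, 0.5 and 0
print(prox_l21(W, 1.0))  # rows whose norm is at most 1 are zeroed out entirely
```

Under this convention the penalty acts like L2 within a row and like L1 across rows, so whole rows (whole features) are driven to zero together rather than individual entries.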