
I'm reading this paper and it says:

In this paper, we present a multi-class embedded feature selection method called as sparse optimal scoring with adjustment (SOSA), which is capable of addressing the data heterogeneity issue. We propose to perform feature selection on the adjusted data obtained by estimating and removing the unknown data heterogeneity from original data. Our feature selection is formulated as a sparse optimal scoring problem by imposing $\ell_{2, 1}$-norm regularization on the coefficient matrix which hence can be solved effectively by proximal gradient algorithm. This allows our method can well handle the multi-class feature selection and classification simultaneously for heterogenous data

What is the $\ell_{2, 1}$ norm regularization? Is it L1 regularization or L2 regularization?

    It's a matrix norm. You have it [here](https://en.wikipedia.org/wiki/Matrix_norm#L2,1_and_Lp,q_norms). – Brale Dec 30 '19 at 22:13

1 Answer


$\ell_{2,1}$ is a matrix norm, as stated in the paper. For a matrix $A \in \mathbb{R}^{r\times c}$, we have $$\|A\|_{2,1} = \sum_{i=1}^r \sqrt{\sum_{j=1}^c A_{ij}^2}.$$ You first apply the $\ell_2$ norm to each row of $A$ (the inner sum runs over the columns $j$), which gives a vector in $\mathbb{R}^r$. Then you apply the $\ell_1$ norm to that vector to obtain a real number. So it is neither plain L1 nor plain L2 regularization, but a combination of the two: $\ell_2$ within each row, $\ell_1$ across rows. Because the outer $\ell_1$ part promotes sparsity at the level of whole rows, penalizing $\|A\|_{2,1}$ drives entire rows of the coefficient matrix to zero, which is why it is popular for feature selection, as in the paper you quote. This notation generalizes to a norm $\ell_{p,q}$ for any $p, q \geq 1$.
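A minimal NumPy sketch of the definition above (the function name `l21_norm` is just for illustration):

```python
import numpy as np

def l21_norm(A):
    """Sum of the Euclidean (l2) norms of the rows of A."""
    # Inner step: l2 norm of each row -> vector of length r
    row_norms = np.sqrt(np.sum(A**2, axis=1))
    # Outer step: l1 norm of that (nonnegative) vector -> scalar
    return np.sum(row_norms)

A = np.array([[3.0, 4.0],
              [0.0, 0.0],
              [5.0, 12.0]])
print(l21_norm(A))  # row norms are 5, 0, 13, so the result is 18.0
```

Equivalently, `np.linalg.norm(A, axis=1).sum()` computes the same quantity. Note how the all-zero row contributes nothing: that row-level sparsity is exactly the structure the regularizer encourages.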