Assume we have a large $m \times n$ input dataset with an $m \times 1$ output vector. It is a binary classification problem: each output is either $1$ or $0$.
The problem is that almost all elements of the output vector are $0$s, with very few $1$s (i.e. it is a sparse vector). If the neural network simply "learned" to always output $0$, it would achieve high accuracy, but I am also interested in learning when the $1$s occur.
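To illustrate the issue, here is a small sketch with made-up numbers (1000 samples, 10 of them positive; both figures are hypothetical, not taken from my real data). A predictor that always outputs $0$ scores very well on accuracy while finding none of the $1$s:

```python
import numpy as np

# Hypothetical labels: 1000 samples, only 10 of them are 1.
y_true = np.zeros(1000, dtype=int)
y_true[:10] = 1

# A degenerate "model" that always predicts 0.
y_pred = np.zeros_like(y_true)

# Accuracy: fraction of samples predicted correctly.
accuracy = np.mean(y_pred == y_true)

# Recall: fraction of the actual 1s that were predicted as 1.
recall = np.sum((y_pred == 1) & (y_true == 1)) / np.sum(y_true == 1)

print(f"accuracy = {accuracy:.3f}, recall = {recall:.3f}")
# prints: accuracy = 0.990, recall = 0.000
```

So accuracy alone says almost nothing about how well the $1$s are detected.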
One possible approach I thought of is to write a custom loss function that gives more weight to the $1$s, but I'm not sure whether this would be a good solution.
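To make the idea concrete, here is a minimal sketch of such a weighted loss in NumPy. The function name `weighted_bce`, the `pos_weight` parameter, and the choice of weight (the ratio of $0$s to $1$s) are my own assumptions for illustration, not an established recipe:

```python
import numpy as np

def weighted_bce(y_true, p_pred, pos_weight, eps=1e-7):
    """Binary cross-entropy where errors on the 1s count pos_weight times more.

    y_true: array of 0/1 labels; p_pred: predicted probabilities of class 1.
    (Hypothetical helper, written only to illustrate the idea.)
    """
    # Clip probabilities to avoid log(0).
    p = np.clip(p_pred, eps, 1 - eps)
    per_sample = -(pos_weight * y_true * np.log(p)
                   + (1 - y_true) * np.log(1 - p))
    return per_sample.mean()

# Toy example: four 0s and one 1.
y_true = np.array([0, 0, 0, 0, 1], dtype=float)

# Assumed heuristic: weight the 1s by (number of 0s) / (number of 1s).
pos_weight = (y_true == 0).sum() / (y_true == 1).sum()

# A model that is almost certain every sample is 0.
always_zero = np.full(5, 0.01)

plain = weighted_bce(y_true, always_zero, pos_weight=1.0)
weighted = weighted_bce(y_true, always_zero, pos_weight=pos_weight)
print(f"plain loss = {plain:.3f}, weighted loss = {weighted:.3f}")
```

With `pos_weight=1.0` this reduces to the ordinary binary cross-entropy. With a larger weight, the "always 0" prediction is penalized more heavily, which is the effect I am after.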
What kind of strategy can be applied to detect such outliers?