How to design a classifier while the patterns of positive data are changing rapidly?

Question

In some situation, like risk detection and spam detection. The pattern of Good User is stable, while the patterns of Attackers are changing rapidly. How can I make a model for that? Or which classifier/method should I use?

score 2 · Accepted Answer · answered Sep 14 '18 at 09:01

The phenomenon where the prediction targets (in your case, behaviour) change over time is referred to as "concept drift".

If you search for that term, you'll find that there have been many publications attempting to tackle that over multiple decades, way too many papers to all summarize here in a single answer. It's still a difficult problem though, by no means a "solved" problem.

Two different, broad directions for ideas are:

Frequently re-training (offline) static models on the most recent training data
Using online learning approaches that can continuously be updated from a data stream, online as new labelled data becomes available.

This github page contains a large list of papers on credit card fraud detection, where the problem you describe occurs because fraudsters change their behaviour in an attempt to evade detection. Most of those papers discuss variants of the first approach. Basically, many of those papers use an ensemble of multiple Random Forests. Every day, new labelled data becomes available. They often then remove the oldest of multiple Random Forests, and add a new Random Forest trained on the most recent data made available that day.

There are also some variants where they don't always train new models at a fixed schedule (e.g., every day), but try to detect when the statistical properties of the data have changed using statistical tests, and only train new models when it is "necessary" (due to such changes).

For the second idea, you'll often be thinking of approaches that use Stochastic Gradient Descent-like approaches for learning; with a non-decreasing learning rate / step size, such techniques will naturally, slowly "forget" what they have learned from old data, and focus more on the latter data.

If you have some method to obtain accurate labels for certain instances relatively quickly, you could consider an approach like the one proposed in this paper (disclaimer: I'm an author on this paper). For example, in that paper the assumption is that human experts can relatively quickly investigate and obtain accurate labels for a small selection of transactions, and this can be exploited to quickly learn in an online manner.

How to design a classifier while the patterns of positive data are changing rapidly?

1 Answers1