Much of the meta-learning literature deals with the few-shot learning problem: using data from a diverse set of "source" tasks (the meta-dataset) to train a model that can quickly learn to solve a new, previously unseen "target" task. Here, "quickly" means that the training set for the target task is very small, i.e., it contains only a few labeled samples.
I am facing a "many-shot" meta-learning problem, in which the main limitation is not the amount of training data for the target task, but rather very stringent accuracy requirements. If I fine-tune a "vanilla" pre-trained model on my target task (one trained using ERM on the entire pooled meta-dataset of source tasks), I need a target dataset of ~10M samples to reach the required accuracy. My goal is to reach the same, or nearly the same, accuracy with 1-2 orders of magnitude fewer labeled training samples (so ~100K to 1M target samples). If it helps, I also have a nearly unlimited supply of unlabeled samples from the target task. To achieve this goal, I of course have a (labeled) meta-dataset of many similar tasks.
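To make the baseline concrete, here is a toy numpy sketch of the "vanilla" pipeline described above: pooled-ERM "pre-training" on source tasks, then warm-started fine-tuning on a small labeled target set. All the task distributions, sizes, and names (`w_pre`, `w_ft`, etc.) are made up for illustration and are nowhere near the real scale of the problem.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 5  # feature dimension (toy scale)

# Source and target tasks are linear-regression tasks whose weight
# vectors cluster around a shared mean -- a stand-in for a meta-dataset
# of "many similar tasks".
w_mean = rng.normal(size=d)

def make_task():
    """Draw a task: a weight vector near the shared mean."""
    return w_mean + 0.1 * rng.normal(size=d)

def sample_data(w, n, noise=0.1):
    """Draw n labeled samples from the task with weights w."""
    X = rng.normal(size=(n, d))
    y = X @ w + noise * rng.normal(size=n)
    return X, y

# "Vanilla" pre-training: ERM on the pooled meta-dataset of source tasks.
pools = [sample_data(make_task(), 200) for _ in range(50)]
X_pool = np.vstack([X for X, _ in pools])
y_pool = np.concatenate([y for _, y in pools])
w_pre = np.linalg.lstsq(X_pool, y_pool, rcond=None)[0]

# Target task: fine-tune the pre-trained weights on a small labeled set
# with plain gradient descent, warm-started from w_pre.
w_target = make_task()
X_train, y_train = sample_data(w_target, 50)
w_ft = w_pre.copy()
lr = 0.01
for _ in range(500):
    grad = 2 * X_train.T @ (X_train @ w_ft - y_train) / len(y_train)
    w_ft -= lr * grad

# Held-out target data: fine-tuning should beat the pooled ERM solution,
# but the number of labeled target samples it needs is the quantity of
# interest in the question above.
X_test, y_test = sample_data(w_target, 2000)
mse_pre = np.mean((X_test @ w_pre - y_test) ** 2)
mse_ft = np.mean((X_test @ w_ft - y_test) ** 2)
```

The question is essentially how to use the meta-dataset (and possibly the unlabeled target data) to shrink `X_train` while keeping `mse_ft` at the level this baseline reaches with a much larger target set.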
In other words, while few-shot meta-learning algorithms deal (informally speaking) with the "maxmin" problem of first fixing a tiny target dataset size (the number of shots) and then maximizing accuracy, I'm facing the "minmax" problem of first fixing the required accuracy and then minimizing the dataset size without hurting performance too much.
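Abusing notation a bit (with $\mathcal{A}$ a learning algorithm, $n$ the labeled target-set size, and $\mathrm{acc}(\mathcal{A}, n)$ the resulting target accuracy, all introduced here just for illustration), the contrast might be written as:

```latex
% Few-shot meta-learning: the shot budget n is fixed (and tiny);
% choose the algorithm to maximize target accuracy.
\max_{\mathcal{A}} \; \mathrm{acc}(\mathcal{A}, n)
\quad \text{s.t.} \quad n = n_0 \ \text{(a few shots)}

% The "many-shot" problem here: the accuracy requirement acc* is fixed;
% minimize the number of labeled target samples needed to meet it.
\min_{n} \; n
\quad \text{s.t.} \quad \max_{\mathcal{A}} \mathrm{acc}(\mathcal{A}, n) \ge \mathrm{acc}^*
```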
I was not able to find any references that deal with this "many-shot meta-learning" scenario. So my questions are:
- Which meta-learning approaches would be well-suited to deal with this scenario?
- Do you know of references to works that have studied this scenario?