How can the agent be defined in a reinforcement learning problem with a tabular dataset as the environment?

Question

Let's assume we need to train an RL model that drops duplicates in a tabular dataset. The actions should probably be defined as "drop" or "do nothing".

But what should the agent itself be, then? To me, it doesn't make sense to see it as just a navigator that loops over the states (dataset indices from the first to the last) and decides which rows to drop.
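To make the framing concrete, here is a minimal sketch of how the pieces might be separated. It is only one possible reading of the setup: the class `DedupEnv`, the function `train`, the binary observation ("does the current row match a row already kept?") and the +1/-1 reward are all assumptions for illustration, not an established formulation. In this sketch the environment owns the dataset and the row pointer, while the agent is just the learned mapping from observation to action (here a small table `q`, updated with a one-step, bandit-style rule rather than full Q-learning).

```python
import random

KEEP, DROP = 0, 1


class DedupEnv:
    """Hypothetical environment: one episode is one pass over the rows."""

    def __init__(self, rows):
        self.rows = [tuple(r) for r in rows]

    def reset(self):
        self.i = 0          # index of the row currently shown to the agent
        self.kept = set()   # rows the agent has decided to keep so far
        return self._obs()

    def _obs(self):
        # Observation: 1 if the current row equals a row already kept, else 0.
        return int(self.rows[self.i] in self.kept)

    def step(self, action):
        row = self.rows[self.i]
        is_dup = row in self.kept
        # Assumed reward: +1 for dropping a duplicate or keeping a new row, -1 otherwise.
        reward = 1.0 if (action == DROP) == is_dup else -1.0
        if action == KEEP:
            self.kept.add(row)
        self.i += 1
        done = self.i >= len(self.rows)
        obs = 0 if done else self._obs()
        return obs, reward, done


def train(env, episodes=200, alpha=0.5, epsilon=0.1, seed=0):
    """Epsilon-greedy agent with a one-step value table q[obs][action]."""
    rng = random.Random(seed)
    q = [[0.0, 0.0], [0.0, 0.0]]
    for _ in range(episodes):
        obs, done = env.reset(), False
        while not done:
            if rng.random() < epsilon:
                action = rng.choice((KEEP, DROP))
            else:
                action = max((KEEP, DROP), key=lambda a: q[obs][a])
            next_obs, reward, done = env.step(action)
            # Move the estimate toward the immediate reward (no bootstrapping).
            q[obs][action] += alpha * (reward - q[obs][action])
            obs = next_obs
    return q


rows = [(1, "a"), (2, "b"), (1, "a"), (3, "c"), (2, "b")]
q = train(DedupEnv(rows))
```

With this split, "looping over the indices" is part of the environment's dynamics, and the agent is only the decision rule that gets improved from rewards.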

To me the approach of using RL for such a problem sounds a bit strange. RL is commonly applied to episodic control tasks. So each state $s_{t}$ usually has something to do with the state $s_{t-1}$ before that. This of course can be a property of tabular data as well. For identifying duplicates that are 100% similar you probably wouldn't apply ML, right? And for finding duplicates based on some similarity you could train an encoder model and compute some distance between any two samples in the latent space to find duplicates. — Chillston, Mar 21 '22 at 20:17
Duplicate dropping is just an assumption to give an example of how an RL model can be used with tabular data, because it's the simplest case I could think of. If you read the question carefully, it's not the actual point; even pandas can do it in one line. — aby, Mar 24 '22 at 15:02
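For reference, the two non-RL routes mentioned in the comments above can be sketched as follows. The pandas call `DataFrame.drop_duplicates()` handles exact duplicates. The near-duplicate part is an assumption-laden stand-in: no encoder is trained here, the raw feature columns are simply treated as the "latent" vectors, and the helper `near_duplicate_pairs`, the Euclidean distance, the toy data and the threshold are all hypothetical choices for illustration.

```python
import numpy as np
import pandas as pd

# Toy data (made up for this sketch): row 2 repeats row 0, row 3 is almost row 0.
df = pd.DataFrame({
    "x": [1.0, 2.0, 1.0, 1.01, 5.0],
    "y": [1.0, 3.0, 1.0, 1.0, 0.5],
})

# Exact duplicates: the one-line pandas route.
exact_deduped = df.drop_duplicates()


def near_duplicate_pairs(embeddings, threshold):
    """Return index pairs (i, j), i < j, whose Euclidean distance is below threshold."""
    emb = np.asarray(embeddings, dtype=float)
    diffs = emb[:, None, :] - emb[None, :, :]   # pairwise differences
    dists = np.linalg.norm(diffs, axis=-1)      # pairwise distance matrix
    # Keep only the strict upper triangle so each pair is reported once.
    i_idx, j_idx = np.where(np.triu(dists < threshold, k=1))
    return list(zip(i_idx.tolist(), j_idx.tolist()))


# Stand-in for a learned encoder: use the raw columns as the embedding.
pairs = near_duplicate_pairs(df.to_numpy(), threshold=0.05)
```

The pairwise distance matrix grows quadratically with the number of rows, so this form is only suitable for small tables.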


0 Answers