
I'm trying to tackle the problem of feature selection as an RL problem, inspired by the paper "Feature Selection as a One-Player Game". I know that Monte Carlo tree search (MCTS) is hardly RL.

So I used MCTS for this problem, where nodes are subsets of features and edges correspond to adding a single feature to the current subset. It does converge to the optimal subset, albeit slowly.
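To make my setup concrete, here is a minimal sketch of what I'm doing. The evaluation function is a stand-in (in my real code it's an SVM's accuracy; here it just rewards a hypothetical "relevant" set of features so the sketch runs on its own), and the node/rollout details are simplified:

```python
import math
import random

# Hypothetical stand-in for the real evaluator (an SVM's CV accuracy in my
# code): reward = fraction of the chosen subset that is "relevant".
RELEVANT = {0, 1, 2}

def evaluate(subset):
    if not subset:
        return 0.0
    return len(subset & RELEVANT) / len(subset)

class Node:
    def __init__(self, subset, n_features, parent=None):
        self.subset = frozenset(subset)
        self.parent = parent
        self.children = {}  # feature index -> child Node, created lazily
        self.untried = [f for f in range(n_features) if f not in subset]
        self.visits = 0
        self.total = 0.0

    def ucb_child(self, c=1.4):
        # UCB1: exploit average reward, explore rarely-visited children.
        return max(self.children.values(),
                   key=lambda ch: ch.total / ch.visits
                   + c * math.sqrt(math.log(self.visits) / ch.visits))

def mcts(n_features, iterations=500, rollout_len=3):
    root = Node(frozenset(), n_features)
    for _ in range(iterations):
        node = root
        # Selection: descend while the node is fully expanded.
        while not node.untried and node.children:
            node = node.ucb_child()
        # Expansion: materialize ONE child (one new feature), not all of them.
        if node.untried:
            f = node.untried.pop(random.randrange(len(node.untried)))
            child = Node(node.subset | {f}, n_features, parent=node)
            node.children[f] = child
            node = child
        # Rollout: add a few random features, then evaluate the subset.
        subset = set(node.subset)
        free = [f for f in range(n_features) if f not in subset]
        for f in random.sample(free, min(rollout_len, len(free))):
            subset.add(f)
        reward = evaluate(subset)
        # Backpropagation: update statistics up to the root.
        while node is not None:
            node.visits += 1
            node.total += reward
            node = node.parent
    # Most-visited child of the root = most promising first feature.
    best = max(root.children.values(), key=lambda ch: ch.visits)
    return best.subset

random.seed(0)
best_first_feature = mcts(n_features=10)
print(best_first_feature)
```

The returned set is the subset at the most promising depth-1 node; in practice I keep descending by most-visited child to read off a full subset.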

I have a few questions:

  1. Is there a clever way to speed up the convergence of MCTS, besides parallelizing the rollout phase?

  2. Adding nodes to the tree takes a lot of time and memory for datasets with many features. For 10,000 features, it exhausts my 8 GB of RAM by the second iteration (although it runs for 2000+ iterations on a dataset with 40 features, which doesn't make sense to me). Is this expected, or is my implementation likely wrong? Are there any workarounds?

  3. What are your opinions on using MCTS for this task? Can you think of a better approach? The main problem with this approach is using the SVM as the evaluation function, which may make the algorithm impractically slow for large datasets (a large number of training examples).
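For context on question 2, here is a back-of-envelope count of what eager expansion costs with 10,000 features, assuming (and this is only my guess at where my memory goes) that each expanded node creates one child per unused feature, and each child keeps its own list of remaining candidates:

```python
# Hypothetical eager-expansion cost with 10,000 features.
n = 10_000
children_of_root = n            # one child per feature
untried_entries = n * (n - 1)   # candidate lists stored inside those children
print(children_of_root)         # 10000
print(untried_entries)          # 99990000
# ~10^8 list slots (8-byte pointers each) is already ~0.8 GB just for the
# candidate lists at depth 1. If the second iteration eagerly expands the
# next level too (~10^8 nodes), that alone would blow well past 8 GB.
```

If that guess is right, it would explain why 40 features (40 * 39 = 1,560 entries) runs fine for thousands of iterations while 10,000 features dies immediately.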

I was thinking of trying to come up with a heuristic function to evaluate subsets instead of running the SVM, but I'm kind of lost and don't know how to do that. Any help would be really appreciated.
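To make concrete what I mean by "heuristic function": something cheap with the same interface as the SVM evaluation, i.e. subset in, score out. For example, a CFS-style merit (high feature-label correlation, low feature-feature redundancy) can be computed without training anything. This is just an illustration of the kind of thing I'm imagining, not something from the paper:

```python
import math

def pearson(x, y):
    """Plain Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)

def cfs_score(X_cols, y, subset):
    """CFS-style merit of a feature subset: k*rcf / sqrt(k + k(k-1)*rff),
    where rcf is mean |feature-label| correlation and rff is mean
    |feature-feature| correlation. X_cols[i] is the i-th feature column."""
    k = len(subset)
    if k == 0:
        return 0.0
    subset = list(subset)
    rcf = sum(abs(pearson(X_cols[f], y)) for f in subset) / k
    if k == 1:
        return rcf
    pairs = [(a, b) for i, a in enumerate(subset) for b in subset[i + 1:]]
    rff = sum(abs(pearson(X_cols[a], X_cols[b])) for a, b in pairs) / len(pairs)
    return k * rcf / math.sqrt(k + k * (k - 1) * rff)

# Tiny demo: feature 0 copies the label, feature 1 is uninformative noise.
y = [0, 0, 1, 1]
X_cols = [
    [0, 0, 1, 1],   # perfectly informative
    [1, 0, 1, 0],   # uncorrelated with y
]
print(round(cfs_score(X_cols, y, {0}), 3))  # 1.0
print(round(cfs_score(X_cols, y, {1}), 3))  # 0.0
```

My worry is whether dropping a filter score like this into the rollout would still guide the search toward subsets the SVM actually likes.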

