I'd like to develop an MCTS-like (Monte Carlo Tree Search) algorithm for program induction, i.e. learning programs from examples.
My initial plan is for nodes to represent programs and for the search to expand nodes by revising programs. Each edge represents a possible revision.
Many expansions involve a single program: randomly resample a subtree of the program, replace a constant with a variable, etc. It's straightforward to use these with MCTS.
Some expansions, however, generate a program from scratch (e.g. sample a new program). Others use two or more programs to generate a single output program (e.g. crossover in Genetic Programming).
These latter types of moves seem nonstandard for vanilla MCTS, but I'm not deeply familiar with the literature or what technical terms might be most relevant.
How can I model expansions involving $N \ne 1$ nodes? Are there accepted methods for handling these situations?