
I want to understand automatic Neural Architecture Search (NAS). I have already read multiple papers, but I cannot figure out what the actual search space of NAS is, or how classical hyper-parameters are considered in NAS.

My understanding:

NAS aims to find a well-performing model in the search space of all possible model architectures, using a certain search strategy and performance estimation strategy. There are architecture-specific hyper-parameters (in the simplest feed-forward network case) such as the number of hidden layers, the number of hidden neurons per layer, and the type of activation function per neuron. There are also classical hyper-parameters such as the learning rate, dropout rate, etc. What I don't understand is:

What exactly is part of the model architecture as defined above? Is it only the architecture-specific hyper-parameters, or also the classical hyper-parameters? In other words, what spans the search space in NAS: only the architecture-specific hyper-parameters, or also the classical hyper-parameters?

If only the architecture-specific hyper-parameters are part of the NAS search space, what happens to the classical hyper-parameters? A given architecture (with a fixed configuration of the architecture-specific hyper-parameters) might perform better or worse depending on the classical hyper-parameters, so wouldn't ignoring the classical hyper-parameters in the NAS search space lead to a suboptimal final architecture?
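To make the distinction concrete, here is a minimal sketch of the two options I have in mind: an architecture-only search space versus a joint search space that also includes the classical hyper-parameters, explored with naive random search. This is my own illustration, not taken from any particular NAS paper or library, and all names (`ARCH_SPACE`, `JOINT_SPACE`, `sample_config`, etc.) are hypothetical.

```python
import random

# Option A: search space spanned only by architecture-specific hyper-parameters
ARCH_SPACE = {
    "num_hidden_layers": [1, 2, 3, 4],
    "neurons_per_layer": [32, 64, 128, 256],
    "activation": ["relu", "tanh", "sigmoid"],
}

# Option B: joint search space that also includes classical hyper-parameters
JOINT_SPACE = {
    **ARCH_SPACE,
    "learning_rate": [1e-4, 1e-3, 1e-2],
    "dropout_rate": [0.0, 0.2, 0.5],
}


def sample_config(space):
    """Draw one candidate configuration uniformly at random (naive search strategy)."""
    return {name: random.choice(choices) for name, choices in space.items()}


def evaluate(config):
    """Placeholder performance estimation strategy.

    In a real NAS setup this would train (or cheaply approximate) the model
    described by `config` and return its validation accuracy; here it returns
    a dummy score just to keep the sketch runnable.
    """
    return random.random()


def naive_search(space, budget=10):
    """Return the best configuration found within the evaluation budget."""
    best_config, best_score = None, float("-inf")
    for _ in range(budget):
        config = sample_config(space)
        score = evaluate(config)
        if score > best_score:
            best_config, best_score = config, score
    return best_config, best_score


if __name__ == "__main__":
    print(naive_search(ARCH_SPACE))   # searching only over the architecture
    print(naive_search(JOINT_SPACE))  # searching jointly over architecture + training hyper-parameters
```

If NAS only searches `ARCH_SPACE`, then the learning rate, dropout rate, etc. must be either fixed or tuned separately for each candidate, which is exactly the situation my second question is about.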
