
In general, can ANNs have continuous inputs and outputs, or do they have to be discrete?

So, basically, I would like to have a mapping of continuous inputs to continuous outputs. Is this possible? Does this depend on the type of ANN?

More specifically, I would like to use neural networks for learning the Q-function in Reinforcement Learning. My problem has basically a continuous state and action space, and I would like to know whether I have to discretize it or not (a more detailed description of my problem can be found here).

PeterBe

1 Answer


Neural networks normally work in continuous spaces. A typical neural network function could be written as $f(\mathbf{x}, \mathbf{\theta}): \mathbb{R}^N \rightarrow \mathbb{R}^M$. That is, it is a function of some $N$-dimensional input vector of real numbers $\mathbf{x}$ that outputs an $M$-dimensional vector of real numbers (which you could call $\mathbf{y}$), and it is parametrised by a vector of weights and biases $\mathbf{\theta}$.
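For concreteness, here is a minimal sketch of such a function (using PyTorch purely as an example framework; the dimensions $N=3$ and $M=2$ are made up for illustration):

```python
# A neural network as a continuous function f(x, theta): R^3 -> R^2
import torch
import torch.nn as nn

f = nn.Sequential(
    nn.Linear(3, 16),   # 3 real-valued inputs
    nn.ReLU(),
    nn.Linear(16, 2),   # 2 real-valued outputs, linear ("no") activation
)

x = torch.tensor([[0.5, -1.2, 3.7]])  # a continuous input vector in R^3
y = f(x)                              # a continuous output vector in R^2
print(y)
```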

When you train a neural network with some data $\mathbf{X}, \mathbf{Y}$, you are trying to adjust $\mathbf{\theta}$ so that $f(\mathbf{x}, \mathbf{\theta})$ approximates some assumed mapping function $g(\mathbf{x}): \mathcal{X} \rightarrow \mathcal{Y}$ that provides a "true" mapping between all possible inputs $\mathcal{X}$ and the matching correct values from all possible outputs $\mathcal{Y}$. For a neural network, $\mathcal{X}$ and $\mathcal{Y}$ must be sets of vectors of real numbers.
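To make that training step concrete, here is a small sketch of adjusting $\mathbf{\theta}$ so that $f$ approximates an assumed $g$ (again PyTorch; the choice of $g$, the data sizes and the hyperparameters are all made-up examples, not recommendations):

```python
# Fit f(x, theta) to an assumed "true" mapping g by gradient descent
import torch
import torch.nn as nn
import torch.nn.functional as F

g = lambda x: torch.sin(x).sum(dim=1, keepdim=True)  # stand-in "true" mapping g: R^3 -> R^1

X = torch.randn(256, 3)   # sample inputs from the input space
Y = g(X)                  # matching correct outputs

f = nn.Sequential(nn.Linear(3, 32), nn.Tanh(), nn.Linear(32, 1))
opt = torch.optim.Adam(f.parameters(), lr=1e-2)

for step in range(500):           # repeatedly adjust theta ...
    loss = F.mse_loss(f(X), Y)    # ... so that f(X) gets closer to g(X)
    opt.zero_grad()
    loss.backward()
    opt.step()
```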

So, yes, all neural networks map between continuous values by default*. When NNs are used to work with discrete values, those discrete variables need to be converted to real-valued representations first. Discrete inputs might be mapped to a small set of distinct real values, or to a vector of $\{0, 1\}$ values (a "one-hot" encoding). Discrete outputs are often mapped to a probability vector giving the confidence that the learned function assigns to each possible discrete output; this is the usual case in classifiers.
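As a short, non-authoritative illustration of those two conversions (one-hot inputs and probability-vector outputs) in PyTorch:

```python
# Handling discrete variables with a real-valued network
import torch
import torch.nn as nn
import torch.nn.functional as F

# Discrete input: category index 2 out of 5 possible values -> one-hot real vector
category = torch.tensor([2])
one_hot = F.one_hot(category, num_classes=5).float()  # [[0., 0., 1., 0., 0.]]

# Discrete output: a classifier head producing a probability vector over 4 classes
logits = nn.Linear(5, 4)(one_hot)
probs = F.softmax(logits, dim=-1)  # sums to 1; one confidence value per class
print(one_hot, probs)
```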

You have a specific use case in mind:

More specifically, I would like to use neural networks for learning the Q-function in Reinforcement Learning. My problem has basically a continuous state and action space, and I would like to know whether I have to discretize it or not

For an action value, or Q, function in reinforcement learning, you do not need to be concerned that the state space or action space is continuous. However, there are some details that may be important:

  • Neural networks learn most efficiently when input elements are within a unit-like distribution such as $\mathcal{N}(0,1)$ (normal distribution with mean $0$, standard deviation $1$). This does not have to be precise, but you should take care to scale the input values so that each element has a typical magnitude around $1$.

  • The output values you want to learn need to be within the range of the output layer's activation function. It is common to use a linear (or "no") activation function in the output layer for regression problems for this reason, and learning the action value Q associated with a state-action pair is a regression problem. So unless you have good reason to do otherwise, use a linear activation function on the output layer.

  • Control problems in continuous action spaces cannot be solved using only a learned action value function. That is because, in order to derive a greedy policy from a Q function, you need to solve $\pi(s) = \text{argmax}_aQ(s,a)$ and finding the maximum value in a continuous action space is not practical. If your Q network is part of an agent solving a control problem - i.e. finding the optimal policy - then you will need some other way to generate and improve the policy. This is usually done using a policy gradient method.

Due to this last point, it is sometimes advisable to discretise the action space if you want to use a simpler value-based RL method, such as DQN.
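To tie the last few points together, here is a minimal sketch of a DQN-style Q-network with a rescaled continuous state, a discretised action space and a linear output layer (PyTorch again; the state size and scaling constants echo the comments below, while the number of action bins and layer sizes are arbitrary illustrative choices):

```python
# DQN-style Q-network: continuous state in, one Q value per discrete action out
import torch
import torch.nn as nn

N_STATE = 10     # e.g. 10 continuous state variables (as in the comments below)
N_ACTIONS = 9    # e.g. a continuous action discretised into 9 bins (arbitrary)

q_net = nn.Sequential(
    nn.Linear(N_STATE, 64),
    nn.ReLU(),
    nn.Linear(64, 64),
    nn.ReLU(),
    nn.Linear(64, N_ACTIONS),  # linear output layer: Q(s, a) for each discrete a
)

# Scale raw state values (assumed here to lie in [0, 10000]) to roughly [-1, 1]
raw_state = torch.rand(1, N_STATE) * 10_000.0
state = (raw_state - 5_000.0) / 5_000.0

q_values = q_net(state)
greedy_action = q_values.argmax(dim=-1)  # argmax is trivial over a discrete action set
print(q_values, greedy_action)
```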


* Caveats:

  • There are a few architectures that really do use discrete inputs, weights and/or outputs. These are either historical variants or specialist, though; you will not find any mainstream library that does this by default.

  • Computers represent real numbers as floating-point values, which are technically discrete due to practical limitations of CPUs. Sometimes it is important to know this, but it will not impact modelling a Q function.

Neil Slater
  • Thanks Neil for your great answer (I had already upvoted and accepted it). One follow-up question to your comment "Due to this last point, it is sometimes advisable to discretise the action space if you want to use a simpler value-based RL method, such as DQN." --> Basically I do want to use DQN (maybe also SARSA afterwards) and I would like to know whether it is also advisable to discretize the state space and not only the action space. My state space is quite large. I have about 10 variables that are all continuous from min 0 to max 10,000. – PeterBe Aug 23 '21 at 14:05
  • 1
    @PeterBe You do not need to discretise the state space for DQN. Do remember to scale the state values for use with a neural network (e.g. scale each value from -1.5 to 1.5 is reasonable) – Neil Slater Aug 23 '21 at 14:09
  • Thanks Neil for your answer. I appreciate it. By scaling you mean to normalize or standardize each variable, right? – PeterBe Aug 23 '21 at 14:27
  • 1
    @PeterBe Yes, you should scale to fit each vraiable into a standard range – Neil Slater Aug 23 '21 at 14:50
  • Hi Neil, thanks for your answers. I appreciate it. Basically I don't really understand why I should not discretize the state space but the action space. The ANN just maps inputs to outputs. When used in an RL environment, the ANN maps state/action pairs to a reward as far as I understood. Surely, if you have fewer state/action pairs (which you will have if you discretize the state space), the mapping of the ANN is not as challenging, which will most probably lead to better results for the Q-learning algorithm – PeterBe Aug 23 '21 at 16:36
  • 1
    @PeterBe: My answer spells that out. The discretisation is required to support max and argmax operations over the action space. You don't need to do that over the state space. Nothing stops you if you want to though. – Neil Slater Aug 23 '21 at 16:38