
A model can be roughly defined as any design that is able to solve an ML task. Examples of models are the neural network, decision tree, Markov network, etc.

A function can be defined as a set of ordered pairs in which each element of the domain maps to exactly one element of the co-domain/range (a many-to-one or one-to-one mapping, never one-to-many).

What is the fundamental difference between them in formal terms?

hanugm

4 Answers


A model as a set of functions

In some cases in machine learning, a model can be thought of as a set of functions, so here's the first difference.

For example, a neural network with an arbitrary vector of parameters $\theta \in \mathbb{R}^m$ is often referred to as a model; a specific combination of these parameters then represents a specific function. More specifically, suppose that we have a neural network with 2 inputs, 1 hidden neuron (with a ReLU activation function, denoted as $\phi$, that follows a linear combination of the inputs), and 1 output neuron (with a sigmoid activation function, $\sigma$). The inputs are connected to the only hidden unit and each of these connections has a real-valued weight. If we ignore biases, then there are 3 parameters, which can be grouped in the parameter vector $\theta = [\theta_1, \theta_2, \theta_3] \in \mathbb{R}^3$. The set of functions that this neural network represents is defined as follows

$$ f(x_1, x_2) = \sigma (\theta_3 \phi(x_1 \theta_1 + x_2 \theta_2)) \tag{1}\label{1}, $$

In this case, equation \ref{1} represents the model, given the parameter space $\Theta = \mathbb{R}^3$. For any specific values that $\theta_1, \theta_2,$ and $\theta_3$ can take, we have a specific (deterministic) function $f: \mathbb{R}^2 \rightarrow [0, 1]$.

For instance, $\theta = [0.2, 10, 0.4]$ represents some specific function, namely

$$ f(x_1, x_2) = \sigma (0.4 \phi(0.2 x_1 + 10.0 x_2)) \tag{2}\label{2} $$ You can plot this function (with Matplotlib) for some values of the inputs to see how it looks. Note that $x_1$ and $x_2$ can be arbitrary (because those are just the inputs, which I assumed to be real numbers).
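To make this concrete, here is a minimal sketch (not part of the original answer) of the specific function in equation \ref{2}, written out directly with the assumed ReLU ($\phi$) and sigmoid ($\sigma$) activations:

```python
import math

def relu(z):
    # ReLU activation, phi in the notation above
    return max(0.0, z)

def sigmoid(z):
    # logistic sigmoid, sigma in the notation above
    return 1.0 / (1.0 + math.exp(-z))

def f(x1, x2, theta=(0.2, 10.0, 0.4)):
    # The specific function of equation (2):
    # sigma(theta3 * phi(theta1 * x1 + theta2 * x2))
    t1, t2, t3 = theta
    return sigmoid(t3 * relu(t1 * x1 + t2 * x2))

print(f(1.0, 0.0))   # sigma(0.4 * relu(0.2)) ≈ 0.52
print(f(-1.0, 0.0))  # ReLU clips the negative pre-activation, so sigma(0) = 0.5
```

Passing a different `theta` to `f` picks out a different member of the set of functions that the model defines.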

This interpretation of a model is roughly equivalent to the definition of a hypothesis class (or space) in computational learning theory, which is essentially a set of functions. So, this definition of a model is useful to understand the universal approximation theorems for neural networks, which state that you can find a specific set of parameters such that you can approximately compute some given function arbitrarily well, given that some conditions are met.

This interpretation can also be applied to decision trees, HMMs, RNNs, and other ML models.

A model in reinforcement learning

The term model is also sometimes used to refer to a probability distribution, for example, in the context of reinforcement learning, where $p(s', r \mid s, a)$ is a probability distribution over the next state $s'$ and reward $r$ given the current state $s$ and action $a$ taken in that state $s$. Check this question for more details. A probability distribution could also be thought of as a (possibly infinitely large) set of functions, but it is not just a set of functions, because you can also sample from a probability distribution (i.e. there's some stochasticity associated with a probability distribution). So, a probability distribution can be considered a statistical model or can be used to represent it. Check this answer.
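As an illustration of this point (a toy sketch with invented states, actions, and probabilities, not a real environment), a tabular version of $p(s', r \mid s, a)$ can be stored explicitly, and the stochasticity shows up in the fact that you can sample from it:

```python
import random

# A toy tabular model p(s', r | s, a): each (state, action) pair maps to a
# list of ((next_state, reward), probability) outcomes. All numbers invented.
model = {
    ("s0", "left"):  [(("s0", 0.0), 0.9), (("s1", 1.0), 0.1)],
    ("s0", "right"): [(("s1", 1.0), 0.8), (("s0", 0.0), 0.2)],
}

def sample_transition(s, a, rng=random):
    # Sampling is what distinguishes a distribution from a plain set of
    # functions: repeated calls with the same (s, a) can give different results.
    outcomes, probs = zip(*model[(s, a)])
    return rng.choices(outcomes, weights=probs, k=1)[0]

next_state, reward = sample_transition("s0", "right")
print(next_state, reward)
```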

A function as a model

A specific function (e.g. the function in \ref{2}) can also be a model, in the sense that it models (or approximates) another function. In other words, a person may use the term model to refer to a function that attempts to approximate another function that you want to model/approximate/compute.

nbro

Any model can be considered to be a function. The term "model" simply denotes a function being used in a particular way, namely to approximate some other function of interest.


Every model is a function. Not every function is a model.

A function uniquely maps elements of some set to elements of another set, possibly the same set.

Every AI model is a function because it is implemented as a computer program, and every computer program is a function that uniquely maps the combination of the sequence of bits in memory and storage at program start-up, plus inputs, to the sequence of bits in memory and storage, plus outputs, at program termination.

However, a 'model' is very specifically a representation of something. Take the logistic curve:

$$ f(x) = \frac{L}{1 + e^{-k(x-x_{0})} } $$

Given arbitrary real values for $L$, $k$, and $x_{0}$, that's a function. However, given much more specific values learned from data, it can be a model of population growth.
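To sketch the distinction in code (the parameter values below are invented for illustration, not actually fitted to census data; the exponent is written with the conventional $-k$ so the curve increases):

```python
import math

def logistic(x, L, k, x0):
    # The logistic curve f(x) = L / (1 + exp(-k * (x - x0)))
    return L / (1.0 + math.exp(-k * (x - x0)))

# With arbitrary L, k, x0 this is just a function. With values (hypothetically)
# learned from data, the same expression becomes a model of population growth:
L, k, x0 = 10_000.0, 0.05, 1950.0  # carrying capacity, growth rate, midpoint

print(logistic(x0, L, k, x0))  # at the midpoint x0 the curve reaches L / 2
```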

Similarly, a neural network with weights initialized to all zeros is a function, but a very uninteresting function with the rather limited codomain $\{0\}$. However, if you then train the network by feeding it a bunch of data until the weights converge to give predictions or actions roughly corresponding to some real world generating process, now you have a model of that generating process.

Adam Acosta
  • This is an interesting perspective on what a model and a function are, which is slightly different from what my answer says. Essentially, you're saying that a model is a function that describes (or indeed "models") some phenomenon (physical or otherwise), and that every function that doesn't currently describe anything, or is apparently useless, is not a "model". In other words, according to you, functions that are not "interesting" (i.e. they don't explain anything that we can relate to in the real world, or maybe in a virtual world, such as a game) are not models. – nbro Jan 21 '21 at 14:25
  • That's a bit of an anthropomorphic definition of a model, i.e. it is relative to the human. I don't know if that's a good idea, but I understand your points. In any case, you should note that, if you had a neural network initialized to all zeros, it would probably not be able to learn anything. Read [this](https://ai.stackexchange.com/q/4320/2444). – nbro Jan 21 '21 at 14:28

In simple terms, a neural network model is a function approximator that tries to fit the curve of the hypothesis function. A function itself has an equation which will generate a fixed curve in the dimensions of its input space:

[figure: plot of a fixed curve generated by a function's equation]

If we have the equation (i.e., the function), we do not need a neural network to map its input data to outputs. However, when we only have some notion of its curve (or only the input and output data), we seek a function approximator, so that, for new, unseen input data, we can generate the output.

Training this neural network is all about getting as close an approximation to the original (unknown) function as possible.
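A minimal sketch of that idea (using a plain line $wx + b$ instead of a full neural network, with invented data, so the mechanics stay visible): we only see input/output samples of some "unknown" function and adjust the approximator's parameters by gradient descent on the squared error.

```python
# Samples from the "unknown" function f(x) = 2x + 1 (invented for illustration)
data = [(x, 2.0 * x + 1.0) for x in range(-5, 6)]

w, b, lr = 0.0, 0.0, 0.01
for _ in range(2000):
    # Mean gradients of the squared error (w*x + b - y)^2 over the samples
    gw = sum(2 * (w * x + b - y) * x for x, y in data) / len(data)
    gb = sum(2 * (w * x + b - y) for x, y in data) / len(data)
    w, b = w - lr * gw, b - lr * gb

print(round(w, 3), round(b, 3))  # should approach the true 2.0 and 1.0
```

Training a neural network follows the same loop, just with many more parameters and a nonlinear function.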

anurag
  • "A function itself has an equation which will generate a fixed curve in the dimensions of its input space." I don't see any interpretation of that statement that is correct. A function can be considered to define a subset of the Cartesian product of its input and output space, but not of just its input space. – Acccumulation Dec 30 '20 at 23:51