28

Given a neural network $f$ that takes $n$ data points $x_1, \dots, x_n$ as input, we say $f$ is permutation invariant if

$$f(x_1, \dots, x_n) = f(\sigma(x_1, \dots, x_n))$$

for any permutation $\sigma$.
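
For instance, summing the inputs is permutation invariant, while a weighted combination with per-input weights generally is not:

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0])
w = np.array([0.5, -1.0, 2.0])

print(np.sum(x) == np.sum(x[::-1]))        # True: the sum ignores the input order
print(np.dot(w, x) == np.dot(w, x[::-1]))  # False: per-input weights make the result order-dependent
```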

How could we build such a neural network? The output can be anything (e.g. a number or vector), as long as it does not depend on the order of inputs. Is there any research work on these types of neural networks?

nbro
Josef Ondrej
  • Interesting question. You want to treat your inputs as a set rather than a vector. Assuming your inputs are scalar or otherwise comparable, did you consider sorting your inputs to create a permutation-invariant canonical representation and feeding that to a regular network? – mjul Dec 04 '17 at 15:15
  • @mjul My inputs are series of bets of clients (each bet is represented by a vector of some categorical and continuous variables). I could order them chronologically, for example, but as the time spacing between bets for each client is very different, this wouldn't make much sense. This could probably be solved using some function (fixed or learned) of time that would decay the coefficients for each bet. But I think the ordering really doesn't matter in this case, so I wanted to try unordered data first, which obviously requires treating the bets for each client symmetrically. – Josef Ondrej Dec 05 '17 at 08:33

4 Answers

14

Here are a few that might be what you are looking for:

elgehelge
13

Traditionally, because of the way a network is structured, each input has its own set of weights connecting it to the next layer's units. If two inputs are swapped, the output changes as well.

Approach 1

However, you can build a network that approximates this behaviour: use batch learning and, for each training sample, feed all possible permutations of it to the network so that it learns to be permutation invariant. The network will never be exactly invariant this way, only approximately so.
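
A minimal sketch of this permutation-augmentation idea, assuming a PyTorch model; `model`, `loss_fn` and `optimizer` are placeholders, not anything defined in this answer:

```python
import itertools
import torch

def permutation_augmented_step(model, loss_fn, optimizer, x, y):
    """Train on every ordering of one sample's n input points.

    x: tensor of shape (n, d) holding the n input points.
    y: the target, which is the same regardless of the ordering of x.
    """
    for perm in itertools.permutations(range(x.shape[0])):
        x_perm = x[list(perm)]             # reorder the n inputs
        loss = loss_fn(model(x_perm), y)   # same target for every ordering
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
```

Note that the number of permutations grows as $n!$, so in practice you would sample a few random permutations per batch rather than enumerating all of them.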

Approach 2

Another way to do this is to replicate (share) the weights across all inputs. For example, let's assume you have 3 inputs ($i_0, i_1, i_2$), the first hidden layer has 2 nodes ($h_{10}, h_{11}$), and the activation function is $\phi$. In a fully connected layer this would mean 6 independent weights, but with sharing there are only 2, $w_0$ and $w_1$. The hidden layer's nodes $h_{10}$ and $h_{11}$ are given, respectively, by

  • $h_{10} = \phi(i_0 w_0 + i_1 w_0 + i_2 w_0)$

  • $h_{11} = \phi(i_0 w_1 + i_1 w_1 + i_2 w_1)$

This gives you a hidden layer whose values are invariant to permutations of the input. From there, you can build and train the rest of the network as you see fit. This approach is derived from the weight sharing used in convolutional layers.
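
A minimal sketch of such a shared-weight first layer, assuming PyTorch and scalar inputs as in the example above (the class name is just illustrative):

```python
import torch
import torch.nn as nn

class SharedWeightLayer(nn.Module):
    """Each hidden node k computes phi(w_k * (i_0 + i_1 + ... )),
    so reordering the inputs cannot change the result."""
    def __init__(self, n_hidden):
        super().__init__()
        self.w = nn.Parameter(torch.randn(n_hidden))  # one shared weight per hidden node

    def forward(self, x):
        # x: tensor of shape (n_inputs,) with scalar inputs in any order
        s = x.sum()                    # i_0 + i_1 + i_2 + ...
        return torch.tanh(self.w * s)  # phi applied per hidden node, here phi = tanh
```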

Approach 3

Finally, you can use dimensionality reduction to achieve this. I've published this in Exploring Communication Protocols and Centralized Critics in Multi-Agent Deep Learning, in Integrated Computer-Aided Engineering. That approach uses convolutional architectures to achieve permutation invariance efficiently.

nbro
BlueMoon93
  • The first approach suggested would be infeasible in my case due to the computational complexity. The second, on the other hand, seems maybe too restrictive. But it is certainly a good start. What I have come up with so far is an approach similar to the one I found in this paper: https://arxiv.org/pdf/1612.04530.pdf. First, I consider all pairs (generally all k-tuples) of inputs $x_i, x_j$ and apply some neural network to all of them (the same NN on each pair). This gives me $n^2$ outputs $f(x_i, x_j)$, which I then average (or take the maximum of) and apply another NN to the result. – Josef Ondrej Dec 11 '17 at 20:16
  • This is what I have come up with so far: https://github.com/josefondrej/Symmetric-Layers – Josef Ondrej Jan 31 '18 at 14:02
5

I have implemented a Permutational Layer using Keras here: https://github.com/off99555/superkeras/blob/master/permutational_layer.py

You can call the PermutationalModule function to use it.

It is implemented following this paper: Permutation-equivariant neural networks applied to dynamics prediction.

The idea is to form all $N^2$ ordered pairs from the $N$ inputs, run each pair through a model with shared weights, and then apply a pooling function over the $N$ pair outputs associated with each input, giving $N$ permutation-equivariant outputs.

You can pool over those outputs once more to get a single permutation-invariant output, although the paper doesn't mention a second pooling step.
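
As a rough sketch of that pairwise idea (a generic PyTorch version, not the linked Keras implementation; `pair_net` is a hypothetical pairwise network):

```python
import torch
import torch.nn as nn

class PermutationalLayer(nn.Module):
    def __init__(self, d_in, d_out):
        super().__init__()
        # the same small network is applied to every pair (shared weights)
        self.pair_net = nn.Sequential(nn.Linear(2 * d_in, d_out), nn.ReLU())

    def forward(self, x):
        # x: (N, d_in), a set of N inputs
        n = x.shape[0]
        xi = x.unsqueeze(1).expand(n, n, -1)  # entry [i, j] holds x_i
        xj = x.unsqueeze(0).expand(n, n, -1)  # entry [i, j] holds x_j
        pairs = torch.cat([xi, xj], dim=-1)   # all N^2 ordered pairs (x_i, x_j)
        out = self.pair_net(pairs)            # (N, N, d_out), shared weights across pairs
        return out.mean(dim=1)                # pool over j: N permutation-equivariant outputs
```

Pooling the returned $N$ outputs once more (e.g. taking their mean or max) gives a single permutation-invariant vector, which corresponds to the optional second pooling mentioned above.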

nbro
off99555
  • Interesting. Without looking into details, I wonder why the computation is $N^2$ instead of $N!$. There are $N!$ possible permutations in the symmetric group. – Yan King Yin May 17 '20 at 00:24
  • Because this is not a complete permutation; it's very different from what you think, so you'd better look at the details in the paper. – off99555 May 17 '20 at 03:21
2

A practical application of this, with a lot of research behind it, is in the deep lidar-processing community. In that world, you have to do a lot of computation on point clouds, which are completely unordered. One of the seminal works in this field is PointNet (https://arxiv.org/pdf/1612.00593.pdf), which solves this problem by applying the same operations to every point (shared 1D convolutions) and aggregating with symmetric operations such as max pooling. Any network in this style that does not project to a 2D representation has to adhere to this rule.
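
A minimal sketch of that shared-MLP-plus-max-pool idea, assuming PyTorch (an illustration of the principle, not the actual PointNet architecture):

```python
import torch
import torch.nn as nn

class TinyPointNet(nn.Module):
    def __init__(self, d_in=3, d_feat=64, n_classes=10):
        super().__init__()
        # applied to every point independently, with the same weights
        self.point_mlp = nn.Sequential(
            nn.Linear(d_in, d_feat), nn.ReLU(),
            nn.Linear(d_feat, d_feat), nn.ReLU(),
        )
        self.head = nn.Linear(d_feat, n_classes)

    def forward(self, points):
        # points: (N, d_in), an unordered point cloud
        feats = self.point_mlp(points)          # (N, d_feat), per-point features
        global_feat = feats.max(dim=0).values   # symmetric max pool over the N points
        return self.head(global_feat)           # invariant to the ordering of the points
```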

juicedatom