Unable to 'learn' a rotational angle by parametrising the angle as a neural network layer

Question

I'm trying to implement a neural network layer that captures the drift in a measured angle, as a form of dynamic calibration. That is, I have a reference frame that may change over the course of data gathering, and I would like to train a layer that converts the drifting reference back to the desired reference by updating an angle parameter.

For example, consider the 2D case. We have a set of $N$ 2D points stacked as the rows of $X\in \mathbb{R}^{N\times 2}$ and a trainable parameter $\theta$ in the layer. The output of the layer is then $$X_o = XR^\top$$ (the transpose matches the matmul in the code below) where $$R = \begin{bmatrix} \cos(\theta) & -\sin(\theta) \\ \sin(\theta) & \cos(\theta) \end{bmatrix}$$

Using the Adam optimizer, I then try to find the $\theta$ that transforms a given angle to the desired reference.

However, the $\theta$ value seems to fluctuate around its initial value, probably because of a diverging gradient(?). How can I overcome this issue?

The code is below.

```python
import tensorflow as tf
import numpy as np
import matplotlib.pyplot as plt


class Rotation2D(tf.keras.layers.Layer):
  def __init__(self):
    super(Rotation2D, self).__init__()

  def build(self, input_shape):
    # Single trainable scalar holding the rotation angle
    self.kernel = self.add_weight(name="kernel", initializer=tf.keras.initializers.Constant(90),
                                  shape=[1, 1])

  def call(self, inputs):
    matrix = ([[tf.cos(self.kernel[0, 0]), -tf.sin(self.kernel[0, 0])],
              [tf.sin(self.kernel[0, 0]), tf.cos(self.kernel[0, 0])]])
    return tf.matmul(inputs, tf.transpose(matrix))

layer = Rotation2D()

t = np.arange(0, 1000)/200.

y_in = np.array([np.sin(t), np.cos(t)]).T
y_ta = np.array([np.cos(t), np.sin(t)]).T

model = tf.keras.Sequential()
model.add(layer)

model.compile(tf.keras.optimizers.SGD(learning_rate=1.), loss='MSE')
model.fit(y_in, y_ta, epochs=1)
for i in range(100):
  print(layer.get_weights())
  model.fit(y_in, y_ta, verbose=0, batch_size=5)
y_out = (model.predict(y_in))

fig, axes = plt.subplots(2, 1)

for i in range(2):
  ax = axes[i]

  ax.plot(y_in.T[i], label = 'input')
  ax.plot(y_ta.T[i], label = 'target')
  ax.plot(y_out.T[i], label = 'prediction')

plt.legend()

plt.show()
```

score 0 · Accepted Answer · answered Apr 20 '21 at 14:46

There are two basic problems with your code:

The functions sin and cos (in both NumPy and TensorFlow) take their argument in radians. Seeing Constant(90) in your code, together with a learning rate of 1, I'm guessing you assumed the angle is in degrees; it isn't.

In your training data, y_ta is not a rotation of y_in:

```python
y_in = np.array([np.sin(t), np.cos(t)]).T
y_ta = np.array([np.cos(t), np.sin(t)]).T
```

It swaps the two coordinates, which is a reflection about the y = x diagonal. No rotation angle can reproduce that, so it's no wonder training fails to find one.
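One way to check the reflection claim numerically is to fit an unconstrained 2x2 matrix and look at its determinant. This is only a sketch: the name M and the least-squares fit are illustrative choices, not part of the original code. A rotation has determinant +1, a reflection has determinant -1.

```python
import numpy as np

t = np.arange(0, 1000) / 200.0
y_in = np.array([np.sin(t), np.cos(t)]).T
y_ta = np.array([np.cos(t), np.sin(t)]).T  # target from the question

# Least-squares fit of a 2x2 matrix M such that y_in @ M ~= y_ta.
# M is a hypothetical helper for this check only.
M, *_ = np.linalg.lstsq(y_in, y_ta, rcond=None)

det = np.linalg.det(M)
print(np.round(M, 6))   # the fitted linear map
print(round(det, 6))    # +1 would indicate a rotation, -1 a reflection
```

The fitted map comes out as the coordinate swap, whose determinant is negative, so no value of the angle parameter can match it.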

I just had to change y_ta to:

```python
y_ta = np.array([-np.cos(t), np.sin(t)]).T
```
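As a quick sanity check, assuming the layer computes `inputs @ R.T` as in the question's code, the corrected target should be exactly the input rotated by a quarter turn. The names y_ta_fixed and matches below are illustrative only.

```python
import numpy as np

t = np.arange(0, 1000) / 200.0
y_in = np.array([np.sin(t), np.cos(t)]).T
y_ta_fixed = np.array([-np.cos(t), np.sin(t)]).T  # corrected target

theta = np.pi / 2  # 90 degrees, expressed in radians
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

# Same convention as the layer: row vectors multiplied by R transposed
matches = np.allclose(y_in @ R.T, y_ta_fixed)
print(matches)
```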

And train with a more sensible learning rate (newer Keras versions spell the argument learning_rate rather than lr):

```python
model.compile(tf.keras.optimizers.SGD(learning_rate=0.1), loss='MSE')
model.fit(y_in, y_ta, epochs=10)
```
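For intuition about why a step of 0.1 (in radians) behaves well, here is a plain-NumPy sketch of the same optimisation. It is not the Keras training loop: it uses full-batch gradient descent with a hand-written gradient, and the starting value 0.0, the step count, and the helper names are arbitrary choices made for illustration.

```python
import numpy as np

t = np.arange(0, 1000) / 200.0
y_in = np.array([np.sin(t), np.cos(t)]).T
y_ta = np.array([-np.cos(t), np.sin(t)]).T  # corrected target

def rotation(theta):
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s], [s, c]])

def rotation_grad(theta):
    # Element-wise derivative of rotation(theta) with respect to theta
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[-s, -c], [c, -s]])

def mse_and_grad(theta):
    resid = y_in @ rotation(theta).T - y_ta
    loss = np.mean(resid ** 2)
    grad = 2.0 * np.mean(resid * (y_in @ rotation_grad(theta).T))
    return loss, grad

theta = 0.0          # radians; illustrative starting point
learning_rate = 0.1
for _ in range(500):
    loss, grad = mse_and_grad(theta)
    theta -= learning_rate * grad

print(np.degrees(theta) % 360, loss)
```

With the angle in radians, each update moves theta by at most about 0.1 rad for this data, so the iterate settles at the quarter-turn solution instead of jumping around it.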

To get the angle (which I then convert to degrees):

```python
(180 * model.layers[0].kernel[0,0].numpy() / np.pi) % 360
# > 90.00187079311581
```
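The conversion above only makes sense because the weight is stored in radians. A small NumPy sketch of the difference follows; the variable names are illustrative and nothing here comes from the original code.

```python
import numpy as np

# 90 passed directly is interpreted as 90 radians, not 90 degrees
cos_of_90 = np.cos(90)
cos_of_quarter_turn = np.cos(np.deg2rad(90))

# An angle of 90 rad, converted to degrees and wrapped into [0, 360)
wrapped_deg = np.rad2deg(90.0) % 360

print(cos_of_90, cos_of_quarter_turn, wrapped_deg)
```

The modulo matters because a trained angle may sit many full turns away from the value you expect while still describing the same rotation.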

Thanks and yes, I believe I messed up with the target value which was not actually a rotation. — damith219, Apr 21 '21 at 09:11
