I'm trying to implement a neural network that captures the drift in a measured angle, as a form of dynamic calibration. That is, I have a reference system that may change over the course of data gathering, and I would like to train a network layer that converts the drifting reference to the desired reference by updating an angle parameter.
For example, consider the 2D case. We have a set of $N$ 2D points stored as rows, $X\in \mathbb{R}^{N\times 2}$, and a trainable parameter $\theta$ in the layer. The output of the layer is then $$X_o = XR^\top$$ where $$R = \begin{bmatrix} \cos(\theta) & -\sin(\theta) \\ \sin(\theta) & \cos(\theta) \end{bmatrix}$$
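To make the forward pass concrete, here is a minimal NumPy sketch of the same computation. It assumes points are stored as rows (which is why the matrix is transposed, as in the `tf.matmul(input, tf.transpose(matrix))` call in my layer) and that $\theta$ is in radians; `rotate_rows` is just a name made up for this illustration.

```python
import numpy as np

def rotate_rows(points, theta):
    """Rotate 2D points stored as rows by theta (radians), counter-clockwise."""
    c, s = np.cos(theta), np.sin(theta)
    R = np.array([[c, -s],
                  [s,  c]])
    # For row vectors, points @ R.T applies x_o = R x to every point.
    return points @ R.T

pts = np.array([[1.0, 0.0],
                [0.0, 1.0]])
rotated = rotate_rows(pts, np.pi / 2)  # quarter turn
print(np.round(rotated, 6))
```

With a quarter turn, the unit vector along the first axis should end up along the second axis.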
Using the Adam optimizer (the snippet below uses SGD instead), I then try to find the $\theta$ that rotates a given angle onto the desired reference.
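For reference, this is roughly what I expect the optimization to do. It is a sketch under stated assumptions rather than my actual setup: plain gradient descent in NumPy instead of Keras, a target that really is a rotated copy of the input, and a made-up drift of 0.7 rad. The names `rot`, `drot` and `fit_angle` are hypothetical helpers.

```python
import numpy as np

def rot(theta):
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s], [s, c]])

def drot(theta):
    # Element-wise derivative of the rotation matrix with respect to theta.
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[-s, -c], [c, -s]])

def fit_angle(x, y, theta0=0.0, lr=0.5, steps=200):
    """Gradient descent on the mean squared Euclidean error over theta (radians)."""
    theta = theta0
    for _ in range(steps):
        residual = x @ rot(theta).T - y  # shape (N, 2)
        # Chain rule: d/dtheta |R x - y|^2 = 2 (R x - y) . (dR/dtheta x)
        grad = 2.0 * np.mean(np.sum(residual * (x @ drot(theta).T), axis=1))
        theta -= lr * grad
    return theta

t = np.linspace(0.0, 2.0 * np.pi, 200, endpoint=False)
x = np.stack([np.cos(t), np.sin(t)], axis=1)  # points on the unit circle
true_theta = 0.7                              # hypothetical drift, for illustration only
y = x @ rot(true_theta).T                     # target is an exact rotation of x
theta_hat = fit_angle(x, y)
print(theta_hat)
```

In this idealized case the estimate should settle on the drift angle within a few steps.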
However, the $\theta$ value seems to just fluctuate around its initial value, possibly because of a diverging gradient. How can I overcome this issue?
The code is below.
```python
import tensorflow as tf
import numpy as np
import matplotlib.pyplot as plt


class Rotation2D(tf.keras.layers.Layer):
    """Rotates 2D points (stored as rows) by a single trainable angle."""

    def __init__(self):
        super().__init__()

    def build(self, input_shape):
        # One trainable angle. tf.cos/tf.sin take radians, so 90 means 90 rad.
        self.kernel = self.add_weight(
            name="kernel",
            shape=[1, 1],
            initializer=tf.keras.initializers.Constant(90),
        )

    def call(self, inputs):
        theta = self.kernel[0, 0]
        matrix = [[tf.cos(theta), -tf.sin(theta)],
                  [tf.sin(theta), tf.cos(theta)]]
        # Points are rows, so multiply by the transposed rotation matrix.
        return tf.matmul(inputs, tf.transpose(matrix))


layer = Rotation2D()

t = np.arange(0, 1000) / 200.
y_in = np.array([np.sin(t), np.cos(t)]).T  # input:  (sin t, cos t)
y_ta = np.array([np.cos(t), np.sin(t)]).T  # target: (cos t, sin t)

model = tf.keras.Sequential()
model.add(layer)
model.compile(tf.keras.optimizers.SGD(learning_rate=1.), loss='MSE')
model.fit(y_in, y_ta, epochs=1)

# Train one epoch at a time and print the angle before each epoch.
for i in range(100):
    print(layer.get_weights())
    model.fit(y_in, y_ta, verbose=0, batch_size=5)

y_out = model.predict(y_in)

# Compare input, target and prediction for each coordinate.
fig, axes = plt.subplots(2, 1)
for i in range(2):
    ax = axes[i]
    ax.plot(y_in.T[i], label='input')
    ax.plot(y_ta.T[i], label='target')
    ax.plot(y_out.T[i], label='prediction')
plt.legend()
plt.show()
```
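As a sanity check that does not involve the optimizer at all, the loss can be scanned over a grid of angles. This is only a diagnostic sketch in plain NumPy: it rebuilds the same `y_in` and `y_ta` as above, `mse_for_angle` is a name made up for illustration, and the mean over all elements is meant to mirror the `'MSE'` loss.

```python
import numpy as np

t = np.arange(0, 1000) / 200.
y_in = np.array([np.sin(t), np.cos(t)]).T
y_ta = np.array([np.cos(t), np.sin(t)]).T

def mse_for_angle(theta):
    """Mean squared error between the rotated input and the target (theta in radians)."""
    c, s = np.cos(theta), np.sin(theta)
    R = np.array([[c, -s], [s, c]])
    return np.mean((y_in @ R.T - y_ta) ** 2)

# Evaluate the loss on a grid covering one full turn.
thetas = np.linspace(-np.pi, np.pi, 721)
losses = np.array([mse_for_angle(th) for th in thetas])
best = thetas[np.argmin(losses)]
print(f"min MSE {losses.min():.3f} at theta = {best:.3f} rad; max MSE {losses.max():.3f}")
```

Note that the target here is the input with its two coordinates swapped. That is a reflection (determinant $-1$) rather than a rotation, so I would not expect any single $\theta$ in this scan to bring the loss close to zero. If the loss curve is as flat as that suggests, it may be part of why $\theta$ only wanders during training.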