I'm trying to build a neural network in PyTorch that estimates the parameters of a nonlinear function (in the example below, the radius and (x,y) center of a circle) from a sample of values of that function.

More concretely, the neural network trained in the code below takes as input 100 (x,y) points on a circle and outputs radius, x_center, y_center of the circle.

I don't consider this a very difficult problem, but the trained neural network doesn't work very well, as you can see from two example plots after the code. How can the code be improved?

In case it informs your recommendation: the goal is not to fit circles, which no one needs a neural network to do. I'm trying to use a neural network to estimate 9 parameters of a nonlinear function that takes a single real-valued input and outputs a complex number, f(t) -> a + b*sqrt(-1). The input to the neural network is 54 complex values, and the output is the 9 parameter values. I am guaranteed that the 54 complex input values can always be well approximated by f(t) with appropriately chosen parameters. A human can easily guess the parameters, because each one changes the shape of the complex curve in an intuitive way, but I've been unable to fit the curve with a numerical minimization algorithm: there are many local minima the algorithm can get stuck in before reaching the global minimum. The goal of the neural network is to produce a guess of the 9 parameters close enough to the global minimum that a minimization algorithm started from that guess converges to the global minimum rather than getting stuck at a local minimum.
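A minimal sketch of that two-stage idea on the toy circle problem, not on the real f(t): a rough guess (here a hard-coded stand-in for a network prediction) is refined by a few Gauss-Newton steps on the geometric residual. The function name `refine_circle`, the step count, and the example numbers are assumptions made for illustration.

```python
import numpy as np

def refine_circle(points, guess, steps=20):
    """Refine (r, xc, yc) from an initial guess with Gauss-Newton steps.

    points: array of shape (n, 2) holding (x, y) samples.
    Residual per point: distance to the current center minus r.
    """
    r, xc, yc = (float(v) for v in guess)
    for _ in range(steps):
        dx = points[:, 0] - xc
        dy = points[:, 1] - yc
        # Guard against a sample sitting exactly on the current center.
        dist = np.maximum(np.hypot(dx, dy), 1e-12)
        residual = dist - r
        # Jacobian of the residual with respect to (r, xc, yc).
        jac = np.column_stack([-np.ones_like(dist), -dx / dist, -dy / dist])
        # Solve the linearized least-squares problem jac @ step = -residual.
        step, *_ = np.linalg.lstsq(jac, -residual, rcond=None)
        r, xc, yc = r + step[0], xc + step[1], yc + step[2]
    return r, xc, yc

# Exact samples from a known circle, as in the toy problem.
t = np.linspace(-np.pi, np.pi, 100)
true_r, true_x, true_y = 2.5, 1.0, -1.5
points = np.column_stack([true_r * np.cos(t) + true_x,
                          true_r * np.sin(t) + true_y])

network_guess = (2.2, 1.3, -1.2)  # hypothetical stand-in for a network output
fit = refine_circle(points, network_guess)
print(fit)
```

For the circle the refinement is forgiving, so even a poor starting guess tends to work; the point of the sketch is only the division of labor, where the network needs to land in the right basin and the minimizer supplies the precision.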

You probably guessed that I know a bit of math, but I don't know much about machine learning. I was able to pick up the basics quickly because of my math background, but I am severely lacking in practical experience. At this point I don't know what to do other than randomly change the number of samples per circle, the number of example circles, the number and types of layers in the network, the loss function, the learning rate, the optimizer, et cetera, with no method to my madness.

Post Script
I've found someone who did almost what I need. This paper, paired with this GitHub repo, used 1,000 samples in a set of 100,000 with a 1% failure rate, so there's hope. I have to dig deeper into the innards of their neural network training.

import torch
import numpy as np
import math
import matplotlib.pyplot as plt

#circle parameterized by t, < x(t) , y(t) >
t_parameter = np.linspace(-math.pi, math.pi, 100)

#create a random radius and (x,y) center for each circle, paired with points on the circle evaluated at all t in t_parameter
examples = 1000
max_radius = 4
random_rxy = np.random.rand(examples,3)
input_list = []
for i in range(examples):
  r_rand = random_rxy.item(i,0) * max_radius
  x_rand = random_rxy.item(i,1) * 7 - 2 #-2 < x_rand < 5 
  y_rand = random_rxy.item(i,2) - 2 #-2 < y_rand < -1
  x_coordinates = [r_rand*math.cos(t) + x_rand for t in t_parameter]
  y_coordinates = [r_rand*math.sin(t) + y_rand for t in t_parameter]
  input_list.append(x_coordinates + y_coordinates)
input_tensor = torch.Tensor(input_list)
output_tensor = torch.Tensor(random_rxy) #targets are the unscaled draws in [0,1), not r_rand, x_rand, y_rand

print(input_tensor)
'''
tensor([[ x_0_0,   x_0_1,   ..., x_0_99,   y_0_0,   y_0_1,   ..., y_0_99   ],
        [ x_1_0,   x_1_1,   ..., x_1_99,   y_1_0,   y_1_1,   ..., y_1_99   ],
        [ x_2_0,   x_2_1,   ..., x_2_99,   y_2_0,   y_2_1,   ..., y_2_99   ],
        ...,
        [ x_997_0, x_997_1, ..., x_997_99, y_997_0, y_997_1, ..., y_997_99 ],
        [ x_998_0, x_998_1, ..., x_998_99, y_998_0, y_998_1, ..., y_998_99 ],
        [ x_999_0, x_999_1, ..., x_999_99, y_999_0, y_999_1, ..., y_999_99 ]])
'''
print(output_tensor) #radius, x circle center, y circle center
'''
tensor([[r_0,   x_0,   y_0  ],
        [r_1,   x_1,   y_1  ],
        [r_2,   x_2,   y_2  ],
        ...,
        [r_997, x_997, y_997],
        [r_998, x_998, y_998],
        [r_999, x_999, y_999]])
'''

#define model and loss function.
model = torch.nn.Sequential(
  torch.nn.Linear(200, 200),
  torch.nn.Tanh(),
  torch.nn.Tanh(), #applied directly to the previous Tanh output; there is no Linear layer in between
  torch.nn.Linear(200, 3)
)
loss_fn = torch.nn.MSELoss(reduction='mean')

#train model
learning_rate = 1e-4
optimizer = torch.optim.Adagrad(model.parameters(), lr=learning_rate)
for t in range(10000):
  # Forward pass: compute predicted y by passing x to the model.
  output_pred = model(input_tensor)

  # Compute and print loss.
  loss = loss_fn(output_pred, output_tensor)
  if t % 100 == 99:
    print(t, loss.item())
    '''
    99   0.0337635762989521
    199  0.0285916980355978
    299  0.025961756706237793
    399  0.024196302518248558
    499  0.022839149460196495
    ...
    9799 0.004136151168495417
    9899 0.0040830159559845924
    9999 0.004030808340758085
    '''
  
  #typical procedure
  optimizer.zero_grad()
  loss.backward()
  optimizer.step()

print(output_tensor[0].tolist())
print(output_pred[0].tolist())
#[0.7722834348678589, 0.46600303053855896, 0.5080233812332153 ]
#[0.7921068072319031, 0.46946045756340027, 0.49222415685653687]

plt.xlabel('x')
plt.ylabel('y')
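#output_tensor and output_pred are on the unscaled target scale, so both circles are drawn in that normalized parameter space
r_rand, x_rand, y_rand = output_tensor[0].tolist()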
plt.scatter([r_rand*math.cos(t) + x_rand for t in t_parameter],[r_rand*math.sin(t) + y_rand for t in t_parameter],label="Measured Data")
r_rand, x_rand, y_rand = output_pred[0].tolist()
plt.scatter([r_rand*math.cos(t) + x_rand for t in t_parameter],[r_rand*math.sin(t) + y_rand for t in t_parameter],label="Fit Data")
plt.legend(loc='upper right')
plt.tight_layout()
plt.show()
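A possible vectorized variant of the data generation, sketched under the assumption that the targets should be the scaled parameters actually used to draw each circle (the listing above stores the unscaled [0,1) draws instead). The function `make_circle_dataset` and its arguments are hypothetical names, and the parameter ranges are copied from the listing.

```python
import numpy as np

def make_circle_dataset(examples=1000, samples=100, max_radius=4.0, seed=0):
    """Return (inputs, targets) for the toy circle problem.

    inputs:  shape (examples, 2 * samples), x coordinates then y coordinates.
    targets: shape (examples, 3), the scaled (r, xc, yc) used to draw each row.
    """
    rng = np.random.default_rng(seed)
    t = np.linspace(-np.pi, np.pi, samples)
    u = rng.random((examples, 3))
    r = u[:, 0] * max_radius   # 0 <= r < 4
    xc = u[:, 1] * 7 - 2       # -2 <= xc < 5
    yc = u[:, 2] - 2           # -2 <= yc < -1
    # Broadcasting: one row per circle, one column per value of t.
    x = r[:, None] * np.cos(t) + xc[:, None]
    y = r[:, None] * np.sin(t) + yc[:, None]
    inputs = np.hstack([x, y])
    targets = np.column_stack([r, xc, yc])
    return inputs, targets

inputs, targets = make_circle_dataset()
print(inputs.shape, targets.shape)  # (1000, 200) (1000, 3)
```

The arrays can be handed to PyTorch with `torch.from_numpy(inputs).float()`. Whether scaled or normalized targets train better is a separate question; the sketch only keeps inputs and targets consistent with each other.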

[Two example plots comparing the "Measured Data" and "Fit Data" circles]

mldichter
  • You would probably have better luck using an RNN; that way you can feed in an arbitrary number of inputs and the network will slowly refine its guesses of the parameters you want. You probably don't even need a complicated structure, as there aren't really any long-term relationships in this problem. – Recessive May 12 '21 at 02:23
  • @Recessive I read the author's full PhD thesis, and he used a CNN. His rationale was that a sufficient number of consecutive points effectively determines the entire curve, which gives lots of sets of consecutive points to train on. For my toy circle problem, three consecutive points with exact coordinates can be used to solve for the radius and (x,y) center of the circle. I want to emphasize that I'm not saying an RNN would or would not perform better, as I have no idea, but I'll be trying out the CNN first since the paper's author at least got that working. – mldichter May 12 '21 at 23:57
  • I gave up. I was able to get better results, but at least one of the radius, x center, and y center values was off by at least 10% too often to be useful, and some of the predicted values were way off. I doubt my circle problem is an unsolved problem, but I couldn't find relevant search results, and most neural network results and libraries are geared toward image, audio, and text processing, which makes sense since that's where the money is. If anyone does find a solution, please let me know. That would help me, and others interested in numerical applications, know where to look. – mldichter May 15 '21 at 04:09
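The remark in the comments that three exact points determine the circle can be checked directly. The sketch below subtracts the circle equation at the first point from the equations at the other two, which leaves a 2x2 linear system for the center; the function name `circle_from_three_points` and the collinearity threshold are assumptions chosen for illustration.

```python
import math

def circle_from_three_points(p1, p2, p3):
    """Return (r, xc, yc) of the circle through three non-collinear points."""
    (x1, y1), (x2, y2), (x3, y3) = p1, p2, p3
    # (x - xc)^2 + (y - yc)^2 = r^2 at each point; subtracting the p1
    # equation from the p2 and p3 equations cancels r^2 and the squares
    # of xc and yc, leaving two linear equations in (xc, yc).
    a11, a12 = 2 * (x2 - x1), 2 * (y2 - y1)
    a21, a22 = 2 * (x3 - x1), 2 * (y3 - y1)
    b1 = x2**2 + y2**2 - x1**2 - y1**2
    b2 = x3**2 + y3**2 - x1**2 - y1**2
    det = a11 * a22 - a12 * a21
    if abs(det) < 1e-12:  # assumed tolerance for "collinear or repeated"
        raise ValueError("points are collinear or repeated")
    # Cramer's rule for the 2x2 system.
    xc = (b1 * a22 - a12 * b2) / det
    yc = (a11 * b2 - a21 * b1) / det
    r = math.hypot(x1 - xc, y1 - yc)
    return r, xc, yc

# Three nearby samples from a circle with r = 2 centered at (1, -1.5).
pts = [(2 * math.cos(t) + 1, 2 * math.sin(t) - 1.5) for t in (0.0, 0.1, 0.2)]
r, xc, yc = circle_from_three_points(*pts)
print(r, xc, yc)
```

With exact coordinates this recovers the parameters to rounding error; with noisy points that sit close together the system becomes ill-conditioned, which is one reason to use more than three samples.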
