I'm developing a game AI that tries to master a racing simulation. I already trained a CNN (AlexNet) on in-game footage of me playing the game, using the pressed keys as the targets.
I had two main issues with this setup:
Extracting the current speed from the speedometer in order to feed it to the AI. That question has already been solved here.
During testing, I noticed that the AI cannot make small adjustments on straight roads and cuts corners a lot.
I'm pretty sure the second issue is caused by the game handling pressed keys as binary inputs (e.g. 'a' pressed -> 100% left turn). The AI just can't make precise movements.
To solve that issue, I want to emulate a joystick that controls the game precisely. Using pygame, I already managed to capture my controller inputs as training data (see the sketch after the axis list below).
The controller has two axes, one for steering and one for throttle/braking. Both axes can take any value between -1 and 1:

Axis 0: Value -1 -> 100% left; Value +1 -> 100% right
Axis 1: Value -1 -> full brake; Value +1 -> full throttle
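
For reference, this is roughly how I'm reading the axes with pygame (the axis indices match my controller and may differ for other pads):

```
import pygame

pygame.init()
pygame.joystick.init()

stick = pygame.joystick.Joystick(0)  # first connected controller
stick.init()                         # required on older pygame versions

def read_axes():
    """Return (steering, throttle_brake), each in [-1.0, 1.0]."""
    pygame.event.pump()               # let pygame update its internal state
    steering = stick.get_axis(0)      # -1 = full left,  +1 = full right
    throttle = stick.get_axis(1)      # -1 = full brake, +1 = full throttle
    return steering, throttle

if __name__ == "__main__":
    while True:
        print(read_axes())
```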
My goal is to train AlexNet to output raw analog axis values, given the current speed and the captured frame, and to feed its predictions into the joystick emulator.
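
For the emulator side I'm considering something along these lines. Note that pyvjoy (a wrapper around vJoy) is just an assumption at this point, and the exact axis range should be checked against the vJoy docs:

```
import pyvjoy

device = pyvjoy.VJoyDevice(1)  # first virtual vJoy device

def to_vjoy(value):
    """Map a prediction in [-1.0, 1.0] onto vJoy's nominal 0x1..0x8000 axis range."""
    value = max(-1.0, min(1.0, value))           # clamp stray predictions
    return int((value + 1.0) / 2.0 * 0x7FFF) + 1

def apply_prediction(steering, throttle_brake):
    device.set_axis(pyvjoy.HID_USAGE_X, to_vjoy(steering))        # axis 0
    device.set_axis(pyvjoy.HID_USAGE_Y, to_vjoy(throttle_brake))  # axis 1
```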
I found someone on GitHub who tried something similar and couldn't achieve good results even with a modified AlexNet. Because of this, I'm wondering whether it is even possible to modify the CNN to output analog values instead of using it as an image classifier.
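
To make the question concrete, this is the kind of modification I have in mind, sketched in PyTorch/torchvision (an assumption; my current training setup is different): swap the 1000-class classifier head for a 2-unit regression head bounded by tanh, concatenate the speed as an extra feature, and train with a regression loss instead of cross-entropy.

```
import torch
import torch.nn as nn
from torchvision import models

class AlexNetRegressor(nn.Module):
    def __init__(self):
        super().__init__()
        base = models.alexnet(weights=None)   # or pretrained weights
        self.features = base.features         # convolutional backbone
        self.avgpool = base.avgpool
        self.head = nn.Sequential(
            nn.Linear(256 * 6 * 6 + 1, 4096), # +1 for the scalar speed input
            nn.ReLU(inplace=True),
            nn.Linear(4096, 2),               # [steering, throttle/brake]
            nn.Tanh(),                        # squash outputs into [-1, 1]
        )

    def forward(self, frame, speed):
        x = self.avgpool(self.features(frame)).flatten(1)
        x = torch.cat([x, speed.unsqueeze(1)], dim=1)  # append speed feature
        return self.head(x)

# Training would then use a regression loss on the recorded axis values:
model = AlexNetRegressor()
criterion = nn.MSELoss()
frames = torch.randn(4, 3, 224, 224)          # dummy batch of frames
speeds = torch.rand(4)                        # dummy normalized speeds
targets = torch.empty(4, 2).uniform_(-1, 1)   # recorded axis values
loss = criterion(model(frames, speeds), targets)
```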
My question is whether it is worth putting effort into modifying AlexNet like this, or whether I should use a different model entirely.
I found some models online, like the NVIDIA End-to-End Self-Driving model, which unfortunately only controls the steering angle and seems to be made for low-speed, casual driving.