I'm a software developer who keeps trying (and failing) to get my head around AI and neural networks. One area sparked my interest recently: simulating a mouse "homing in" on a piece of cheese by following the smell. Given the rule that moving closer to the cheese = stronger smell = good, it feels like it should be quite a simple problem to solve, in theory at least!
My thought process was to start by placing the mouse and cheese in random positions on the screen. I would then move the mouse one step in a random direction and measure its distance to the cheese; if it's closer than before (stronger smell), that's good. This is where I come unstuck on the theory: this "feedback" somehow needs to modify the mechanism used to move the mouse, gradually refining it until the mouse is able to head straight towards the cheese. Once "trained", I should be able to reposition the cheese and expect the mouse to reach it more quickly than it did before training. Note that I'm also keeping things simple by not having any obstacles for the mouse to negotiate.
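To make the setup concrete, here is a rough Python sketch of the part I can picture: random placement, one random step at a time, and a "closer = good" signal. It is only a sketch under assumptions of my own: a 100x100 grid, eight possible step directions, and straight-line distance standing in for smell strength. All of the names (`random_position`, `step`, `reward`, `random_walk`) are made up for illustration. It does not do any learning; it only produces the feedback that a learning mechanism would have to consume.

```python
import math
import random

WIDTH, HEIGHT = 100, 100  # assumed screen size, purely illustrative

# Assumption: the mouse moves one cell per step in one of eight directions.
DIRECTIONS = [(dx, dy)
              for dx in (-1, 0, 1)
              for dy in (-1, 0, 1)
              if (dx, dy) != (0, 0)]


def random_position(rng):
    """Pick a random cell on the screen."""
    return (rng.randrange(WIDTH), rng.randrange(HEIGHT))


def distance(a, b):
    """Straight-line distance, used here as a stand-in for smell strength."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


def step(pos, direction):
    """Move one cell in the given direction, staying inside the screen."""
    x = min(max(pos[0] + direction[0], 0), WIDTH - 1)
    y = min(max(pos[1] + direction[1], 0), HEIGHT - 1)
    return (x, y)


def reward(old_pos, new_pos, cheese):
    """+1 if the move got closer (stronger smell), -1 if further, 0 if unchanged."""
    before = distance(old_pos, cheese)
    after = distance(new_pos, cheese)
    if after < before:
        return 1
    if after > before:
        return -1
    return 0


def random_walk(steps, seed=0):
    """Untrained mouse: random moves, recording (direction, reward) pairs."""
    rng = random.Random(seed)
    mouse = random_position(rng)
    cheese = random_position(rng)
    history = []
    for _ in range(steps):
        direction = rng.choice(DIRECTIONS)
        new_mouse = step(mouse, direction)
        history.append((direction, reward(mouse, new_mouse, cheese)))
        mouse = new_mouse
    return history
```

Calling `random_walk(10)` returns ten `(direction, reward)` pairs. What I can't work out is how those rewards should feed back into the way the next direction is chosen.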
How on earth would this be implemented with a NN? I understand the basic concepts, but I find that things unravel once I start looking at real code! The examples I've seen typically start by training the NN from a data set, but this doesn't seem to apply here as it feels like the only training available is "on the fly" as the mouse moves around (i.e. closer = good, further away = bad). I'm assuming the brain has some kind of "reward mechanism" triggered by a stronger smell of cheese.
Am I barking up the wrong tree, either in my thought process or in assuming a NN is a good fit for this problem? This isn't homework btw, just something that I've been puzzling over in the back of my mind.