1

In my attempt to learn about neural networks and machine learning, I am trying to create a simple neural network that can be trained to recognise one word from a given string (which contains only one word). In effect, if one were to feed it a string containing the trained word but spelled wrong, the network would still be able to recognise the word. Can anybody help me with some pseudocode or the start of some code? Or a general explanation of how to do this? I have read about 6 articles and 8 example projects and still have no clue how to do it.

Amit Hendin
  • 23
  • 1
  • 5
  • [OCR is not perceived as a good example of AI](http://ai.stackexchange.com/q/1396/8), so your question is not really about AI. – kenorb Sep 20 '16 at 17:49
  • I know, but I didn't find a proper Stack Exchange site to put this in. – Amit Hendin Sep 20 '16 at 18:38
  • Welcome to AI.SE! Unfortunately, implementation questions are off-topic here, but if you have questions about a social or conceptual aspect of AI, we can help! (For example, a part of the articles you read that you didn't understand could be turned into something interesting.) You might be able to ask some form of this question on Stack Overflow after you do some more research and have a more specific problem statement. – Ben N Sep 20 '16 at 22:18
  • If you focus on how OCR works using a neural network, I think it would be fine. – kenorb Sep 20 '16 at 22:26

2 Answers

2

If I'm reading it correctly, this question has nothing to do with optical character recognition. You want to create a system that takes a digital string of characters as input, then finds the best match from a predetermined list of words. That sounds like a task for if-then-else logic and dictionary lookup. It might be possible to use a neural net, but not easy.
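As a rough sketch of the dictionary-lookup approach, the snippet below uses `difflib.get_close_matches` from Python's standard library to pick the nearest known word. The word list `KNOWN_WORDS`, the helper name `best_match`, and the 0.6 cutoff are assumptions made for illustration, not part of the question.

```python
import difflib

# Hypothetical vocabulary; in practice this is the predetermined word list.
KNOWN_WORDS = ["network", "neural", "machine", "learning"]

def best_match(text, words=KNOWN_WORDS, cutoff=0.6):
    """Return the known word closest to `text`, or None if nothing is similar enough."""
    candidate = text.strip().lower()
    if candidate in words:
        # Exact hit: plain dictionary lookup is enough.
        return candidate
    # Otherwise rank the known words by similarity and keep the best one
    # whose similarity ratio is at least `cutoff`.
    matches = difflib.get_close_matches(candidate, words, n=1, cutoff=cutoff)
    return matches[0] if matches else None

print(best_match("nueral"))
print(best_match("xyzzy"))
```

Raising the cutoff makes the matcher stricter, so fewer misspellings are accepted but unrelated strings are rejected more reliably.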

A neural net takes a fixed number of inputs, each of which is typically a value between zero and one. A major hurdle is that you probably want variable-sized inputs. Another hurdle is that you'll need to encode the characters as numbers in some way.

These hurdles can be overcome, but they are tip-offs that neural networks aren't well suited to the task.
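One possible way around both hurdles, sketched here under assumptions, is to pad or truncate every word to a fixed length and one-hot encode each letter. The maximum length of 12, the lowercase-only alphabet, and the helper name `encode_word` are all illustrative choices.

```python
import string

ALPHABET = string.ascii_lowercase   # 26 letters; other characters are ignored
MAX_LEN = 12                        # assumed maximum word length

def encode_word(word, max_len=MAX_LEN):
    """Encode `word` as a flat list of max_len * 26 values, each 0.0 or 1.0."""
    letters = word.lower()[:max_len]            # truncate over-long input
    vector = []
    for position in range(max_len):
        one_hot = [0.0] * len(ALPHABET)
        if position < len(letters):
            index = ALPHABET.find(letters[position])
            if index >= 0:                      # unknown characters stay all-zero
                one_hot[index] = 1.0
        vector.extend(one_hot)                  # padding slots stay all-zero
    return vector

print(len(encode_word("cat")))
```

The drawback of this encoding is that it is tied to letter positions: a single inserted or deleted letter shifts everything after it, which is one reason a fixed-input network handles misspellings poorly.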

James M
  • 21
  • 2
1

An optimal solution for the task as stated would be an alignment algorithm such as Smith-Waterman, with a substitution matrix that encodes typical typo frequencies.
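To make the alignment idea concrete, here is a minimal Smith-Waterman sketch with a linear gap penalty. The scoring values and the `LIKELY_TYPOS` set are placeholders invented for illustration; a real substitution matrix would be estimated from observed typo frequencies.

```python
# Hypothetical letter pairs that are often confused; placeholder data only.
LIKELY_TYPOS = {frozenset("ie"), frozenset("mn"), frozenset("ao")}

def substitution_score(a, b):
    """Score for aligning letter `a` with letter `b` (assumed values)."""
    if a == b:
        return 2                      # exact match
    if frozenset((a, b)) in LIKELY_TYPOS:
        return 1                      # plausible typo: mildly rewarded
    return -1                         # unrelated letters

def smith_waterman(query, target, gap=-1):
    """Return the best local alignment score between `query` and `target`."""
    rows, cols = len(query) + 1, len(target) + 1
    H = [[0] * cols for _ in range(rows)]       # first row and column stay 0
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            diag = H[i - 1][j - 1] + substitution_score(query[i - 1], target[j - 1])
            up = H[i - 1][j] + gap              # gap in target
            left = H[i][j - 1] + gap            # gap in query
            H[i][j] = max(0, diag, up, left)    # local alignment never goes below 0
            best = max(best, H[i][j])
    return best

print(smith_waterman("recieve", "receive"))
```

To classify an input, one could compute this score against each known word and accept the highest-scoring word if it exceeds some threshold relative to the word's length.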

As an exercise in NNs, I would recommend using an RNN. This circumvents the problem of variable-sized inputs, because you feed in one letter after another and read the output once you have fed in the delimiter.
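The letter-by-letter feeding can be sketched as the forward pass below, written in plain Python. It is untrained: the weights are random, so the output is meaningless until a training loop is added. The hidden size, the newline delimiter, the omission of bias terms, and all names are assumptions made to keep the example short.

```python
import math
import random

ALPHABET = "abcdefghijklmnopqrstuvwxyz\n"   # "\n" is the assumed delimiter
HIDDEN = 8                                    # arbitrary hidden-state size

rng = random.Random(0)                        # fixed seed for repeatability

def rand_matrix(rows, cols):
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

W_xh = rand_matrix(HIDDEN, len(ALPHABET))   # input -> hidden
W_hh = rand_matrix(HIDDEN, HIDDEN)          # hidden -> hidden (the recurrence)
W_hy = rand_matrix(1, HIDDEN)               # hidden -> single output

def step(hidden, char):
    """Advance the hidden state by one character (bias terms omitted)."""
    x = ALPHABET.index(char)    # a one-hot input just selects column x of W_xh
    return [
        math.tanh(W_xh[i][x] + sum(W_hh[i][j] * hidden[j] for j in range(HIDDEN)))
        for i in range(HIDDEN)
    ]

def score(word):
    """Feed `word` one letter at a time, then read the output at the delimiter."""
    hidden = [0.0] * HIDDEN
    for char in word.lower() + "\n":
        if char in ALPHABET:                  # skip unsupported characters
            hidden = step(hidden, char)
    logit = sum(W_hy[0][j] * hidden[j] for j in range(HIDDEN))
    return 1.0 / (1.0 + math.exp(-logit))    # sigmoid: value between 0 and 1

print(score("network"))
```

Because the same `step` function is applied to every letter, words of any length pass through the same fixed set of weights.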

As training data, you'll need a list of random words and possibly a list of random strings as negative examples, and a list of slightly messed-up versions of your target word as positive examples.
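A data set along these lines could be generated roughly as follows. The four typo types, the example counts, and the function names are assumptions for illustration; the list of other words would come from any word list you have available.

```python
import random
import string

def corrupt(word, rng):
    """Apply one random typo: delete, substitute, insert, or swap adjacent letters."""
    i = rng.randrange(len(word))
    letter = rng.choice(string.ascii_lowercase)
    kind = rng.choice(["delete", "substitute", "insert", "swap"])
    if kind == "delete" and len(word) > 1:
        return word[:i] + word[i + 1:]
    if kind == "substitute":
        return word[:i] + letter + word[i + 1:]
    if kind == "insert":
        return word[:i] + letter + word[i:]
    # Swap with the next letter (also the fallback for a one-letter delete).
    j = min(i + 1, len(word) - 1)
    chars = list(word)
    chars[i], chars[j] = chars[j], chars[i]
    return "".join(chars)

def random_string(rng, min_len=3, max_len=10):
    length = rng.randint(min_len, max_len)
    return "".join(rng.choice(string.ascii_lowercase) for _ in range(length))

def make_dataset(target, other_words, n_positive=100, n_random=100, seed=0):
    """Return shuffled (string, label) pairs: 1 = target word, 0 = anything else."""
    rng = random.Random(seed)
    positives = [(target, 1)] + [(corrupt(target, rng), 1) for _ in range(n_positive)]
    negatives = [(w, 0) for w in other_words if w != target]
    negatives += [(random_string(rng), 0) for _ in range(n_random)]
    data = positives + negatives
    rng.shuffle(data)
    return data

for example in make_dataset("network", ["neural", "machine"], 5, 3):
    print(example)
```

Some corruptions can leave the word unchanged (for example, substituting a letter with itself), which is harmless because the correct spelling is a positive example anyway.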

Here is a minimal character-level RNN, which consists of only a little more than a hundred lines of code, so you might be able to get your head around it or at least get it to run. Here is the excellent blog post by Karpathy to which the code sample belongs.

BlindKungFuMaster
  • 4,185
  • 11
  • 23