I recently built a simple AI using Synaptic.js, one that gives responses to an input (albeit completely irrelevant and nonsensical ones). Unfortunately, this is not the kind of text generation I am looking for. What I want is a way to learn connections between words and generate text from those connections. (Preferably, it would also generate at least semi-sensible answers.)

This is part of project Raphiel, which you can check out in the chat room associated with this site. What I want to know is: what combination of layers should I use for text generation?

I have been told to avoid retrieval-based bots.

I already have the methods to send and receive messages; I just need to figure out which combination of layers would work best.

Unless I have the numbers wrong, this will be SE's second NN chatbot.

FreezePhoenix

1 Answer


This looks like a job for an encoder-decoder pair, such as those used in text summarization (see this paper by Rush et al.: https://arxiv.org/pdf/1509.00685.pdf).

You would need the following layers (a rough sketch in code follows the list):

  • An LSTM layer that encodes the given input text into an embedding

  • An LSTM layer that looks over the output generated so far and encodes it into an embedding

  • A dense softmax layer that generates words probabilistically, based on the outputs of the two contextual LSTM encoders
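
To make the layer combination concrete, here is a minimal sketch in Synaptic.js, since that is the asker's library. Synaptic has no built-in encoder-decoder, so this only approximates the idea with a single `Architect.LSTM` network that first reads the input words and is then stepped to emit output words. The vocabulary size, layer sizes, start index, and `oneHot` helper are all invented for illustration, and Synaptic's LSTM output layer is logistic rather than a true softmax, so treat this as a shape of the approach, not a full implementation:

```javascript
// Minimal sketch only: approximates an encoder-decoder with one LSTM network.
const synaptic = require('synaptic');

const VOCAB_SIZE = 50; // hypothetical size of a tiny one-hot vocabulary
// inputs -> two LSTM memory-block layers -> output layer over the vocabulary
const net = new synaptic.Architect.LSTM(VOCAB_SIZE, 40, 40, VOCAB_SIZE);

// one-hot encode a word index into a vector the network can consume
function oneHot(index) {
  const v = new Array(VOCAB_SIZE).fill(0);
  v[index] = 1;
  return v;
}

// "encode": feed the input words so the LSTM state absorbs the context
const inputWords = [3, 17, 42]; // hypothetical word indices
inputWords.forEach(w => net.activate(oneHot(w)));

// "decode": keep stepping the network and greedily pick the top word
let word = 0; // hypothetical start-of-sequence index
const reply = [];
for (let i = 0; i < 10; i++) {
  const dist = net.activate(oneHot(word));  // activations over the vocabulary
  word = dist.indexOf(Math.max(...dist));   // greedy argmax decoding
  reply.push(word);
}
console.log(reply); // generated word indices
```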

See this blog post by Jason Brownlee, which outlines this approach and others and gives implementation details and snippets: https://machinelearningmastery.com/encoder-decoder-models-text-summarization-keras/

Also note that this will require a large set of training examples: pairs of input text and reasonable responses. You might be able to scrape Reddit posts and their comment threads for a start. Let me know if I misunderstood the question.
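
For a sense of what those training examples could look like mechanically, here is a hedged sketch using Synaptic's `Trainer`. The network, the pairs, and the option values are all invented; real prompt/response data would be whole sequences trained step by step rather than single word-to-word vectors:

```javascript
// Illustrative only: training pairs in the format synaptic.Trainer expects,
// using the same toy one-hot setup as the sketch above.
const synaptic = require('synaptic');

const VOCAB_SIZE = 50;
function oneHot(i) {
  const v = new Array(VOCAB_SIZE).fill(0);
  v[i] = 1;
  return v;
}

const net = new synaptic.Architect.LSTM(VOCAB_SIZE, 40, VOCAB_SIZE);

const trainingSet = [
  { input: oneHot(3), output: oneHot(7) },  // "word 3 is followed by word 7"
  { input: oneHot(7), output: oneHot(12) },
];

const trainer = new synaptic.Trainer(net);
trainer.train(trainingSet, {
  rate: 0.1,        // learning rate
  iterations: 2000, // passes over the training set
  error: 0.05,      // stop early once error falls below this
  log: 500,         // print progress every 500 iterations
});
```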

JMed
  • You did not misunderstand the question, but I am using JS, and Synaptic.js at that. The documentation is [here](https://github.com/cazala/synaptic/wiki). I am unable to use Keras, for I am not using Python. – FreezePhoenix Apr 06 '18 at 11:49
  • Wait.... That image looks familiar: https://3qeqpr26caki16dnhd19sv6by6v-wpengine.netdna-ssl.com/wp-content/uploads/2017/09/General-Text-Summarization-Model-in-Keras.png – FreezePhoenix Apr 06 '18 at 12:03
  • I think I have it down. Just one more question: Does the embedding layer mean that it turns text into something the NN can understand, such as floating point numbers, while the output layer turns that back into text? – FreezePhoenix Apr 06 '18 at 12:05
  • @Pheo, yes, you are correct: it turns each word into a stream of numbers, oftentimes a vector. This is what Word2Vec and GloVe do, and you can load them into your model in Python. The last layer then looks at these vectors and can choose the output in a variety of ways; these are known as decoders. I apologize, I'm not very familiar with JS, but if you have any conceptual questions I should be able to help out. – JMed Apr 06 '18 at 15:52
  • Conceptual question: Is the vector an array of indexes or an array representation of the Unicode values? – FreezePhoenix Apr 06 '18 at 17:47
  • Neither; it is actually a vector representing the semantic meaning of a word. That's why many people choose to make their own embedding through training, or use Google's Word2Vec or GloVe, since those have already been trained over a large dataset. – JMed Apr 07 '18 at 19:42
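
To illustrate the embedding idea from the comments above, here is a toy sketch in which each word maps to a dense vector capturing meaning, rather than an index or Unicode values. The 3-dimensional vectors below are invented; real Word2Vec or GloVe vectors typically have 50-300 dimensions and are learned from large corpora:

```javascript
// Toy illustration of word embeddings: semantically related words get
// nearby vectors. These values are made up for demonstration.
const embedding = {
  king:  [0.8, 0.6, 0.1],
  queen: [0.8, 0.7, 0.9],
  apple: [0.1, 0.9, 0.4],
};

// cosine similarity: 1.0 means identical direction, near 0 means unrelated
function cosine(a, b) {
  const dot = a.reduce((s, x, i) => s + x * b[i], 0);
  const norm = v => Math.sqrt(v.reduce((s, x) => s + x * x, 0));
  return dot / (norm(a) * norm(b));
}

console.log(cosine(embedding.king, embedding.queen)); // higher: related words
console.log(cosine(embedding.king, embedding.apple)); // lower: unrelated words
```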