I'm still very new to this. I have close to 2TB of hoarded data, ranging from IRC chats to everyday chats with friends and family.

Is there a way to pass this much data into GPT and ask questions about it, or would I need something else?

For example:

"When did Bob tell Jane about the legos he had in school when they were at home?"

  • Please also consider the privacy implications of this huge stash of personal data. It is not only your personal data but also that of your friends and family. Do you only want to use it to learn about data processing and natural language models, or do you want to do something with the results, maybe even publish them in some way? – quarague Apr 29 '23 at 06:17

1 Answer

Two approaches that I am aware of:

  • Chat your data
    This GitHub repository is accompanied by a blog post on how it works schematically. The overall approach is based on the LangChain library.

  • Azure Search OpenAI demo
    This approach queries your own data using the Retrieval Augmented Generation pattern. It uses Azure OpenAI Service to access the ChatGPT model (gpt-35-turbo) and Azure Cognitive Search for data indexing and retrieval. Be careful: the Azure costs could be substantial for 2TB of data.
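Both approaches rest on the same idea: the 2TB is never sent to the model in one go. Instead, the chats are split into chunks, the chunks are indexed, and only the few chunks most relevant to a question are placed in the prompt. Below is a minimal sketch of that pattern in plain Python, not the code of either project. The function names, the sample messages, and the bag-of-words similarity are all assumptions made for illustration. A real setup would most likely use an embedding model and a vector store in place of `vectorize` and `cosine`, and would send the final prompt to a chat model.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def chunk_messages(messages, size=2):
    """Group consecutive chat messages into chunks of `size` messages."""
    return [" ".join(messages[i:i + size]) for i in range(0, len(messages), size)]


def vectorize(text):
    """Toy stand-in for an embedding: a bag-of-words term-count vector."""
    return Counter(tokenize(text))


def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question, chunks, k=1):
    """Return the k chunks most similar to the question."""
    q = vectorize(question)
    ranked = sorted(chunks, key=lambda c: cosine(q, vectorize(c)), reverse=True)
    return ranked[:k]


def build_prompt(question, context_chunks):
    """Assemble the text that would be sent to the chat model."""
    context = "\n---\n".join(context_chunks)
    return (
        "Answer using only the chat excerpts below.\n\n"
        f"{context}\n\nQuestion: {question}"
    )


# Hypothetical sample data; real logs would be read from disk.
messages = [
    "2019-03-02 Bob: Jane, I had a huge box of legos in school.",
    "2019-03-02 Jane: Really? I never knew that.",
    "2020-07-14 Alice: The weather is awful today.",
    "2020-07-14 Carol: Let's stay in and watch a film.",
]

chunks = chunk_messages(messages, size=2)
question = "When did Bob tell Jane about the legos he had in school?"
top = retrieve(question, chunks, k=1)
prompt = build_prompt(question, top)
print(prompt)
```

Only the retrieved excerpt ends up in the prompt, which is why the size of the archive affects indexing cost and storage rather than the per-question prompt size.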