I often want GPT to read a scientific paper that contains many formulas, but I run into a difficulty. Scientific papers are usually distributed as PDFs. When formulas are given as LaTeX, GPT understands them precisely: in my experience it can write code that translates a formula faithfully. But in a PDF the formulas are not in LaTeX form, and if I copy and paste them they usually become unreadable. I know that a tool such as Mathpix can OCR a picture into LaTeX with quite high fidelity, but that is still inconvenient, because OCR errors can remain and eliminating them takes significant human inspection. OpenAI must have fed scientific textbooks with loads of formulas into GPT during training in the first place, I suppose? So I suspect there is a convenient way to feed GPT a scientific paper in PDF format with many formulas. If so, how does one do that?

aystack

1 Answer

The OpenAI Cookbook contains an example, "book_translation", that automatically translates a math book from Slovenian into English, chunk by chunk.

Studying the source code of that Python notebook may help you solve your problem. Its description reads:

Translate a book written in LaTeX from Slovenian into English
With permission of the author, we will demonstrate how to translate the book Euclidean Plane Geometry, written by Milan Mitrović, from Slovenian into English, without modifying any of the LaTeX commands.
To achieve this, we will first split the book into chunks, each roughly a page long, then translate each chunk into English, and finally stitch them back together.
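
The split / process / stitch idea described above can be sketched roughly as follows. This is not the notebook's actual code, only a minimal illustration under my own assumptions: the function names `split_into_chunks` and `process_document` are made up, chunks are built from blank-line-separated paragraphs, and `process_chunk` is a placeholder for whatever call you make to the model (for example a prompt asking it to translate or explain the chunk while leaving the LaTeX commands untouched).

```python
from typing import Callable, List


def split_into_chunks(text: str, max_chars: int = 3000) -> List[str]:
    """Group blank-line-separated paragraphs into chunks of roughly max_chars.

    A single paragraph longer than max_chars is kept whole rather than cut,
    so that a LaTeX environment is not split in the middle.
    """
    paragraphs = text.split("\n\n")
    chunks: List[str] = []
    current: List[str] = []
    current_len = 0
    for para in paragraphs:
        # Close the current chunk if this paragraph would exceed the budget.
        if current and current_len + len(para) + 2 > max_chars:
            chunks.append("\n\n".join(current))
            current, current_len = [], 0
        current.append(para)
        current_len += len(para) + 2  # +2 for the blank-line separator
    if current:
        chunks.append("\n\n".join(current))
    return chunks


def process_document(
    text: str,
    process_chunk: Callable[[str], str],
    max_chars: int = 3000,
) -> str:
    """Apply process_chunk to every chunk and stitch the results together.

    process_chunk is a hypothetical stand-in for the model call.
    """
    chunks = split_into_chunks(text, max_chars)
    return "\n\n".join(process_chunk(chunk) for chunk in chunks)
```

With an identity function as `process_chunk`, `process_document` returns the input unchanged, which is a handy check that nothing is lost in splitting and stitching before you plug in a real model call. Note that this only helps once you have the LaTeX source; many papers on arXiv offer it for download alongside the PDF.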

knb