
I'm trying to find some text within PDF files, but the results are not accurate! For example, I have 2 PDF files that both contain the word domiciliado. When I search for this word, DocFetcher shows only ONE of the PDF files. Why doesn't DocFetcher show the other PDF file that contains this word? Is there a difference between PDF files? One PDF contains only text; the other contains text and images and comes from a scanned page. What is the catch?

P.S.: the 2 PDF files are in the same directory

edwinksl
  • 23,789

1 Answer


Is there any difference between PDF files with only text and PDF files with text and images from scanned pages?

Yes, PDF files with text and PDF files with scanned images are different. In an image-based PDF, the computer sees only images, and recognizing text within those images requires an extra capability built into the PDF engine, such as Optical Character Recognition (OCR). PDFs that contain actual text are easier to search because the computer can read the text directly. This is most likely why DocFetcher finds the word in only one of your two files.
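To check which of your PDFs actually has a searchable text layer, something like the following might help. It is only a rough sketch: it assumes the `pdftotext` command from poppler-utils is installed, and the function names and the 20-character threshold are my own choices for illustration, not anything DocFetcher uses.

```python
import subprocess


def extract_text(pdf_path: str) -> str:
    """Return the text layer of a PDF via the pdftotext CLI (assumed installed)."""
    # The "-" argument tells pdftotext to write to standard output.
    result = subprocess.run(
        ["pdftotext", pdf_path, "-"],
        capture_output=True,
        encoding="utf-8",
        errors="replace",
        check=True,
    )
    return result.stdout


def looks_scanned(text: str, min_chars: int = 20) -> bool:
    """Heuristic (assumption): almost no extractable characters suggests
    an image-only PDF that would need OCR before it can be searched."""
    non_whitespace = "".join(text.split())
    return len(non_whitespace) < min_chars
```

Calling `looks_scanned(extract_text("scan.pdf"))` (where `scan.pdf` is a placeholder name) should return `True` for a purely scanned file. If it does, the file would need to go through an OCR step before a text search tool can find words in it.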

Recommendation

Anwar
  • 76,649