Questions tagged [ocr]

Optical Character Recognition (OCR) is the process of converting images of printed or handwritten text into digitally encoded text on a computer, so that it can, for example, be reproduced, machine-translated, reformatted, edited, distributed, or used as input to software such as text-to-speech.

67 questions

votes

1 answer

How can I specify the language to be used by Tesseract when using OCRFeeder?

I'm using the OCR utility OCRFeeder, which uses the Tesseract engine. I have installed the language packs needed for Tesseract. How can I set the language such that Tesseract will use the right language file for converting the…

ocr

asked Feb 10 '11 at 18:44

Bernard Decock

votes

1 answer

How do I prevent hocr2pdf from using a large font from a Tesseract-generated .hocr file?

Tesseract now creates an .hocr file rather than an .html file for OCR output, but that is not the issue here. Since the upgrade, hocr2pdf renders this output with a large text size in small bounding boxes. Most of the text…

ocr

asked Jul 02 '14 at 19:23

user299889

votes

1 answer

pdfsandwich - how to not change page colour

I am using pdfsandwich, but it converts the pages from colour to black and white. Since my document has many coloured pictures, how can I avoid this?

ocr

asked Dec 04 '13 at 19:46

brasileiro

votes

2 answers

Optical character recognition for LibreOffice

I have a paper document. It has several pages containing a table with 3 columns (running number, name and grade). I scanned it and got 16 JPEG files, one per scanned page. Now I need an OCR tool to convert each JPEG into text, in order to…

ocr

asked Jul 03 '13 at 14:14

Mihaita

votes

2 answers

How to wildcard tesseract?

I want Tesseract to convert all the files in a folder. I do not want to merge the files in any way, as I am having trouble with programs like hocr2pdf and pdfbeads merging more than one file at a time. I run tesseract *.tif * hocr and end up with…

ocr

asked Mar 30 '13 at 12:07

user140393

vote

1 answer

Tesseract and OCRopus

I was wondering what the relationship is between Tesseract and OCRopus. Is OCRopus a wrapper around Tesseract, or are they now developed independently? What are the advantages of one over the other? Thanks and regards!

ocr

asked Jul 30 '11 at 21:16

Tim

vote

1 answer

Tesseract OCR Engine on Ubuntu how to

I've installed tesseract-ocr. I was looking at the manual, but I can't see an option that lets me define image bounds (X,Y,W,H). Can someone help with this, or am I asking in the wrong place?

ocr

asked Mar 12 '14 at 19:35

Ahmed Al-attar

vote

2 answers

"sh: 1: cannot open /tmp/pdfsandwich4e375e.html: No such file" when using pdfsandwich

I tried to add a text layer to some PDF files in order to make them searchable. This technique is explained in the German Ubuntu wiki: http://wiki.ubuntuusers.de/pdfsandwich . After installing dependencies sudo apt-get install imagemagick exactimage…

ocr

asked Jun 16 '13 at 10:54

highsciguy

vote

3 answers

gImageReader OCR

I have recently installed gImageReader OCR. It is not obvious how to use it. I have not yet worked out how to get an editable text file. My aim is to get a LibreOffice file to edit and save. Thanks in advance. The original text is standard English…

ocr

asked Nov 04 '18 at 09:03

TonyB

votes

2 answers

Convert handwritten data log to Excel

I have to enter loads of handwritten data into Excel, and I was wondering if there is an easier way of doing it than typing all the data in manually. Any suggestions?

ocr

asked May 23 '16 at 09:39

babaInBlack

votes

2 answers

OCR for TAN list (online banking)

I have a TAN list on paper for online banking that looks like this: 001 123456 015 123456 029 123456 043 123456 ... 002 123456 ... ... I scanned it and now I want to use OCR to get the text. I tried tesseract, gocr and cuneiform. All programs…

ocr

asked Jan 14 '15 at 06:29

guettli

-1

votes

1 answer

OCR can't recognize a specific image

I am trying to get these images, (8,0), recognized by an OCR engine. I am using Tesseract, but I don't mind if another OCR engine does it.

ocr

asked Dec 16 '15 at 17:24

MRTgang