I am trying to get these images (8 and 0) recognized by an OCR engine.
I am using tesseract, but I don't mind if another OCR tool does the job.
1 Answer
We should call tesseract with the option -psm <N> (spelled --psm in Tesseract 4 and later)
to select the page segmentation mode:
0 = Orientation and script detection (OSD) only.
1 = Automatic page segmentation with OSD.
2 = Automatic page segmentation, but no OSD, or OCR.
3 = Fully automatic page segmentation, but no OSD. (Default)
4 = Assume a single column of text of variable sizes.
5 = Assume a single uniform block of vertically aligned text.
6 = Assume a single uniform block of text.
7 = Treat the image as a single text line.
8 = Treat the image as a single word.
9 = Treat the image as a single word in a circle.
10 = Treat the image as a single character.
The options of interest are 10 and 6 in case we only have a single character in our bitmap source.
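To compare the two modes side by side, a small wrapper can run both on the same image. This is only a sketch: psm_flag and try_modes are hypothetical helper names, the major version number is passed in by hand, and it assumes that 3.x releases accept -psm while 4.x and later expect --psm, and that the installed tesseract accepts "stdout" as its output base.

```shell
#!/bin/sh
# Pick the spelling of the page segmentation flag from a major version number.
# Assumption: Tesseract 3.x accepts -psm, 4.x and later expect --psm.
psm_flag() {
    if [ "$1" -ge 4 ]; then
        printf '%s\n' '--psm'
    else
        printf '%s\n' '-psm'
    fi
}

# Run OCR on one image with modes 6 and 10 and print each result.
# $1 = image file, $2 = tesseract major version (for example 3 or 4).
# "stdout" as the output base asks tesseract to print the recognized text
# instead of writing it to a file.
try_modes() {
    image=$1
    flag=$(psm_flag "$2")
    for mode in 6 10; do
        printf 'mode %s: ' "$mode"
        tesseract "$image" stdout "$flag" "$mode" 2>/dev/null
    done
}

# Example (assumes a Tesseract 3.x installation):
# try_modes LO1v5.png 3
```

Whichever mode returns the expected character more reliably for your images is the one to keep.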
By processing the gray source image as follows
tesseract LO1v5.png -psm 6
we get the correct result, 8, but the green source image is too much of a challenge for tesseract, which is specialized in whole texts rather than numbers.
By improving the input quality, we get better results when calling tesseract in single-character recognition mode:
tesseract sourceimage -psm 10
This will give us a correct guess of 8, but only an almost correct guess of B for the 0 image.
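The answer does not say how the input quality was improved. One common approach, shown here purely as an assumption, is to convert the image to grayscale, enlarge it, and binarize it with ImageMagick before running OCR. The helper name preprocess, the 300% scale, and the 50% threshold are illustrative values, not the settings used above.

```shell
#!/bin/sh
# Hypothetical helper: grayscale, enlarge, and binarize an image with
# ImageMagick's convert. $1 = input file, $2 = output file.
# The scale and threshold are starting points to tune per image.
preprocess() {
    convert "$1" -colorspace Gray -resize 300% -threshold 50% "$2"
}

# Example (file names are placeholders):
# preprocess sourceimage cleaned.png && tesseract cleaned.png stdout -psm 10
```

If the result gets worse, try a different threshold or skip the binarization step, since the best settings depend on the contrast between the character and its background.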

Takkat