49

How can I extract text from images?

I am not talking about scanned files, but garden variety images, such as when you take a high-def picture of a blackboard at class, and it is nicely handwritten; or when you photograph a page from a recipe book and want the recipe in text format.

Any free and open software for that?

I tried tesseract, and the results were awful.

Zanna
  • 70,465
Strapakowsky
  • 11,914

5 Answers

41

tesseract-ocr is the best option compared to all the others. To install it, run the command below:

sudo apt-get install tesseract-ocr

Usage is tesseract filename.jpg output, which generates the file output.txt (tesseract appends the .txt extension itself).

You may also want to select the appropriate language. In that case, install the tesseract-ocr-LANG package, where LANG is the three-letter ISO 639-2 language code. The 18.04 repository currently offers 123 languages. Then use, for example:

tesseract mySpanishText.jpg output -l spa
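
If you would rather call the same command from a script, a minimal Python sketch might look like the following. The function names build_tesseract_command and run_ocr are made up for illustration; only the tesseract IMAGE OUTPUTBASE -l LANG invocation comes from the answer above.

```python
import shutil
import subprocess


def build_tesseract_command(image_path, output_base, lang=None):
    """Build the argument list for: tesseract IMAGE OUTPUTBASE [-l LANG]."""
    cmd = ["tesseract", image_path, output_base]
    if lang:
        # LANG is the three-letter code of an installed tesseract-ocr-LANG package
        cmd += ["-l", lang]
    return cmd


def run_ocr(image_path, output_base, lang=None):
    """Run tesseract and return the path of the text file it writes."""
    if shutil.which("tesseract") is None:
        raise RuntimeError("tesseract not found; install tesseract-ocr first")
    subprocess.run(build_tesseract_command(image_path, output_base, lang), check=True)
    # tesseract appends ".txt" to the output base on its own
    return output_base + ".txt"


if __name__ == "__main__":
    # Only prints the command; call run_ocr(...) to actually execute it
    print(build_tesseract_command("mySpanishText.jpg", "output", lang="spa"))
```

Passing the arguments as a list (rather than one shell string) means file names with spaces do not need quoting.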
nomadSK25
  • 113
  • 5
  • 1
    Hey, so this does work, but it is not accurate, or rather I would say it is 80-85% accurate. For example, for this image: https://pbs.twimg.com/media/DJs6_pcXkAA2VrN.jpg, it messed up the $ sign and most brackets. Square, round, curly: all brackets are a problem, and they never get extracted properly. Do you know of any fix? – Milan Chheda Sep 19 '17 at 14:08
40

The act of extracting text from images is called OCR (optical character recognition), and Ubuntu has a wiki page dedicated to OCR. From that page:

Available OCR tools

The Ubuntu Universe repositories contain the following OCR tools:

  1. gocr - A command line OCR
  2. fuzzyocr - spamassassin plugin to check image attachments
  3. libhocr0 - Hebrew OCR
  4. ocrad - Optical Character Recognition program
  5. ocrfeeder - Document layout analysis and optical character recognition system
  6. ocropus - document analysis and OCR system
  7. tesseract-ocr

The Ubuntu multiverse repositories also contain:

  1. cuneiform - multi-language OCR system
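
To see which of these tools are already installed before trying them, a small Python sketch such as the one below could help. The mapping from package name to executable name is an assumption (the two are not always identical, as with tesseract-ocr and tesseract), and installed_ocr_tools is a hypothetical helper, not part of any of these packages.

```python
import shutil

# Assumed executable names for some of the packages listed above;
# the package name and the command name are not always the same.
OCR_COMMANDS = {
    "gocr": "gocr",
    "ocrad": "ocrad",
    "tesseract-ocr": "tesseract",
    "cuneiform": "cuneiform",
}


def installed_ocr_tools(commands=OCR_COMMANDS):
    """Map each package name to True if its command is found on PATH."""
    return {pkg: shutil.which(cmd) is not None for pkg, cmd in commands.items()}


if __name__ == "__main__":
    for pkg, found in installed_ocr_tools().items():
        print(f"{pkg}: {'installed' if found else 'not found'}")
```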

Some packages are outdated, but unofficial fresh ones can be found in Alex_P PPA (PPA adding code: ppa:alex-p/notesalexp). If you never used a PPA check how to add software from a PPA.

edit: As shown in the comments, Clara OCR exists too, but its package got stuck at Hardy and its website shows 2009 as the last update.

Rinzwind
  • 299,756
  • Do you have experience using any of those for the examples I described? I have become a bit sceptical of regular OCR tools for them. Number 7 on the list is the one I tried, and it was plainly terrible. – Strapakowsky Aug 31 '11 at 09:11
  • If I recall, I tried gocr also, with equivalent terrible results. If you tried with success any of those, what syntax did you use? Thanks. – Strapakowsky Aug 31 '11 at 09:13
  • None whatsoever! I never bothered with OCR :D Freshmeat search shows Clara OCR and tesseract-ocr ;) ( http://freshmeat.net/search/?q=%2BOCR&filter=&filter_scope=&orderby=popularity_percent_DESC ) – Rinzwind Aug 31 '11 at 11:16
  • Am I wrong if I say that successful use of OCR requires knowledge of the process and a careful setup to fit the particular image to be scanned? Thus, if I'm right, bad results might be due to the user and not the software. – N.N. Aug 31 '11 at 11:37
  • OCR works best if you know how the image is created and you are very well versed in using the software that you use (the latter being the reason I never got around to using it). – Rinzwind Aug 31 '11 at 11:43
  • For tesseract and cuneiform, you can use the YAGF program as a GUI. This has a rather good interface, and the required area of the image can be selected easily. As N.N. says, it is sometimes the user not the software - i.e., running the command to read the text from an image probably means it will try and read the ENTIRE image. Have fun teaching your computer to read. – Wilf Nov 17 '13 at 08:11
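
As a rough illustration of the "careful setup" point in these comments: tesseract has a page segmentation mode option that tells it how to treat the layout of the image instead of guessing from the entire picture. The sketch below assumes tesseract 4 or later, where the option is spelled --psm; the file name board.jpg and the helper names are hypothetical, and whether a different mode helps depends on the image.

```python
import subprocess

# Page segmentation modes documented by tesseract (a small subset)
PSM_AUTO = 3          # fully automatic page segmentation (the default)
PSM_SINGLE_BLOCK = 6  # assume a single uniform block of text
PSM_SINGLE_LINE = 7   # treat the image as a single text line


def ocr_command(image_path, output_base, psm=PSM_AUTO):
    """Build: tesseract IMAGE OUTPUTBASE --psm N (tesseract 4+ syntax)."""
    return ["tesseract", image_path, output_base, "--psm", str(psm)]


def try_modes(image_path, modes=(PSM_AUTO, PSM_SINGLE_BLOCK)):
    """Run tesseract once per mode, writing IMAGE_psmN.txt for comparison."""
    for psm in modes:
        output_base = f"{image_path}_psm{psm}"
        subprocess.run(ocr_command(image_path, output_base, psm), check=True)


if __name__ == "__main__":
    # Only prints the command; call try_modes("board.jpg") to run tesseract
    print(ocr_command("board.jpg", "board", PSM_SINGLE_BLOCK))
```

Comparing the text files produced by two or three modes is a cheap way to find out whether layout detection is what goes wrong for a given photo.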
4

Frog

Try Frog. Frog is an intuitive text extraction tool (OCR) for GNOME.


Get it from the Snap Store or download it on Flathub.

Flimm
  • 41,766
  • There are perfectly good solutions that exist within mainstream repos. I didn't think this adds anything to the accepted answer. At best, wheel reinvention, at worst, potential security risk with lesser audited non-official repo install. – moo Feb 16 '23 at 19:55
  • 1
    I have brought the negative downvote back to 0, since this is the best GUI suggestion, sorry it took years to correct since I just saw this now, sorry the toxic part of the community got to you, I frankly think this is the best answer if you don't want to work with the command line, Thank You! – king_below_my_lord Apr 11 '23 at 06:58
  • Thanks @king_below_my_lord. I've deleted my comments, since I don't think they serve a purpose any more. – Flimm Apr 11 '23 at 07:01
  • Alternative 2023: sudo apt install zbar-tools then sudo snap install frog for Ubuntu 23.04 – NingaCodingTRV Jul 24 '23 at 23:24
  • @NingaCodingTRV Why install zbar-tools? I've updated my answer to include a link to snap – Flimm Jul 28 '23 at 10:15
  • @Flimm it's a dependency, Frog uses it to generate QRs. – NingaCodingTRV Aug 22 '23 at 21:35
2

TextSnatcher

Try TextSnatcher. This application uses Tesseract OCR 4.x for character recognition behind the scenes.


Probably the easiest way to install it on Ubuntu is to get it from Flathub:

  1. First, if you haven't already, install Flatpak using the Ubuntu quick start guide. Remember to restart your system afterwards.

  2. Go to TextSnatcher on Flathub and click Install. Or, if you prefer the command-line, run this command:

    flatpak install flathub com.github.rajsolai.textsnatcher
    
Flimm
  • 41,766
  • I didn’t downvote (and don’t think it deserved a downvote - though fortunately you have some headroom before your reputation is seriously eroded!), but my only comment would be that in my opinion, your 2 answers would be better as one, with multiple recommendations, rather than two separate answers. – Will Feb 16 '23 at 20:22
  • @Will Interesting, why do you say that? If you wanted to upvote one recommendation and downvote the other, how would you do that if the answers were combined? If you wanted to only read comments about one piece of software, wouldn't it be better to have separate posts? AskUbuntu allows multiple answers from the same user for a reason, I think it's precisely for cases like this one. – Flimm Feb 16 '23 at 20:30
  • I suppose it’s preference; the question is ‘how can I do x’ and a good answer to me is one that presents a good number of options to help them decide what to do. I think it very unlikely I’d upvote one option and downvote another - but I’d likely upvote an answer that gave the op the options they needed to make a decision. But it’s only the way I would like answers to my questions - I’m no authority on how to answer! – Will Feb 16 '23 at 20:38
  • 1
    Upvoted this up, unlike Frog having a Tesseract OCR based solution might be preferable to some, both quite frankly excellent choices with I preferring Frog, though this might admittedly do better with handwriting detection etc – king_below_my_lord Apr 11 '23 at 07:01
0

Using tesseract-ocr, we can extract text from images. I have tested gocr, which didn't work as well as tesseract-ocr.

Installation:

sudo apt-get install tesseract-ocr

The Python program below converts every image file with a .png extension in the current directory to a text file:

#!/usr/bin/env python3
import os
import subprocess


def list_files(path):
    """Return the paths of all regular files directly inside path."""
    files = []
    for name in os.listdir(path):
        full_path = os.path.join(path, name)
        if os.path.isfile(full_path):
            files.append(full_path)
    return files


def convertImageToText(img_file):
    # tesseract appends ".txt" itself, so pass the path without its extension
    output_base = os.path.splitext(img_file)[0]
    # An argument list (no shell) keeps file names with spaces intact
    subprocess.run(["tesseract", img_file, output_base], check=True)


def startOperation():
    list_file = list_files(".")
    print(list_file)
    for img_file in list_file:
        if img_file.lower().endswith(".png"):
            convertImageToText(img_file)


startOperation()

devpa
  • 843