
I found a few similar questions on this site, but I could not complete the process.

I followed the answers to "How can instantaneously extract text from a screen area using OCR tools?" and "How can I use OCR on a partial screen capture to get text?".

First, I installed the dependencies:

sudo apt-get install tesseract-ocr
sudo apt-get install imagemagick
sudo apt-get install scrot
sudo apt-get install xsel
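
As a quick sanity check after installing, something like the following can confirm that the commands the script relies on are actually on the PATH. This is only a sketch; `missing_tools` is a helper name made up for this illustration, not something the packages provide.

```shell
# missing_tools CMD...: print each command that is NOT found on PATH, one per line.
# (Hypothetical helper name, used here for illustration only.)
missing_tools() {
  for _tool in "$@"; do
    command -v "$_tool" >/dev/null 2>&1 || printf '%s\n' "$_tool"
  done
}
```

For example, `missing_tools tesseract mogrify scrot xsel` should print nothing when all four commands are available (mogrify comes from the imagemagick package, tesseract from tesseract-ocr).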

Then I put the following script in /home/blueray/Documents/Translate/screen_ts.sh

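#!/bin/bash
# Dependencies: tesseract-ocr imagemagick scrot xsel

SCR_IMG=$(mktemp)
# Remove the temp file and everything derived from it (.png, .txt) on exit
trap 'rm -f "$SCR_IMG"*' EXIT

# Select a screen area; -q 100 raises image quality from the default 75
scrot -s "$SCR_IMG.png" -q 100

# Desaturate and upscale the image, which should increase the detection rate
mogrify -modulate 100,0 -resize 400% "$SCR_IMG.png"

# Run OCR (output goes to $SCR_IMG.txt) and copy the text to the clipboard
tesseract "$SCR_IMG.png" "$SCR_IMG" &> /dev/null
xsel -bi < "$SCR_IMG.txt"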

Please note that I removed the language menu:

select tesseract_lang in eng rus equ ;do break;done
# Quick language menu, add more if you need other languages.

I did this in the hope that it will only consider English. Please let me know if this is not the case.
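
If it matters, the language can also be pinned explicitly with tesseract's `-l` option instead of relying on whatever the default is. A minimal sketch, assuming the `eng` language data is installed; `ocr_english` is a hypothetical helper name:

```shell
# ocr_english IMAGE: print the text recognized in IMAGE, using English only.
# Passing "stdout" as the output base makes tesseract print the text
# instead of writing a .txt file.
ocr_english() {
  tesseract "$1" stdout -l eng 2>/dev/null
}
```

In the script above this could be used as `ocr_english "$SCR_IMG.png" | xsel -bi`.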

Now when I run

bash /home/blueray/Documents/Translate/screen_ts.sh

It works as I wanted.

On Windows, with Capture2Text, I used Win+Q to capture part of the screen as text. So I checked "How do I set a custom keyboard shortcut to control volume?".

I opened the menu, searched for Keyboard Shortcuts, and clicked it.


  1. Then I clicked Add
  2. Name: Capture2Text
  3. Command: bash /home/blueray/Documents/Translate/screen_ts.sh
  4. Clicked Apply
  5. Clicked On Shortcut on the right.
  6. Pressed Win+Q

Now when I press Win+Q, nothing happens. What am I doing wrong?

muru

2 Answers


You don't need "scrot". ImageMagick (which provides "mogrify") can do the screen capture itself. You also don't need to save an intermediate image, as "tesseract" can accept an image on standard input.

As such the above simplifies to...

convert x: -modulate 100,0 -resize 400% -set density 300 png:- |
  tesseract stdin stdout | xsel -bi
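
One possible refinement, sketched here under the assumption that you would rather keep the old clipboard contents when OCR finds nothing: capture the text first and only hand it to xsel when it is non-empty. `ocr_region_to_clipboard` is a made-up name for this illustration.

```shell
# Hypothetical helper: OCR a mouse-selected screen area and copy the text
# to the clipboard, leaving the clipboard untouched when nothing is recognized.
ocr_region_to_clipboard() {
  # "convert x:" waits for a selection with the mouse, then emits a PNG on stdout.
  _text=$(convert x: -modulate 100,0 -resize 400% -set density 300 png:- |
          tesseract stdin stdout 2>/dev/null)
  if [ -z "$_text" ]; then
    echo "No text recognized" >&2
    return 1
  fi
  printf '%s\n' "$_text" | xsel -bi
}
```

The non-zero return status also makes it easy to chain a notification or a retry after a failed capture.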

However, I also added the following to my version of the script, to pop up the text on screen so you can check it.

xsel -po | xless - &

Of course tesseract could use some improvements for some fonts! For example 'f's in some fonts have a small hook that makes tesseract think they are 'P's! Arrghhhh...

EDIT: Full script I use is located at...

https://antofthy.gitlab.io/software/#capture_ocr

I link this to a 'hotkey' (Meta-Print) using my window manager (openbox), so I can use it at any time.

If you can't use a hotkey and need to uncover the part of the screen containing the text, you can always launch it with a delay...

sleep 5; capture_ocr
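
If the delay is something you want to vary, the same idea can be wrapped in a small function. This is just a sketch; `run_after_delay` is a hypothetical name and not part of the linked script.

```shell
# run_after_delay SECONDS COMMAND [ARGS...]: wait SECONDS, then run COMMAND.
# Rejects a missing or non-numeric delay instead of passing it on to sleep.
run_after_delay() {
  case $1 in
    ''|*[!0-9]*)
      echo "usage: run_after_delay SECONDS COMMAND [ARGS...]" >&2
      return 2
      ;;
  esac
  _delay=$1
  shift
  sleep "$_delay"
  "$@"
}
```

With that defined, `run_after_delay 5 capture_ocr` behaves like the one-liner above.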

Enjoy

anthony
  • Can you please give the full script so that I can just copy-paste (with xsel -po | xless - &). – Ahmad Ismail Oct 20 '18 at 03:31
  • Updated answer with link to full script (and many other scripts). – anthony Oct 22 '18 at 06:51
  • Full script that works for me: convert x: -modulate 100,0 -resize 400% -set density 300 png:- | tesseract --dpi 300 stdin stdout | xsel -bi. Initially, tesseract was printing Warning: Invalid resolution 0 dpi. Using 70 instead. Addition of --dpi 300 parameter fixed it (found it on GitHub). Setup: tesseract 4.1.1 on Fedora 33. – ilyazub Jul 09 '21 at 09:57

I had to tweak @anthony's script so it would work on my box (Kubuntu 18.04):

Instead of the convert line, I used:

import -resize 300% +dither png:- |

Also, I removed the trailing minus sign (-) from the last line, so it reads:

xsel -ob | $XPAGER

Working great.
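
The line above relies on XPAGER being set, which presumably happens earlier in the full script. If it might be unset, a fallback can be written as below; this is a sketch, and both the `show_clipboard` name and the choice of xless as the default are assumptions.

```shell
# show_clipboard: pipe the clipboard text into $XPAGER,
# falling back to xless (an assumed default) when XPAGER is unset or empty.
show_clipboard() {
  xsel -ob | ${XPAGER:-xless}
}
```

XPAGER is deliberately left unquoted so that a value containing options, such as a pager name followed by flags, is split into separate words.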

pasqal