How to “cat” or “less” a libreoffice impress .odp file?

Question

Is there any way to cat or less a .odp file just like we have for .odt files as mentioned here

All open office files are ZIP files. So unzip it. – Rinzwind Apr 10 '21 at 17:32 — Rinzwind, Apr 10 '21 at 17:32

score 4 · Accepted Answer · answered Apr 10 '21 at 19:21

The comment on your question, posted by Rinzwind gave me an idea. Thank you my friend. I didn't ever tried this before but, as long as you can extract your odt's, you can grep the odt's contents instead of the odt itself.

So the procedure would be something like this:

cd THE-FOLDER-CONTAINING-THE-ODT
mkdir ext
unzip THE-ODT-FILE -d ext
cd ext
grep KEYWORDS * -ri

Example:

In this example I am inside a folder named "ray", in which I have a document named practicas-ray-del-angel-gerardo-membretada-2021.odt. So I am going to use the same process just by changing the parameters in order to find the word "RAYMUNDO".

cd ray
mkdir ext
unzip practicas-ray-del-angel-gerardo-membretada-2021.odt -d ext
cd ext
grep "raymundo" * -ri

The next screenshot will give you an image on all the process.

Scripting

Of course this would be a major problem if you have to run this in a bunch of documents so I would put the process which will run using the proper parameters on the command line. The file will contain this:

odtgrep.sh

if [[ -d "ext" ]]
then
  echo "ext exists. Will not be created but we will remove the contents on it."
  rm ext/* -rf
else
  echo "First time run? Let's create the ext directory."
fi
unzip $1 -d ext
cd ext
grep "$2" * -ri

Explanation

Directory validation

First we will check if the directory ext does exist.

if [[ -d "ext" ]]
then
  echo "ext exists. Will not be created but we will remove the contents on it."
  rm ext/* -rf
else
  echo "First time run? Let's create the ext directory."
fi

If it doesn't exist, we need to create it. But if it already exists we just need to clean up whatever is inside the folder in order to prevent the zip command to stop asking if we wish to overwrite the existing files. The messages on the terminal are useful for debugging purposes but can be omitted when you wish.

Extraction

Once we are sure the ext directory is ready to be used, we will unzip the file, using the first parameter in the passed via command line to the script.

unzip $1 -d ext

After which we will enter the directory to work with the files via:

cd ext

Searching

This code will search the desired word(s), provided in the command line with the second parameter inside every single file on the ext directory.

grep "$2" * -ri

Please note the content.xml file is the one in charge to keep the text, so maybe you wish to change the code for this:

grep "$2" content.xml -i

Running

The script should be run with:

for i in *.odt; do bash odtgrep.sh "$i" "raymundo"; done;

This will:

Make an array with all the odt files in the current folder
Run the Script on every file in the array
Search the "raymundo" word inside every single file and show the results if found.

Considerations

Please make sure you use the proper variables and temporal (ext) folder in order to prevent damages on you data.
Make sure your quotation marks will work and avoid using spaces when possible.
When running on a bunch of files you will see a lot of data on your terminal, so you may wish to use the -l parameter on grep in order to show the less data possible.
The file name containing the text found on your search will be presented by the zip file itself with a text like: Archive: practicas-ray-del-angel-gerardo-membretada-2021.odt

Let us know if this works for you. Any comments or suggestions on how to improve this code are welcome.

I have understood the code and also tried it. But this shows a lot of text on the terminal. What I want is, only the specific line/list of lines containing the word I am searching for. Taking the example from your uploaded image, if I search for RAYMUNDO, then grep should print on terminal like this:
C. RAYMUNDO DEL AN GEL GERARDO

(if there is any other line also, that contains "RAYMUNDO", it should also be printed like this, but in a new line) so basically only I am discarding any special character, and only wish to obtain the line/lines (a sequence of alphabets (a-zA-Z)). — Suraj Singh, Apr 10 '21 at 20:12
@SurajSingh You did not specify any of that in your question. This guru has invested a lot of time --for free-- to answer the question you asked (and it's an excellent, complete, well-written answer). I've used programmatically unzipping, editing, and re-zipping LibreOffice files for years. — user535733, Apr 11 '21 at 00:04
@user535733 I am completely satisfied with the code. Through the comment, I am just asking if there is a way to refine the output, I am not denying the solution provided. As Geppettvs D'Constanzo himself welcomes comments and suggestions, so I am just asking him a more refined possibility. The thing is, I have logic in mind, but I don't know how to implement it, so am asking. — Suraj Singh, Apr 11 '21 at 01:19
@GeppettvsD'Constanzo thanks a lot. Your solution helped me to reach my objective. Also, I have successfully reduced the output on the terminal and only relevant things are getting printed (the same that I was asking in my first comment on your answer). After following your solution, I wrote a python script that basically finds the required word in the file, and then prints the complete line that contains it and piped the output of it with the grep command. It worked like Charm. — Suraj Singh, Apr 11 '21 at 03:23
@user535733, thank you for pointing that. I agree with you that the user should include enough information on what he wish to achieve but in the other hand, I am open to the new ideas and I always have in consideration that the user is asking something specific, which can be just a part of something bigger. That's why I keep the options open to new ideas and improvements for both the questions and answers. And in this case it seems that Suraj Singh is a new user so let's try to help him. Even when it seems he found a solution. — Geppettvs D'Constanzo, Apr 12 '21 at 19:19
@SurajSingh, good to know you found a solution from this procedure. Do you mind letting us know a bit more of your project? Maybe you have a blog post or something? Everyone (including me) will learn a lot from what you are doing! Thanks in advance. — Geppettvs D'Constanzo, Apr 12 '21 at 19:20
@GeppettvsD'Constanzo I don't have a blog over the project, I tried learning how to make a blog, but it's taking me much time. Talking about the project in short, just like the cat file.txt | grep 'mystring' will list out all occurrences of 'mystring' in file.txt, similarily I am trying to make a custom command that can work in a similar fashion but on any type of file (not only txt files). Like I give cat_it file.odp | grep_it 'mystring', here the cat_it command will follow the steps of your solution(silently), then grep _it command also do some work silently and finally the output. — Suraj Singh, Apr 14 '21 at 17:50
Very smart @SurajSingh. I've learned something new today. If you someday publish your work just drop a message over here. Thank you. — Geppettvs D'Constanzo, Apr 14 '21 at 20:24
Sure, I will definitely drop the link to the project soon after it is completed. I will love further collaborations or suggestions in that. — Suraj Singh, Apr 15 '21 at 07:12
@GeppettvsD'Constanzo the project is ready, kindly have a look at it here — Suraj Singh, Jul 02 '21 at 04:16