164

I have a bunch of text files, images and pdf files which I want to convert into a single pdf file. How do I do it?

Zanna
  • 70,465
AJha
  • 2,843

15 Answers

213

If you're willing to use a terminal, you can use ImageMagick. Install it with

sudo apt install imagemagick

then you can do:

convert image1.jpg image2.png text.txt PDFfile.pdf outputFileName.pdf

or as another example:

convert *.jpg outputJpgFiles.pdf

It worked for me, but the problem is it converts the text.txt file into an image, so you can't highlight the text in the resulting pdf.
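One way around the text-becomes-an-image problem is to convert each input with a tool suited to its type and only merge at the end. The following is a rough sketch rather than a tested recipe: it assumes img2pdf, LibreOffice and pdftk are installed, and converter_for and merge_mixed are hypothetical helper names made up for this example.

```shell
# Hypothetical helper: choose a converter from the file extension.
converter_for() {
    case "$1" in
        *.pdf)                           echo copy ;;
        *.txt)                           echo libreoffice ;;
        *.jpg|*.jpeg|*.png|*.tif|*.tiff) echo img2pdf ;;
        *)                               echo unknown ;;
    esac
}

# Hypothetical helper: turn every input into its own PDF, then merge them.
# The body runs in a subshell "( ... )" so its variables stay local.
merge_mixed() (
    out=$1; shift
    work=$(mktemp -d) || exit 1
    mkdir "$work/lo"
    n=0
    for f in "$@"; do
        n=$((n + 1))
        part=$(printf '%s/part%04d.pdf' "$work" "$n")   # keeps the input order
        case "$(converter_for "$f")" in
            copy)    cp -- "$f" "$part" ;;
            img2pdf) img2pdf "$f" -o "$part" ;;
            libreoffice)
                # LibreOffice names its output after the input file
                libreoffice --headless --convert-to pdf --outdir "$work/lo" "$f" &&
                    mv -- "$work/lo/$(basename "${f%.*}").pdf" "$part" ;;
            *)       echo "skipping $f: unknown type" >&2 ;;
        esac
    done
    pdftk "$work"/part*.pdf cat output "$out"
    status=$?
    rm -rf -- "$work"
    exit "$status"
)
```

Usage would look like merge_mixed outputFileName.pdf image1.jpg image2.png text.txt PDFfile.pdf. Because the text file goes through LibreOffice instead of convert, its text should stay selectable in the result.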

Garrett
  • 720
  • 3
  • 13
  • 30
  • 5
    ImageMagick's convert reduces quality and increases size in my experience. Try pdftk, though I'm not sure how you add images there. – akostadinov Sep 22 '14 at 12:22
  • 6
    You can tweak the -quality flag to increase or decrease the resulting PDF file size. Example: convert -quality 50 image1.jpeg image2.jpeg image3.jpeg outputFileName.pdf – RajaRaviVarma Feb 04 '15 at 18:36
  • 6
    Be aware that convert uses Ghostscript under the hood, and gs will decode and re-encode JPEGs, which results in a loss of quality even if you specify a high quality. – tobltobs May 04 '15 at 09:28
  • 6
    I had to create more than 6000 PDFs starting from 30000 TIFFs. convert's estimated time was ~6-7 hours. I used tiffcp and tiff2pdf instead; they took a few seconds. – j.c May 04 '17 at 10:15
  • You can install it via sudo apt install imagemagick. Otherwise you don't have the convert command. – Melroy van den Berg Aug 13 '18 at 18:50
  • 10
    uh... wth?! When I run convert with two png files as input and "Test.pdf" as the output, I get this error: convert-im6.q16: not authorized `Test.pdf' @ error/constitute.c/WriteImage/1037. – Michael Dec 23 '18 at 22:24
  • 5
    @Michael, see https://stackoverflow.com/a/52661288/276052 – aioobe Dec 25 '18 at 14:57
  • @AlaaAli, you can take your resulting PDF and make it searchable again using my pdf2searchablepdf tool I wrote, which I describe here. It's a wrapper around the tesseract OCR engine. – Gabriel Staples Jan 08 '22 at 07:12
  • convert doesn't work for me. :( Example run: convert *.jpg my.pdf or convert pg-1.jpg pg-2.jpg out.pdf. They both produce the following error: convert-im6.q16: attempt to perform an operation not allowed by the security policy `PDF' @ error/constitute.c/IsCoderAuthorized/408. – Gabriel Staples Jan 08 '22 at 07:21
  • 2
    I found a solution for the convert-im6.q16: attempt to perform an operation not allowed by the security policy `PDF' @ error/constitute.c/IsCoderAuthorized/408. error! See https://stackoverflow.com/a/53180170/4561887 – Gabriel Staples Jan 08 '22 at 07:35
  • This doesn't work on text files at all. I get this error: convert-im6.q16: improper image header `text.txt' @ error/txt.c/ReadTXTImage/450. – Gabriel Staples Jan 08 '22 at 18:16
  • https://askubuntu.com/a/1127262/1191829 checkout this link for the error (works for images) – Lawhatre Jul 30 '22 at 14:30
  • do <policy domain="coder" rights="read|write" pattern="PDF" /> in /etc/ImageMagick-6/policy.xml ( https://linuxhint.com/convert-image-to-pdf-command-line/ ) – SL5net Oct 04 '22 at 11:22
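The "not authorized" and "security policy" errors quoted in the comments above come from ImageMagick's policy file, which on many Ubuntu installs disables the PDF coder. Here is a minimal sketch of the edit suggested in the last comment. It assumes your policy file contains a line with rights="none" pattern="PDF" (check yours first, the exact wording may differ), and allow_pdf_coder is a hypothetical helper name.

```shell
# Hypothetical helper: switch the PDF coder from "none" to "read|write".
# sed -i.bak edits in place and leaves the original next to it as *.bak.
allow_pdf_coder() {
    sed -i.bak 's#rights="none" pattern="PDF"#rights="read|write" pattern="PDF"#' \
        "${1:-/etc/ImageMagick-6/policy.xml}"
}
```

The system file is only writable by root, so the function has to run in a root shell to change it. The restriction was introduced as a security precaution, so it is probably wise to relax it only if you trust the files you feed to convert.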
41

Install pdftk

sudo apt-get install pdftk

Pdftk

If PDF is electronic paper, then pdftk is an electronic staple-remover, hole-punch, binder, secret-decoder-ring, and X-Ray-glasses. Pdftk is a simple tool for doing everyday things with PDF documents.

You can create PDF files from text or images with LibreOffice, then stitch these together with other PDF files:

pdftk 1.pdf 2.pdf 3.pdf cat output 123.pdf

It can also

  • Split PDF Pages into a New Document

  • Rotate PDF Pages or Documents

and a lot more besides
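For the splitting and rotating mentioned above, the calls look roughly like this. It is a sketch that assumes a reasonably recent pdftk (the east rotation keyword is what current releases use); the three function names are made up for illustration.

```shell
# Copy a page range (for example 1-3) into a new document.
extract_pages() { pdftk "$1" cat "$2" output "$3"; }

# Rotate every page 90 degrees clockwise ("east").
rotate_clockwise() { pdftk "$1" cat 1-endeast output "$2"; }

# Split a document into one file per page (pg_0001.pdf, pg_0002.pdf, ...).
split_pages() { pdftk "$1" burst; }
```

For example, extract_pages 123.pdf 1-3 first3.pdf would keep only the first three pages of the merged file from above.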

More details here: Ubuntu Geek: List of PDF Editing tools

Fei
  • 103
Warren Hill
  • 22,112
  • 28
  • 68
  • 88
21

Try PDF Chain:

PDF Chain is a graphical user interface for the PDF Toolkit (PDFtk). The GUI supports all common features of the command line tool in a comfortable way.


You can install it either from the default repos, or get the latest and greatest from PDF Chain PPA.

sudo apt-get install pdfchain

Or PDF Mod:

PDF Mod is a simple application for modifying PDF documents.

You can reorder, rotate, and remove pages, export images from a document, edit the title, subject, author, and keywords, and combine documents via drag and drop.

sudo apt-get install pdfmod


landroni
  • 5,941
  • 7
  • 36
  • 58
15

For multiple files inside a directory and its subdirectories with different extensions I couldn't find a neat answer, so here it is

convert -quality 85 $(find . -type f \( -name '*.png' -o -name '*.jpg' \) | sort -V) output.pdf

I used command substitution to pass the files returned by find as arguments to convert. The parentheses make -type f apply to both name patterns. sort -n didn't sort my files correctly, so I tried the -V option (version sort), which did the trick. Also make sure your file and directory names follow one consistent pattern so that they sort naturally: for example dir1, dir2, dir3, not dir1, dir_2, dir3.
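Plain command substitution splits on whitespace, so it breaks on file names containing spaces. A possible variant that passes NUL-separated names instead is sketched below. It assumes GNU find, sort and xargs (the defaults on Ubuntu), and the two function names are hypothetical.

```shell
# List PNG/JPG files below a directory in natural order, NUL-separated.
list_images_sorted() {
    find "${1:-.}" -type f \( -name '*.png' -o -name '*.jpg' \) -print0 | sort -zV
}

# Feed the sorted list to convert. The output name travels as "$0" of the
# inner sh, the image paths arrive as "$@".
images_to_pdf() {
    list_images_sorted "$1" | xargs -0 -r sh -c 'convert -quality 85 "$@" "$0"' "$2"
}
```

Call it as images_to_pdf . output.pdf. With a very large number of files xargs may split the list into several convert runs, each overwriting the output, so this is only suitable for lists that fit on one command line.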

Dante
  • 2,329
  • 5
  • 27
  • 35
5

This is the solution I used to convert multiple TIFFs to PDFs.

I had to create more than 6,000 PDFs starting from 30,000 TIFFs. convert's estimated time: 6 to 7 hours.
I used tiffcp and tiff2pdf instead; they took a few seconds.

$ tiffcp 1.tiff 2.tiff ... multi.tiff
$ tiff2pdf multi.tiff > final.pdf

This way is really fast because images are not converted, just packed.

Maybe there are some TIFF formats that don't work so easily, but for me it worked perfectly.
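For a batch job, the two steps above can be wrapped in a small function. This is only a sketch: tiffs_to_pdf is a hypothetical name, and it assumes GNU mktemp and the libtiff tools are installed.

```shell
# Hypothetical helper: pack several TIFFs into one multi-page TIFF, then
# wrap that in a PDF. Runs in a subshell so its variables stay local.
tiffs_to_pdf() (
    out=$1; shift
    tmp=$(mktemp --suffix=.tiff) || exit 1
    tiffcp "$@" "$tmp" && tiff2pdf -o "$out" "$tmp"
    status=$?
    rm -f -- "$tmp"
    exit "$status"
)
```

Example: tiffs_to_pdf final.pdf 1.tiff 2.tiff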

Hope it helps.

j.c
  • 238
4

Install Master PDF Editor. The tool can create, merge and extract PDF files. Check here for details about Master PDF Editor and installing it on Ubuntu

Fabby
  • 34,259
surendar
  • 243
  • 4
  • 7
  • 19
2

There is a series of utilities in the texlive-extra-utils package that wrap pdfjam. To join PDFs, use

pdfjoin -o out.pdf 1.pdf 2.pdf 3.pdf

Unlike convert, it works on the PDFs directly without converting them to images.

Also, on 18.04 LTS (Bionic Beaver) the pdftk package is not supported at the moment. I would recommend pdfjam to anyone who prefers the command line.
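On newer releases the pdfjoin wrapper may be missing, because later pdfjam versions no longer ship the wrapper scripts. Calling pdfjam directly should give much the same result. The sketch below assumes the options shown (which mirror what pdfjoin passed) exist in your pdfjam version; join_pdfs is a hypothetical name.

```shell
# Hypothetical helper: join PDFs with pdfjam itself, keeping each page's
# own paper size and rotating oversized landscape pages.
join_pdfs() (
    out=$1; shift
    pdfjam --fitpaper true --rotateoversize true --outfile "$out" "$@"
)
```

Example: join_pdfs out.pdf 1.pdf 2.pdf 3.pdf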

kubus
  • 164
  • The OP is asking how to convert image1.tif image 2.tif image 3.jpg to images.pdf – Lexible Apr 25 '19 at 17:12
  • Not exactly. OP also wants merge pdf files with images. It is better to create pdf files from images using img2pdf command, since it creates pdf containing original image, and then use pdfjoin. – kubus Apr 27 '19 at 11:28
  • pdftk in 18.04: https://askubuntu.com/a/1165823/925128 (CLI only, as pdfchain is not available in 18.04 afaik) – cipricus Dec 04 '19 at 12:34
  • pdfjoin will also create pdfs from images, e.g. pdfjoin -o images.pdf *.png – David C. Rankin Sep 09 '21 at 18:10
2

I use PDF-Shuffler for this kind of use, it works great.

sudo apt-get install pdfshuffler

It is a graphical tool. You simply load all the pdf files you want to fuse. You can change the page order as you wish.

cochisebt
  • 320
  • 2
  • 7
  • Can you include instructions on how to do what the OP wants? – Seth Jan 06 '14 at 04:12
  • It is done. :-) – cochisebt Jan 06 '14 at 04:24
  • 3
    I would downvote, but have not enough rep. PDF-Shuffler accepts only PDF files. Question also included image files and text files. – borisdiakur Jan 31 '14 at 12:17
  • With LibreOffice you can convert text files to PDF. It is also possible to insert image files in LibreOffice and then convert the result to PDF. Once everything is in PDF, PDF-Shuffler can do the job. But I don't think one piece of software can do the whole job at once. – cochisebt Jan 31 '14 at 15:24
  • As of 22.04, pdfshuffler (which has been renamed as "PDF Arranger") can do what the OP asked. I needed a multi-page PDF, consisting of two jpg images: just dragged them into a new blank canvas in PDF Arranger, saved as new PDF and voilà... – sxc731 Sep 11 '22 at 11:22
2

I can't believe nobody has mentioned latex (tex) yet. It is specifically designed for producing documents, and can combine text, images, and PDFs into a 'master' document (without any degradation of quality). It is a full suite of libraries and an extensible markup language, basically - it's been around since forever and highly used in the scientific community, still.

Technically, it's a typesetting language.

  • 1
    This seems more a comment than an answer... please review http://askubuntu.com/help/how-to-answer – Marcellinov May 14 '16 at 20:05
  • 3
    Probably because nobody expected the OP's question to be about creating a day job-level work flow. Talk about using a nuke to start a camp fire! (I am a TeX fan, BTW, but I would never use it for this purpose.) – Lexible Apr 25 '19 at 17:09
  • This is interesting. OP, could you please demonstrate it with some sample code? – Nav Feb 06 '21 at 04:15
2

Try LaTeX with pdflatex.

I had never used it before but it took me about 10 minutes to start making .PDFs with it and about 40 minutes to get them customized exactly as I wanted. I included the best formatting guides I found, at the end.

sudo apt-get install texlive

Basically you create one .tex file - for example hello.tex - with the LaTeX language, then run pdflatex hello.tex on that file and it will generate the PDF. The basics of the language can be found here: http://www.maths.tcd.ie/~dwilkins/LaTeXPrimer/


Here is a barebones example .tex file:

 \documentclass[a4paper,10pt]{article}

 \begin{document}

 {\footnotesize
 YOUR TEXT HERE

 YOUR TEXT HERE
 }

 \end{document}

Optional extra formatting:

To add images: https://www.sharelatex.com/learn/Inserting_Images

For different font sizes: https://www.sharelatex.com/learn/Font_sizes,_families,_and_styles

For different fonts: https://www.sharelatex.com/learn/Font_typefaces

To change page size and margins when using pdflatex: \usepackage[pass,paperwidth=148mm,paperheight=210mm,margin=5mm]{geometry}
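To combine a text file, an image and an existing PDF in one document, a master .tex file along these lines should work. This is a sketch: the graphicx, pdfpages and verbatim packages are assumed to be installed, the three input file names are placeholders, and write_master_tex is a hypothetical helper that just writes the file.

```shell
# Hypothetical helper: write a master document that pulls in a text file,
# an image and an existing PDF. The three file names are placeholders.
write_master_tex() {
    cat > "${1:-master.tex}" <<'EOF'
\documentclass[a4paper,10pt]{article}
\usepackage{graphicx}  % \includegraphics for images
\usepackage{pdfpages}  % \includepdf for existing PDFs
\usepackage{verbatim}  % \verbatiminput for plain text files
\begin{document}
\verbatiminput{text.txt}
\clearpage
\includegraphics[width=\textwidth]{image1.jpg}
\includepdf[pages=-]{PDFfile.pdf}
\end{document}
EOF
}
```

Then run write_master_tex master.tex followed by pdflatex master.tex in the directory that holds the three input files. The text stays selectable, and pages=- pulls in every page of the existing PDF.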

2

Adding to the community answer above, you can run convert $(ls -tr *.jpg) output.pdf to force the PDF to have the images in chronological order (oldest first). Note that this breaks on file names containing spaces.

1

Using Gimp, import as layers, export as pdf:)

Gimp version: 2.10.8

Spyros
  • 42
  • This is the best answer in my case because one of the images was itself a PDF that required additional processing and convert was unable to properly, uh, convert it. I didn't expect GIMP to handle PDFs so well. Thank you! – Andy Mikhailenko May 24 '21 at 19:49
  • How is that done exactly? There is no gui option to export as pdf but the extension can be modified manually. At 260 pages as tif files it crashed though. – cipricus Jan 17 '22 at 13:20
  • In the menu, under File, there is an Export option as well as an Export As... one. Version: 2.10.30 – Spyros Jan 17 '22 at 19:01
1

1. Images to PDF

A tool I wrote called pdf2searchablepdf can combine many images into a single PDF. It is particularly good if you want the final PDF to have searchable text in it, as my tool performs OCR (Optical Character Recognition) on the images using a program called tesseract in order to bundle them into a single PDF.

Installation instructions are here: https://github.com/ElectricRCAircraftGuy/PDF2SearchablePDF#install

Since pdf2searchablepdf is a wrapper around tesseract, it accepts any image format supported by tesseract, which includes bmp, pnm, png, jfif, jpeg/jpg, and tiff. Gif is not supported. See https://coptr.digipres.org/index.php/Tesseract-ocr:

Any image readable by Leptonica is supported in Tesseract including BMP, PNM, PNG, JFIF, JPEG, and TIFF. GIF is not supported http://www.leptonica.com/library-overview.html.

To convert all images into a PDF, they need to be all in the same folder and with nothing else in that folder. So, assuming you have img1.jpg, img2.jpg, and image3.jpg, you could do this:

# Create an `images` dir and move all images into it
mkdir -p images
mv *.jpg images  # use `cp` instead of `mv` to copy instead of move the images

Now combine all of these images into 1 pdf

pdf2searchablepdf images

That's it! You'll now have a searchable PDF file called images_searchable.pdf in the directory you were in when you ran the pdf2searchablepdf command.

Note: to go the opposite direction and convert a PDF file into a bunch of image files, I like to use pdftoppm as I explain here.

To convert a non-searchable pdf named input.pdf into a searchable pdf named input_searchable.pdf, do:

pdf2searchablepdf input.pdf

See pdf2searchablepdf -h for the full help menu, including options and other examples.

2. Text to PDF

See: https://stackoverflow.com/questions/20129029/a-light-solution-to-convert-text-to-pdf-in-linux/20129300#20129300

3. PDF to single PDF

See: https://stackoverflow.com/questions/2507766/merge-convert-multiple-pdf-files-into-one-pdf/11280219#11280219

  • @nsandersen, which first step? mkdir -p images? sudo apt update? Please be more specific so I can help you. Did you follow my pdf2searchablepdf installation instructions? – Gabriel Staples Aug 05 '22 at 16:00
  • @nsandersen, yes, please open an issue here to explain in detail your OS, your steps, the problem, and to paste all output and error messages: https://github.com/ElectricRCAircraftGuy/PDF2SearchablePDF/issues. We can continue to debug it there. – Gabriel Staples Aug 09 '22 at 15:34
0

For a multi-page PDF:
Convert all files to PDF, then join them using a PDF writer, e.g. pdftk, pdfill, Microsoft Print to PDF, CutePDF, etc.

For a single-page PDF:
Convert all files to images, e.g. PNG, named in sequence. Then join them into one page with an image converter, e.g. imgconv

imgconv.exe -append *.png out2.pdf (for vertical)
imgconv.exe +append *.png out2.pdf (for sideways)

if you're not on Win 10 WSL, load ImageMagick:

sudo apt-get install imagemagick

The convert tool, installed as part of ImageMagick, combines them into one PDF:

convert "*.{png}" -quality 100 combined.pdf

If you already have single page pdfs, merge them with pdftk:

pdftk *.pdf cat output combined.pdf
Zimba
  • 111
  • 2
0

To elaborate on @Veles' answer -

Using GIMP 2.10.30 -

  1. File -> Open as Layers...
  2. Select all images and click Open
  3. File -> Export As...
  4. Edit name with extension as .pdf
  5. Select 'Layers as pages (top layers first)'
  6. Select 'Reverse the pages order'
  7. Export