99

What can I use to find duplicate photos, including photos that have been resized?

Zanna
ændrük

7 Answers

83

digiKam

Add all the photos to your collection. In the menu, select Tools / Find duplicates. This will look for duplicates across your whole collection.

findimagedupes

A command line tool. Pass all the images you want to compare on the command line.

Geeqie (formerly GQview)

In the menu, select File / Find duplicates. Drag and drop image files onto the duplicates window. You can drop directories to add their contents recursively. For visual comparison of images, choose one of the similarity options in the drop-down menu; they are not the default. The "custom" level of similarity can restrict pairings to only the highest degree of similarity, but it has to be set to 99 in Preferences. Even then, it does not work perfectly for some kinds of images, such as line art. Unfortunately, it has no automatic selection mechanism based on rational criteria such as resolution or date; the automatic selection seems to simply keep the first image found as the reference. Deleting many images can be extremely slow, because it tries to update the result count after every delete.


All three of these tools find visual duplicates, not just files that are identical byte for byte.

  • 8
    I found that Geeqie works the best. It has a robust set of search modes (name, checksum, size, etc.), powerful image similarity scanning, detailed info on found duplicates, a simple UI, and no need to add images to a collection or album first. My only complaint is that the duplicate finder is hidden under the File menu and you have to drag and drop from Nautilus (or another file manager) to add images or folders to be searched. Other than that, it gets the job done and does it well. – japzone Feb 25 '13 at 03:40
  • 3
    Geeqie can find similar images and it works pretty well, but I found it a bit slow for exact matches, and it is tedious to remove many duplicates with it. – Wernight Jul 15 '13 at 07:16
  • Another choice, which seems to work rather well, is this tool, also called findimagedupes, which is unrelated to the tool hosted on Sourceforge. – Winny Jan 12 '16 at 19:10
  • Digikam has an amazing duplicate finding interface. I highly recommend. – wbkang Jul 15 '18 at 15:36
  • 1
    @wbkang How to "batch" remove multiple duplicates in Digikam? How to ensure only the lower (or equal) resolution duplicate is deleted in this exercise? – nutty about natty Feb 01 '20 at 21:45
  • findimagedupes works well. – Digger Jun 01 '20 at 21:37
  • I just noticed an issue where Digikam never actually deletes the duplicates from the original location. For example, if you point it to an image folder that you want to search for duplicates in, digikam will find the duplicates, but if you try to delete them it will only move it to an internal "Trash" folder, and NOT actually delete the original file. – Raleigh L. Nov 22 '22 at 22:14
22

FSlint

FSlint is a graphical program that finds duplicate files of any type by comparing their MD5 checksums (md5sum). Images that are not byte-for-byte identical won't be flagged as duplicates. The image below shows a bunch of duplicate PDF files in my Downloads directory:

[Screenshot: FSlint listing duplicate PDF files in the Downloads directory]

You can change the advanced search parameters to search by file type and restrict the search to images only. This is done by entering find command options under "extra find parameters". For example, here I am only looking for *.jpg files (in the same path, only my "Downloads" folder):

[Screenshot: FSlint with extra find parameters restricting the search to *.jpg files]

fdupes

fdupes is an equivalent command-line based tool. Both are available in the repos.
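As a rough illustration of what checksum-based duplicate detection involves, here is a minimal Python sketch. It is not how FSlint or fdupes are implemented; the function names `file_md5` and `find_exact_duplicates` and the default extension list are assumptions made up for this example. It groups files whose MD5 digests match, so it only catches byte-for-byte copies, not resized photos.

```python
import hashlib
import os
from collections import defaultdict


def file_md5(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_exact_duplicates(root, extensions=(".jpg", ".jpeg", ".png")):
    """Group files under root (recursively) that are byte-for-byte identical.

    The extension filter is a hypothetical default; the match is
    case-insensitive.
    """
    by_digest = defaultdict(list)
    for folder, _subdirs, names in os.walk(root):
        for name in names:
            if name.lower().endswith(extensions):
                path = os.path.join(folder, name)
                by_digest[file_md5(path)].append(path)
    # Keep only digests shared by two or more files.
    return [sorted(paths) for paths in by_digest.values() if len(paths) > 1]
```

Calling `find_exact_duplicates` on a photo folder returns a list of groups, each group being the paths of identical files. Nothing is deleted; deciding which copy to keep is left to you.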

chb
John Lyon
  • 15
    Note that I doubt these programs will find resized duplicates. – Vadim Peretokin Oct 09 '12 at 23:51
  • @Vadi that's a different, and more complicated question. Tineye does image identification which doesn't rely on metadata, hashes, etc (it can identify similar looking images) but that's an online service. They provide an API but I'm not aware of any applications that take advantage of this yet. The other complication is that you wouldn't want to remove similar images all the time, for example if you edit photos but want to keep copies of the originals. Removing identical duplicates is much safer. – John Lyon Oct 10 '12 at 00:03
  • 5
    The OP explicitly states "including photos that have been resized" so this is not an answer. – Calimo Jan 01 '19 at 14:19
7

fdupes

You can use a command line tool called fdupes to find duplicate files (see man fdupes for more details). I don't know of any way to find 'duplicates' that have been resized. A program that did this would require some sort of intelligent algorithm that analyzed the image contents, because when an image is resized its data is changed, so traditional duplicate finding methods would not work.
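To make the idea of "analyzing the image contents" concrete, here is a minimal sketch of one well-known approach, an average hash. It is an illustration only, not the algorithm used by any of the tools mentioned on this page. All names (`shrink`, `average_hash`, `hamming_distance`, `split_image`) are made up for the example, and images are plain lists of grayscale rows; a real script would load pixels with an image library.

```python
def shrink(pixels, size=8):
    """Downscale a grayscale image (list of rows) to size x size by block averaging."""
    height, width = len(pixels), len(pixels[0])
    small = []
    for r in range(size):
        top, bottom = r * height // size, (r + 1) * height // size
        row = []
        for c in range(size):
            left, right = c * width // size, (c + 1) * width // size
            block = [pixels[y][x] for y in range(top, bottom) for x in range(left, right)]
            row.append(sum(block) / len(block))
        small.append(row)
    return small


def average_hash(pixels, size=8):
    """Return a size*size-bit integer: one bit per cell, set if the cell is brighter than the mean."""
    cells = [value for row in shrink(pixels, size) for value in row]
    mean = sum(cells) / len(cells)
    bits = 0
    for value in cells:
        bits = (bits << 1) | (value > mean)
    return bits


def hamming_distance(a, b):
    """Count the bits that differ between two hashes."""
    return bin(a ^ b).count("1")


def split_image(width, height, vertical=True):
    """Synthetic image for the demo: dark on one half, bright on the other."""
    return [
        [255 if (x >= width // 2 if vertical else y >= height // 2) else 0
         for x in range(width)]
        for y in range(height)
    ]


original = split_image(64, 64)
resized = shrink(original, 32)                       # half-size copy of the same picture
different = split_image(64, 64, vertical=False)      # a different picture

print(hamming_distance(average_hash(original), average_hash(resized)))    # 0
print(hamming_distance(average_hash(original), average_hash(different)))  # 32
```

The resized copy has entirely different file bytes, so a checksum comparison would miss it, yet its hash matches the original. With real photos the hashes of resized copies usually differ by a few bits, so you would treat a small Hamming distance as a probable duplicate; the right threshold is something you would have to tune.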

To install fdupes in all currently supported versions of Ubuntu, open the terminal and type:

sudo apt install fdupes
karel
dv3500ea
  • fdupes will also miss duplicates in different directories; let's say you have two copies of a photo one in folder birthday-party/ and the other in family-stuff/ ... "fdupes -fr ." will miss this duplicate. – lrkwz Mar 11 '14 at 22:57
  • 3
    fdupes does not handle duplicates that have been resized, nor changes in metadata. – Calimo Jan 01 '19 at 14:21
7

dupeGuru Picture Edition works absolutely great, and is worth trying.

They have a Launchpad PPA. Either dupeguru (the new all-in-one package) or dupeguru-pe (the old Picture Edition package) can be installed from it using these commands:

sudo add-apt-repository ppa:hsoft/ppa
sudo apt-get update
sudo apt-get install dupeguru
Byte Commander
tuxflo
  • 1
    Looks like dupeGuru now has no separate editions. It works well, though UI could be better. It's also available in AUR if you use Arch. – user31389 Nov 11 '16 at 18:17
  • This is heavily outdated now: the last version in the PPA is for xenial, and the website doesn't work anymore... – jaromrax Apr 09 '21 at 17:14
7

imgSeek

imgSeek can find duplicates as well as similar pictures (so it should be able to find resized photos and photos with different filenames and metadata) and even search photos based on a sketch. It is available in desktop and server versions.

I haven't actually tried it myself, though.

lofidevops
4

I've written this Python script to find visually similar images, and delete all but the largest one.

It uses findimagedupes internally, to find the duplicate images.

It can be invoked with the -d and -r options for your use-case which would:

  • Not delete (the lower-sized visually-similar) files.
  • Output a "dups.txt" file which would contain the duplicate (visually-similar to be precise) files.

https://github.com/AnirudhKishan/DeleteVisuallyRedundant

Anirudh
2

Visipics

Visipics is a free Windows application for this purpose, but it works just fine on Linux via Wine. It's better than geeqie/gqview at sorting the duplicates (geeqie's results are absolutely "un-sortable").

You can tell it to auto-select images based on criteria such as smaller file size, non-compressed type, or lower resolution, and even to prioritize folders (although folder priority is applied last). It won't select by the opposite criteria, though; you'd need to do that manually, which wouldn't be much better than doing it in geeqie, except that the selection doesn't require holding Shift/Ctrl.

You must pay attention to symbolic links, though: it can "randomly" choose to keep a symbolic link to a file while deleting the actual file as a "copy". That's a shame.

the dsc