I use UbuntuStudio 20.04.

I want to know whether there is a native Linux app (with a GUI) that eliminates duplicate files based not only on their names but on their content.

I mean comparing the length and the actual bytes to detect duplicated files even if their names aren't the same (for example, the same video with different names in different folders).

Is there something like that? Where?

Juan
  • 1,797

4 Answers

dupeGuru

Take a look at dupeGuru, which is a GUI tool.

dupeGuru is a cross-platform (Linux, OS X, Windows) GUI tool to find duplicate files in a system. It’s written mostly in Python 3 and has the peculiarity of using multiple GUI toolkits, all using the same core Python code. On OS X, the UI layer is written in Objective-C and uses Cocoa. On Linux & Windows, it’s written in Python and uses Qt5.

Pablo Bianchi
  • 15,657

fdupes

I don't know of a GUI program, but there is a powerful CLI tool: fdupes.

Searches the given path for duplicate files. Such files are found by comparing file sizes and MD5 signatures, followed by a byte-by-byte comparison.
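
The size-then-checksum idea in that description can be approximated with standard GNU tools. This is only a rough sketch, not how fdupes itself is implemented: the function name find_dupes is made up for illustration, it assumes GNU find, xargs, md5sum and uniq, it assumes paths without newlines, and it skips the final byte-by-byte comparison (cmp could be used for that step).

```shell
# find_dupes DIR: print groups of files under DIR that share an MD5 checksum.
# Hypothetical helper for illustration only; it never deletes anything.
# Assumes GNU tools and file paths that contain no newline characters.
find_dupes() {
    local dir=$1
    # Stage 1 (find + awk + cut): emit "size<TAB>path" for every non-empty
    #   regular file and keep only paths whose size occurs more than once.
    # Stage 2 (md5sum + sort + uniq): hash those candidates and print only
    #   lines whose first 32 characters (the checksum) repeat, one group
    #   per block, with a blank line between groups.
    find "$dir" -type f -size +0 -printf '%s\t%p\n' |
        awk -F'\t' '{ count[$1]++; line[NR] = $0; size[NR] = $1 }
                    END { for (i = 1; i <= NR; i++)
                              if (count[size[i]] > 1) print line[i] }' |
        cut -f2- |
        tr '\n' '\0' |
        xargs -0 -r md5sum |
        sort |
        uniq -w32 --all-repeated=separate
}
```

Calling find_dupes /path/to/adir lists each group of matching files; comparing sizes first means that files with a unique size are never hashed at all.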

Source code on GitHub
Ubuntu man page

Install

sudo apt install fdupes

Usage

# List dupes in an entire directory
# -r : recursive
fdupes -r /path/to/adir
# List and delete dupes in a dir
# -r : recursive
# -d : prompt for which files to preserve in each set, delete the others
#      (add -N to preserve the first file of each set without prompting)
fdupes -rd /path/to/adir

Related: Where is fslint (duplicate file finder) for Ubuntu 20.04?

Pablo Bianchi
  • 15,657
cmak.fr
  • 8,696

I found a list of some GUI duplicate finders while doing searches and thought I would add to this since this page had pretty decent SEO rankings in my searches.

Note that these should all be distro-agnostic (i.e. they should work on Ubuntu as well as any other distro), but I have not tested them. To the best of my knowledge, the listed apps all meet the requirements mentioned in the OP (i.e. they are native Linux apps with GUIs, and from their descriptions they should all work as duplicate detectors). I have tried to keep my notes fairly distro-agnostic as well.

The link contains a table which attempts a feature comparison. I am not going to rip off the table, but I understand that people like answers to contain more than just a link, so I will also list the apps from it along with some of my own notes:

  • DupeGuru: Same thing suggested in the other answer, this is probably what I would recommend for most users as well (especially for those on Ubuntu as it has an official PPA for Ubuntu Focal/Bionic/Xenial at time of writing or deb files on their GitHub releases page). If you are on a non-Ubuntu/Debian-based distro, there is a tar file or you can build from source.
  • Rmlint-GUI: Some pages I saw mentioned that this was formerly called "Shredder". I've seen a lot of good mentions online but have not tried it yet myself. I had found this documentation link prior to finding the linked comparison table. I'm not on an Ubuntu PC at the moment, but according to repology it should be in the central repos for most distros, including Ubuntu, Debian, Arch, Manjaro, Fedora and many others. It does not appear to have any precompiled binary releases on GitHub but can be built from source.
  • Detwinner: I don't really know anything about this one. One interesting thing from their GitHub readme is that it supports "searching and removing duplicate files and similar images". It has a flatpak available here, but it does not appear to be in any distro's central repos (per repology), nor does it appear to offer snaps or precompiled binaries. So if flatpaks aren't an option, building from source may be the only other route.
  • Czkawka: Neat Rust-based GUI, but it might be a problem if you only want a report without actually deleting the duplicates (see the note and linked issue below; I did not verify this issue myself, just passing along the warning). According to this page, it should be available via snap, flatpak, AppImage, Rust's cargo package manager, PPA, AUR (for Arch users), precompiled binaries on GitHub, or building from source.
  • FSlint: This is deprecated in Ubuntu 20.04 and later due to Python 2 dependencies, but workarounds exist if you are OK with installing deprecated packages or using an unofficial snap. I do not recommend this for most users, unless a new Python 3 fork comes out. I wouldn't hold my breath, as the dev (pixelb) has already stated "I don't really have time for [continuing to maintain fslint] TBH." and recommended czkawka as a more modern replacement.

However, I think at least one of the linked projects may require manually compiling, unless you are ok with flatpaks and the like.

There's also at least one issue to be aware of: a user in the linked thread reported that "Czkawka will just delete all copies of a file without warning, unlike FSLint". One page I came across did mention a "Preview" button, so it is possible that there are different actions and only the preview makes no changes. It is therefore not clear to me from the comments whether this was a bug or unexpected behavior of the program, or a misunderstanding on the part of the user. Either way, if you opt to try Czkawka, it seems prudent to run some tests on data you don't care about first.
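
One way to run such a test is to build a small throwaway directory with a known duplicate pair and point the tool at that. A minimal sketch follows; the function name make_dupe_sandbox and the file names are invented for illustration and are not tied to any of the tools above.

```shell
# make_dupe_sandbox: create a temporary tree with one known duplicate pair
# and one unrelated file, then print the path of the tree.
# Hypothetical helper; the names below are arbitrary examples.
make_dupe_sandbox() {
    local root
    root=$(mktemp -d) || return 1
    mkdir -p "$root/videos" "$root/backup"
    printf 'same content\n' > "$root/videos/clip.txt"
    # Same bytes, different name and folder: this is the duplicate.
    cp "$root/videos/clip.txt" "$root/backup/renamed-clip.txt"
    # Different bytes: a correct tool should leave this file alone.
    printf 'different content\n' > "$root/videos/other.txt"
    printf '%s\n' "$root"
}

sandbox=$(make_dupe_sandbox)
echo "Point the duplicate finder at: $sandbox"
# After running the tool, list what is still there:
find "$sandbox" -type f | sort
```

If the tool behaves as hoped, exactly one of clip.txt and renamed-clip.txt should remain afterwards, together with other.txt.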

Pablo Bianchi
  • 15,657
zpangwin
  • 273

rmlint

rmlint is a great tool written in C. It won't remove any file directly; instead it generates a script, rmlint.sh, tailored to the scan results, which offers several parameters and performs the removal when you run it.

Examples:

# Search for duplicate files over 10MB with a progress bar
rmlint -T "df" --size 10M -g

Start the optional graphical frontend to rmlint, called Shredder:

rmlint --gui

rmlint finds space waste and other broken things on your filesystem and offers to remove it. It is able to find:

  • Duplicate files & directories.
  • Nonstripped Binaries
  • Broken symlinks.
  • Empty files.
  • Recursive empty directories.
  • Files with broken user or group id.

Another great alternative is Czkawka.

Pablo Bianchi
  • 15,657