
As the subject says, I'm looking for a way to delete duplicate files automatically. I know I could run fdupes -rdN, but that automatically keeps the first file in each set while deleting the second.

I'm asking because my dupes are listed similar to this:

/Documents/123Testfile.txt
/Documents/Testfile.txt

/Documents/84579875blahblahblahSecondTestfile.txt
/Documents/SecondTestfile.txt

I checked the man pages, and either I missed it or there isn't an option to automatically delete the first file instead. Does anyone know of a way to make this happen?
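
For reference, this is roughly what the default does (a sketch, with made-up paths):

$ fdupes -rdN ~/Documents
# -dN keeps the first file of each duplicate set and deletes the rest,
# so here 123Testfile.txt would be kept and Testfile.txt removed,
# which is the reverse of what I'm after.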

Thanks in advance

NCGT

1 Answer


I had a similar problem, and just ended up writing a small POSIX script. The general idea is to preserve the 'deeper' file; in order to do this, you reverse the sort order and only delete files that are 'higher' (shallower) than the current deepest. For posterity, this is the current iteration of the script:

#!/bin/sh
# remove higher order duplicate files
# e.g. ./a vs ./dir/a --> ./a is deleted
# equal order duplicates will be ignored
# usage: fdupes -ri dir/ | $0

depth() {
    echo $(echo "$1" | grep -o '/' | wc -l)
}

deepest=""
while IFS= read -r line; do
    if [ -z "$line" ]; then
        deepest=""
    else
        if [ $(depth "$line") -gt $(depth "$deepest") ]; then
            # new deepest
            deepest="$line"
        else
            if [ $(depth "$line") -lt $(depth "$deepest") ]; then
                rm "$line"
            fi
        fi
        # ignore equal depth
    fi
done

Given your example directory structure, I assume that this will be applicable, notwithstanding the title of your question.

vsjdui