2

How to remove all files with a pattern except the recently created file

ls -l
total 655748
-rw-r--r-- 1 jetty jetty 120579643 May 10 19:59 2023_05_10.jetty.log
-rw-r--r-- 1 jetty jetty 115205809 May 11 19:59 2023_05_11.jetty.log
-rw-r--r-- 1 jetty jetty 102921116 May 12 19:59 2023_05_12.jetty.log
-rw-r--r-- 1 jetty jetty  85266768 May 13 19:59 2023_05_13.jetty.log
-rw-r--r-- 1 jetty jetty  97032182 May 14 19:59 2023_05_14.jetty.log
-rw-r--r-- 1 jetty jetty 117095164 May 15 19:59 2023_05_15.jetty.log
-rw-r--r-- 1 jetty jetty  33339025 May 16 04:13 2023_05_16.jetty.log

3 Answers

2

Assuming you actually need the file creation time, you'd need Ubuntu 20.10 or newer, as before that GNU coreutils tools couldn't access the file creation time.

But on 20.10 and newer, you can use:

find some/directory -type f -exec stat --format '%W' {} \; -printf "%p\0" |
  sort -zn |
  sed -z 's/^[0-9.]*\n//; $d' |
  xargs -r0 echo rm --
  • The entire pipeline uses null-delimited data, which is the safest way of handling filenames
  • GNU find can't access the birth time on Linux yet, so we use stat to print the birth time, and then use find to print the filename with the null character to delimit it
  • Then we sort the data numerically (so based on the creation time)
  • Then we remove the birth time prefix, leaving us with just the filenames, and also remove the last record, which is the newest file
  • Then we use xargs with rm to delete all these files. The command above uses echo rm to show the commands that will be run — remove the echo to actually run rm and delete the files.
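
If the modification time is good enough (it is what ls -l shows), the same null-delimited approach can be tried without stat, because GNU find can print the modification time itself with -printf '%T@'. The following is only a sketch, assuming GNU find, sort, sed and xargs; the temporary directory, the sample file names and the -name pattern are made up for illustration:

```shell
# Throwaway directory with three sample logs (names and dates are illustrative only).
demo=$(mktemp -d)
touch -d '2023-05-14 19:59' "$demo/2023_05_14.jetty.log"
touch -d '2023-05-15 19:59' "$demo/2023_05_15.jetty.log"
touch -d '2023-05-16 04:13' "$demo/2023_05_16.jetty.log"

# Each record is "<mtime> <path>\0". Sort oldest first, drop the last (newest)
# record, strip the timestamp, and show the rm command instead of running it.
would_run=$(
  find "$demo" -maxdepth 1 -type f -name '*.jetty.log' -printf '%T@ %p\0' |
    sort -zn |
    sed -z '$d; s/^[^ ]* //' |
    xargs -r0 echo rm --
)
printf '%s\n' "$would_run"
```

As with the command above, the echo keeps this a dry run; removing it would make xargs actually call rm.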
muru
1

This problem can be solved with the ls option -t, which lists files by modification time, most recent first. The option -p appends a / to directory names, which lets us filter out any subdirectories that may reside in the current directory.

With my testing, this would then look like:

[11:55:08] /home/jaska/src/foo/> ls -tp 
baz.log  bar/  bar.log  foo.log  foo  barbaz

Next we need to use grep to get rid of the unwanted directory:

[11:55:16] /home/jaska/src/foo/> ls -tp | grep -v '/$'                          
baz.log
bar.log
foo.log
foo
barbaz

Then we have to declare the pattern, again using grep to do so:

[11:55:28] /home/jaska/src/foo/> ls -tp | grep -v '/$' | grep '\.log$'
baz.log
bar.log
foo.log

Because they're already in the correct order (newest first), we use tail -n +2 to skip the first line and print everything from the second line onwards:

[11:55:21] /home/jaska/src/foo/> ls -tp | grep -v '/$' | grep '\.log$' | tail -n +2
bar.log
foo.log
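
To see what tail -n +2 does in isolation, here is a small sketch that feeds it a fixed newest-first list instead of real ls output. The file names are just the sample names from this answer:

```shell
# Newest-first list, in the same order as the ls -t output above (sample names).
listing=$(printf '%s\n' baz.log bar.log foo.log)

# tail -n +2 starts printing at line 2, so only the first (newest) entry is skipped.
older=$(printf '%s\n' "$listing" | tail -n +2)
printf '%s\n' "$older"
```

Changing +2 to +4 would keep the three newest entries instead of one.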

And then finally we delete the specified files:

[11:52:05] /home/jaska/src/foo/> ls -tp | grep -v '/$' | grep '\.log$' | tail -n +2 | xargs -I {} rm -- {}
[12:01:59] /home/jaska/src/foo/> ll
total 4
drwxrwxr-x 2 jaska jaska 4096 May 16 11:53 bar/
-rw-rw-r-- 1 jaska jaska    0 May 16 11:48 barbaz
-rw-rw-r-- 1 jaska jaska    0 May 16 11:54 baz.log
-rw-rw-r-- 1 jaska jaska    0 May 16 11:50 foo

You may notice that the only remaining .log file is baz.log, which was kept because it was the most recently modified.

To recap, the full command to use is:

ls -tp | grep -v '/$' | grep '\.log$' | tail -n +2 | xargs -I {} rm -- {}

You may want to tinker with the second grep part to fit just certain log files, but from the listing you gave us, that should be sufficient.
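
Parsing ls output breaks on file names that contain newlines, so a glob-based loop may be a sturdier alternative. This is a minimal sketch, assuming a shell whose test supports -nt (bash and dash both do); keep_newest is a hypothetical helper name, and it only prints the rm commands rather than running them:

```shell
# keep_newest FILE...: dry run; prints an rm command for every regular file
# in the argument list except the most recently modified one.
keep_newest() {
  local newest f
  newest=''
  for f in "$@"; do
    [ -f "$f" ] || continue
    # -nt is true when the left file has a newer modification time.
    if [ -z "$newest" ] || [ "$f" -nt "$newest" ]; then
      newest=$f
    fi
  done
  for f in "$@"; do
    if [ -f "$f" ] && [ "$f" != "$newest" ]; then
      echo rm -- "$f"
    fi
  done
}

# Demo in a throwaway directory (sample names only).
demo=$(mktemp -d)
touch -d '2023-05-16 11:48' "$demo/foo.log"
touch -d '2023-05-16 11:50' "$demo/bar.log"
touch -d '2023-05-16 11:54' "$demo/baz.log"
mkdir "$demo/dir.log"   # a directory matching the glob is ignored
plan=$(cd "$demo" && keep_newest ./*.log)
printf '%s\n' "$plan"
```

The glob passed to the function takes the place of the second grep, so narrowing the match is a matter of changing ./*.log.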

Credit goes to, and a full explanation of all parts can be found at: https://stackoverflow.com/questions/25785/delete-all-but-the-most-recent-x-files-in-bash

  • Nice use of ls -p to identify and later weed out any directories! I would not bother with grep \.log, though, and would instead do ls -tp *.log. It may also be pertinent to use ls -c rather than ls -t, so that the files are listed in order of status-change time (ctime) rather than modification time; note that ctime is not the same as creation time.

    In OP's specific case, a shorter working version is ls -c *.log | tail -n +2 | xargs rm.

    – Jivan Pal May 16 '23 at 20:23
  • Very good recommendations, thanks! – Janne Jokitalo May 17 '23 at 12:14
1

Note # 1

What you actually see in the output of ls -l is the file's last modification date and not the creation date.

find has pattern matching (globbing) such as -name "*jetty.log", and it has -mtime to match by days since last modification ... So, for example:

find . -mtime +0 -type f -name "*jetty.log"

To find matching files modified more than 24 hours ago ... use +1 for more than 48 hours, and so on.

And it has -mmin for minutes modified ... So, for example:

find . -mmin +10 -type f -name "*jetty.log"

To find matching files modified more than 10 minutes ago.

And it has the -delete action to delete matched files ... So, for example:

find . -mtime +9 -type f -name "*jetty.log" -delete

To find matching files modified more than 10 days ago and delete them.

You can restrict the search to the current folder's first level only by adding -maxdepth 1 to prevent matching files in sub-directories.
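
Before using -delete, it may be worth checking what the tests match by printing instead. A small sketch, assuming GNU find and GNU touch (for the relative dates); the directory and file names are invented for the demo:

```shell
# Throwaway directory: one old log, one recent log, one old log in a subdirectory.
demo=$(mktemp -d)
mkdir "$demo/sub"
touch -d '12 days ago' "$demo/old.jetty.log" "$demo/sub/nested.jetty.log"
touch -d '2 days ago'  "$demo/recent.jetty.log"

# -print instead of -delete: list what would be removed.
# -maxdepth 1 keeps find out of sub/, so nested.jetty.log is not listed.
matches=$(cd "$demo" && find . -maxdepth 1 -mtime +9 -type f -name '*jetty.log' -print)
printf '%s\n' "$matches"
```

Once the printed list looks right, -print can be swapped for -delete.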

Note # 2

I noticed that your filenames begin with a date pattern e.g. 2023_05_10... ... If that is consistent and you want to match by those dates, then you can use pattern matching like, for example:

find . -maxdepth 1 -type f -name "2023_05_1[0-4].jetty.log"

To match files with 2023_05_10 to 2023_05_14 in their names.
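
The bracket expression can be checked against a set of empty test files. A sketch assuming GNU find; the files are created in a temporary directory and only mimic the names from the question:

```shell
# Create empty files named like the logs in the question, days 10 to 16.
demo=$(mktemp -d)
for day in 10 11 12 13 14 15 16; do
  touch "$demo/2023_05_$day.jetty.log"
done

# [0-4] matches exactly one character in that range, so days 10 to 14 match.
matched=$(cd "$demo" && find . -maxdepth 1 -type f -name '2023_05_1[0-4].jetty.log' | sort)
printf '%s\n' "$matched"
```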

As these date patterns in the filenames can be easily parsed into actual dates, you can also do that in bash like, for example:

#!/bin/bash

days=5 # Set the number of days from now for a date to be considered as recent

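for f in *.jetty.log
do
  f1="${f%%.*}"                          # Strip everything from the first dot: 2023_05_10
  f1="${f1//_/-}"                        # Replace underscores with dashes: 2023-05-10
  fd="$(date -d "$f1" '+%s')"            # Date from the filename, in seconds since the epoch
  td="$(date -d "- $days days" '+%s')"   # Cut-off time, in seconds since the epoch
  [ -f "$f" ] && [ "$fd" -lt "$td" ] && echo rm -- "$f"
done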

To match and print files with dates in their names older than 5 days ... If the output is what you want, then remove echo and rerun it to delete those files.
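
The filename-to-date conversion can be tried on a single name first. A minimal sketch, assuming GNU date; tr is used here so the snippet also runs in plain sh, and the variable names are only illustrative:

```shell
f='2023_05_10.jetty.log'      # sample name from the listing in the question
stamp=${f%%.*}                # strip everything from the first dot: 2023_05_10
# Turn underscores into dashes; in bash, ${stamp//_/-} gives the same result.
iso=$(printf '%s' "$stamp" | tr '_' '-')
# GNU date parses the ISO-style date; -u pins the demo to UTC.
epoch=$(date -u -d "$iso" '+%s')
printf '%s -> %s -> %s\n' "$f" "$iso" "$epoch"
```

A name that does not start with a valid date makes date fail, so a real script would probably want to skip such files.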

Note # 3

If what you mean is actually the creation/birth date of the file, then you can do that in bash with for example:

#!/bin/bash

days=5 # Set the number of days from now for a date to be considered recent

for f in *.jetty.log
do
  cd="$(stat -c '%w' "$f")"
  fcd="$(date -d "$cd" '+%s')"
  td="$(date -d "- $days days" '+%s')"
  [ -f "$f" ] && [ "$fcd" -lt "$td" ] && echo rm -- "$f"
done

To match and print files with creation date older than 5 days ... If the output is what you want, then remove echo and rerun it to delete those files.

This one, however, requires a recent kernel as the support for stat accessing date of birth was absent in older kernels ... Please see this answer for more explanation about this.
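
Whether the birth time is available on a given system can be probed before relying on it. A sketch, assuming GNU stat, whose %W format prints the birth time in seconds since the epoch and 0 when it is unknown; birth_time_known is a hypothetical helper name:

```shell
# birth_time_known FILE: succeed only if stat reports a usable birth time.
birth_time_known() {
  local w
  w=$(stat -c '%W' -- "$1" 2>/dev/null) || return 1
  case $w in
    ''|0|-|*[!0-9]*) return 1 ;;   # unknown, or output that is not a plain number
  esac
  return 0
}

# Probe with a freshly created temporary file.
probe=$(mktemp)
if birth_time_known "$probe"; then
  echo "birth time available: $(stat -c '%w' -- "$probe")"
else
  echo "birth time not available here; fall back to modification time"
fi
rm -f -- "$probe"
```

The result depends on the kernel, the coreutils version and the filesystem, so it is worth running the probe in the directory that actually holds the logs.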

Note # 4

Simplify your life with bash functions to complement missing system-tools functionalities ... For example, this bash function:

function rm_old {

  lfcd=0
  for f in *
  do
    if [ -f "$f" ]
    then
      fcd="$(stat -c '%w' "$f")"
      fcd="$(date -d "$fcd" '+%s')"
      if [ "$fcd" -gt "$lfcd" ]
      then
        todelete+=("${tokeep[@]}")
        tokeep=("$f")
        lfcd="$fcd"
      elif [ "$fcd" -eq "$lfcd" ]
      then
        tokeep+=("$f")
      else
        todelete+=("$f")
      fi
    fi
  done

  echo "These files will be DELETED:"
  printf '\e[4;31m%s\e[0m\n' "${todelete[@]}"
  echo "This/These file(s) will be KEPT:"
  printf '\e[4;32m%s\e[0m\n' "${tokeep[@]}"
  read -p "Confirm Y(Yes)/N(No)" answer
  case "$answer" in
    Yes|yes|YES|Y|y)
      [ "${#todelete[@]}" -gt 0 ] && rm -- "${todelete[@]}" || echo "Exiting ..."
      ;;
    *)
      echo "Aborting ..."
      ;;
  esac

  unset answer
  unset todelete
  unset tokeep
  unset fcd
  unset lfcd

}

Will keep only the last created file/files in the current working directory and delete the rest.

Raffa