Counting files in a directory

Question

I use the following code at the end of one of my scripts to tally up the number of files I have processed and moved into that directory.

# Report on Current Status
echo -n "Cropped Files: "
ls "${Destination}" | wc -l

My problem lies with how I handle duplicate files. As of right now, I check for the file's presence first (as my script is destructive in nature to the source files I am processing). If it senses a file of that name already processed, I alter the filename as follows.

Duplicate file: foo.pdf

Changed name: foo.x.pdf

If there is a foo.x.pdf, then I rename again to foo.xx.pdf. Repeat as necessary. I intend to go in later and evaluate each 'version' and select the best one to keep on hand. But herein lies my problem. I would like to count the number of files that do not contain .x. .xx. and so on. How do I strip these out of the ls output so wc -l can count the unique files only?

TL;DR: How do I get the count of files in a given directory that do not contain a given substring in their filename?

John1024 · Accepted Answer · 2018-02-21T18:22:31.027

9

To count the files in a directory whose names do not end in .x.pdf, try:

find "${Destination}" -mindepth 1 ! -name '*.x.pdf' -printf '1' | wc -c

To count the files whose names do not end in a period, one or more x characters, a period, and pdf (.x.pdf, .xx.pdf, and so on), try:

find "${Destination}" -mindepth 1 ! -regex '.*\.x+\.pdf' -printf '1' | wc -c

The commands above search recursively through subdirectories. If you don't want that, add the option -maxdepth 1. For example:

find "${Destination}" -mindepth 1 -maxdepth 1 ! -regex '.*\.x+\.pdf' -printf '1' | wc -c

Note that because we use -printf '1' and count the resulting characters with wc -c, this method is safe even if the directory contains files whose names contain newline characters. (-printf is a GNU find extension.)
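As a rough sketch of how this could be wrapped up and checked, the function below (count_unversioned is a made-up name, not part of the answer) runs the same find command on a throwaway directory that includes a file with a newline in its name. It assumes GNU find, and it adds -type f so that only regular files are counted, which the commands above do not do.

```shell
#!/usr/bin/env bash
# count_unversioned DIR (hypothetical helper): count regular files directly
# inside DIR whose names do not end in .x.pdf, .xx.pdf, and so on.
# Assumes GNU find; -type f is an addition to the answer's command.
count_unversioned() {
    find "$1" -mindepth 1 -maxdepth 1 -type f ! -regex '.*\.x+\.pdf' -printf '1' | wc -c
}

# Demo in a temporary directory.
demo=$(mktemp -d)
touch "$demo/foo.pdf" "$demo/foo.x.pdf" "$demo/foo.xx.pdf" "$demo/bar.pdf"
touch "$demo/new"$'\n'"line.pdf"    # a file name containing a newline
count_unversioned "$demo"           # foo.pdf, bar.pdf and the newline file are counted
rm -rf "$demo"
```

Because find prints exactly one character per file, the newline in the last file name cannot inflate the count the way it would with ls | wc -l.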

edited Feb 21 '18 at 18:22

answered Feb 09 '18 at 21:13



Altered your second example and tested it. Works! Thank you. find "${Destination}" -mindepth 1 ! -regex '.*\.x+\.pdf' -printf '1\n' | wc -l – Aaron Nichols Feb 09 '18 at 21:17
@DavidFoerster Yes, that does seem simpler. Answer updated to eliminate \n. Thanks. – John1024 Feb 21 '18 at 18:23

score 2 · Answer 2 · answered Feb 09 '18 at 23:50


Without subdirectories:

echo $(($(for file in *.sh ; do echo -n 1+; done; echo 0;)))

because:

for file in *.sh ; do echo -n 1+; done; echo 0;
1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+0
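To get the count the question asks for, the same glob-based idea can be run twice, once for everything and once for the versioned names, and the two results subtracted. The sketch below is one possible way to do that with bash arrays instead of an echo loop. count_by_subtraction is a hypothetical name, and the *.x*.pdf pattern is an assumption about how the duplicates are named (it would also match something like foo.xyz.pdf).

```shell
#!/usr/bin/env bash
# count_by_subtraction DIR (hypothetical helper): number of entries in DIR
# minus the number of entries matching *.x*.pdf.
# The body runs in a subshell, so the nullglob setting does not leak out.
# Hidden files (names starting with a dot) are not matched by *.
count_by_subtraction() (
    shopt -s nullglob                 # an unmatched glob expands to nothing
    all=("$1"/*)
    versioned=("$1"/*.x*.pdf)
    echo $(( ${#all[@]} - ${#versioned[@]} ))
)
```

For example, count_by_subtraction "${Destination}" would print the number of entries that are not foo.x.pdf style duplicates. With nullglob set, an empty directory yields 0 rather than 1.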

answered Feb 09 '18 at 23:50

user unknown



I see how this counts files in a directory but how does it avoid counting the files that the OP doesn't want to count: "But herein lies my problem. I would like to count the number of files that do not contain .x. .xx. and so on"? – John1024 Feb 10 '18 at 00:31

@John1024: Count all files, count all files with .x*.pdf, subtract. – user unknown Feb 10 '18 at 00:51
user-unknown, OK. Very good. – John1024 Feb 10 '18 at 01:46

pa4080 · Answer 3 · 2018-02-10T08:48:06.757

0

You can exclude files that match a pattern from the ls output by using (one or more times) the option -I, --ignore=PATTERN:

ls -I "*.x*.pdf" "${Destination}" | wc -l

Or you could use the subtraction method in this way:

echo $(($(ls "${Destination}" | wc -l) - $(ls "${Destination}"/*.x*.pdf | wc -l)))
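If you would rather not parse ls output at all (a file name containing a newline adds extra lines to ls | wc -l), a plain bash loop is another option. This is only a sketch of a different technique, not what the commands above do: count_without_ls is a made-up name, and the regular expression is an assumption that mirrors the "period, one or more x, period, pdf" naming from the question.

```shell
#!/usr/bin/env bash
# count_without_ls DIR (hypothetical helper): count entries in DIR whose
# names do not end in .x.pdf, .xx.pdf, ... without calling ls.
count_without_ls() {
    local dir=$1 path count=0
    local re='\.x+\.pdf$'                  # period, one or more x, period, pdf
    for path in "$dir"/*; do
        [[ -e $path ]] || continue         # unmatched glob stays literal; skip it
        [[ ${path##*/} =~ $re ]] && continue
        count=$((count + 1))
    done
    echo "$count"
}
```

Unlike the *.x*.pdf glob, this regular expression leaves a name such as foo.xyz.pdf in the count, since only runs of x between the two periods are treated as duplicates.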

edited Feb 10 '18 at 08:48

answered Feb 10 '18 at 08:33
