16

I need to rename approx. 70,000 files. For example: From sb_606_HBO_DPM_0089000 to sb_606_dpm_0089000 etc.

The number range goes from 0089000 to 0163022. It's only the first part of the name that needs to change. All the files are in a single directory, and are numbered sequentially (an image sequence). The numbers must remain unchanged.

When I try this in bash it grizzles at me that the 'Argument list is too long'.

Edit:

I first tried renaming a single file with mv:

mv sb_606_HBO_DPM_0089000.dpx sb_606_dpm_0089000.dpx

Then I tried renaming a range (I learned here last week how to move a load of files, so I thought the same syntax might work for renaming the files...). I think I tried the following (or something like it):

mv sb_606_HBO_DPM_0{089000..163023}.dpx sb_606_dpm_0{089000..163023}.dpx
Zanna
  • 70,465
rich
  • 179
  • 4
    To reviewers: I don’t think this is a duplicate; most of the CLI answers on the other question won’t work here because of the large number of files colliding with the shell’s ARG_MAX limit. As this question explicitly asks for a command-line solution, (possibly equal) GUI solutions as in the other question also don’t match. – dessert May 20 '18 at 11:54
  • 1
    I don't think this is a dupe because it's ok to have more than one question about renaming files. Please let's not close specific questions against generic resources that don't actually answer them... – Zanna May 20 '18 at 16:51
  • Hi all, this is certainly not a deliberate dupe! I did search for an answer to my problem before posting the question. I have to use Linux as an OS at work for video compositing in specific software, so command line stuff doesn't come naturally to me. May I humbly suggest that if a question is a known duplicate, we simply point the questioner in the right direction. My knowledge of command line is very limited, so all the help given here is very much appreciated - thank you all for your answers, I have learned a lot! – rich May 20 '18 at 18:19
  • 1
    @rich If you can edit in explicitly what command you tried, then it would be clearer that this isn't a dupe. (This shows us that you are aware of this approach.) Cheers. – Sparhawk May 21 '18 at 05:46
  • @Sparhawk OK that's done - hope that helps. Again, my knowledge is very limited, so learning the hard way here. – rich May 21 '18 at 10:10
  • 2
    rich, your question isn't a dupe, because it's a specific question. Don't worry about that. More importantly, after a question has received a number of upvoted answers, editing it is probably not a good idea because your edits might make the existing answers less valid. Now I feel like my answer should explain why mv {1..2} {3..4} does not work, which is a whole different problem from ARG_MAX... Everyone else who answered will probably feel the same! So, from my point of view, I wish you would rollback your last edit and, if you want to, ask a whole new question about mving with ranges – Zanna May 22 '18 at 10:16
  • 1
    @Sparhawk the OP wrote quite clearly, from the first version of the question, that the problem is the argument list too long error. There is no need to clarify further, this is clearly not a dupe since we need a workaround for dealing with ARG_MAX and the answers in the proposed duplicate don't do that. – terdon May 22 '18 at 11:01
  • @terdon Yes, I understand that it's not a dupe, but someone else previously voted to close it as such, so this additional information should prevent that. It's also helpful to see what the questioner has already attempted. – Sparhawk May 22 '18 at 12:27

9 Answers

26

One way is to use find with -exec, and the + option. This constructs an argument list, but breaks the list into as many calls as needed to operate on all the files without exceeding the maximum argument list. It is suitable when all arguments will be treated the same. This is the case with rename, though not with mv.

You may need to install Perl rename:

sudo apt install rename

Then you can use, for example:

find . -maxdepth 1 -exec rename -n 's/_HBO_DPM_/_dpm_/' {} +

Remove -n after testing, to actually rename the files.
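
If installing Perl rename is not an option, the batching done by -exec ... + can also feed a small inline sh loop. The following is only a sketch and not part of the original answer: it assumes mktemp is available, works in a scratch directory with three sample files, and assumes the .dpx extension mentioned in the question.

```shell
# Work in a scratch directory so no real files are touched (demo assumption).
demo_dir=$(mktemp -d)
touch "$demo_dir"/sb_606_HBO_DPM_0089000.dpx \
      "$demo_dir"/sb_606_HBO_DPM_0089001.dpx \
      "$demo_dir"/sb_606_HBO_DPM_0089002.dpx

# find passes the file names to sh in batches that stay under ARG_MAX;
# the inner loop renames every file of the current batch.
find "$demo_dir" -maxdepth 1 -type f -name '*_HBO_DPM_*' -exec sh -c '
  for f do
    # ${f%%_HBO_DPM_*} is the text before the marker,
    # ${f#*_HBO_DPM_} is the text after it.
    mv -- "$f" "${f%%_HBO_DPM_*}_dpm_${f#*_HBO_DPM_}"
  done' sh {} +

ls "$demo_dir"
```

For the real directory, replace "$demo_dir" with . and drop the mktemp and touch lines. Putting echo in front of mv gives a dry run comparable to rename -n.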

Zanna
  • 70,465
11

I'm going to suggest three alternatives. Each is a simple single-line command, but I'll provide variants for more complicated cases, mainly in case the files to process are mixed with other files in the same directory.

mmv

I'd use the mmv command from the package of the same name:

mmv '*HBO_DPM*' '#1dpm#2'

Note that the arguments are passed as strings, so the glob expansion does not happen in the shell. The command receives exactly two arguments, and then finds corresponding files internally, without tight limits on the number of files. Also note that the command above assumes that all the files which match the first glob shall be renamed. Of course you are free to be more specific:

mmv 'sb_606_HBO_DPM_*' 'sb_606_dpm_#1'

If you have files outside the requested number range in the same directory, you might be better off with the loop over numbers given further down in this answer. However you could also use a sequence of mmv invocations with suitable patterns:

mmv 'sb_606_HBO_DPM_0089*'       'sb_606_dpm_0089#1'    # 0089000-0089999
mmv 'sb_606_HBO_DPM_009*'        'sb_606_dpm_009#1'     # 0090000-0099999
mmv 'sb_606_HBO_DPM_01[0-5]*'    'sb_606_dpm_01#1#2'    # 0100000-0159999
mmv 'sb_606_HBO_DPM_016[0-2]*'   'sb_606_dpm_016#1#2'   # 0160000-0162999
mmv 'sb_606_HBO_DPM_01630[01]?'  'sb_606_dpm_01630#1#2' # 0163000-0163019
mmv 'sb_606_HBO_DPM_016302[0-2]' 'sb_606_dpm_016302#1'  # 0163020-0163022

loop over numbers

If you want to avoid installing anything, or need to select by number range avoiding matches outside this range, and you are prepared to wait for 74,023 command invocations, you could use a plain bash loop:

for i in {0089000..0163022}; do mv sb_606_HBO_DPM_$i sb_606_dpm_$i; done

This works particularly well here since there are no gaps in the sequence. Otherwise you might want to check whether the source file actually exists.

for i in {0089000..0163022}; do
  test -e sb_606_HBO_DPM_$i && mv sb_606_HBO_DPM_$i sb_606_dpm_$i
done

Note that, in contrast to for ((i=89000; i<=163022; ++i)), the brace expansion preserves leading zeros; Bash has supported this since version 4.0. It was actually a change I requested, so I'm happy to see use cases for it.

Further reading: Brace Expansion in the Bash info pages, particularly the part about {x..y[..incr]}.
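
To make the difference concrete, here is a small sketch on a three-number range (the short range and the variable names are only for illustration). It shows that brace expansion keeps the padding of its endpoints, while a C-style loop counts in plain integers and needs printf to restore the padding.

```shell
# Brace expansion keeps the zero padding of its endpoints (Bash 4.0 or newer).
padded=$(echo {0089000..0089002})
echo "$padded"        # prints: 0089000 0089001 0089002

# A C-style loop works on plain integers, so the padding is lost ...
plain=""
for ((i = 89000; i <= 89002; ++i)); do
  plain+="$i "
done
echo "$plain"

# ... and has to be added back explicitly, for example with printf.
printf -v repadded '%07d' 89000   # pad to seven digits
echo "$repadded"      # prints: 0089000
```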

loop over files

Another option would be to loop over a suitable glob, instead of just looping over the integer range in question. Something like this:

for i in *HBO_DPM*; do mv "$i" "${i/HBO_DPM/dpm}"; done

Again this is one mv invocation per file. And again the loop is over a long list of elements, but the whole list is not passed as an argument to a subprocess, but handled internally by bash, so the limit won't cause you problems.

Further reading: Shell Parameter Expansion in the Bash info pages, documenting ${parameter/pattern/string} among others.
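
As a quick illustration of that expansion on a plain string, no files involved (the sample name with a .dpx extension is an assumption taken from the question):

```shell
old=sb_606_HBO_DPM_0089000.dpx

# ${var/pattern/string} replaces only the first match.
new=${old/HBO_DPM/dpm}
echo "$new"      # prints: sb_606_dpm_0089000.dpx

# ${var//pattern/string} replaces every match.
every=${old//_/-}
echo "$every"    # prints: sb-606-HBO-DPM-0089000.dpx
```

Trying the expansion on one name with echo like this is a cheap way to check the pattern before putting it inside the mv loop.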

If you wanted to restrict the number range to the one you provided, you could add a check for that:

shopt -s extglob  # enables the +(...) and *(...) patterns used below
for i in sb_606_HBO_DPM_+([0-9]); do
  if [[ "${i##*_*(0)}" -ge 89000 ]] && [[ "${i##*_*(0)}" -le 163022 ]]; then
    mv "$i" "${i/HBO_DPM/dpm}"
  fi
done

Here ${i##pattern} removes the longest prefix matching pattern from $i. That longest prefix is defined as anything, then an underscore, then zero or more zeros. The latter is written as *(0) which is an extended glob pattern that depends on the extglob option being set. Removing leading zeros is important to treat the number as base 10 not base 8. The +([0-9]) in the loop argument is another extended glob, matching one or more digits, just in case you have files there that start the same but don't end in a number.
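
The zero stripping can be tried in isolation. The sketch below uses a single sample name and illustrative variable names; the 10# prefix shown at the end is an alternative way to force base 10 that the loop above does not use.

```shell
shopt -s extglob            # must be on before the patterns below are parsed

name=sb_606_HBO_DPM_0089000
num=${name##*_*(0)}         # drop everything up to the last "_" plus leading zeros
echo "$num"                 # prints: 89000

# With the zeros left in place Bash reads the number as octal; 8 and 9 are
# not octal digits, so the test reports an error and counts as false.
if [[ 0089000 -ge 89000 ]] 2>/dev/null; then
  octal_ok=yes
else
  octal_ok=no
fi
echo "$octal_ok"

# Alternative: keep the zeros and force base 10 in an arithmetic expansion.
forced=$((10#0089000))
echo "$forced"              # prints: 89000
```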

MvG
  • 1,506
  • Thank you! This worked like a dream: for i in {0089000..0163022}; do mv sb_606_HBO_DPM_$i sb_606_dpm_$i; done - I had to add the filename extension to get it to work, but it did just what I wanted and I even understand the syntax. Thank you @MvG – rich May 21 '18 at 17:21
  • @rich: Happy I could help – you and hopefully future visitors as well. Don't forget to accept the most useful answer. You can always change that check mark in the future if something better comes along. – MvG May 21 '18 at 18:25
10

One way to work around the ARG_MAX limit is to use the bash shell's builtin printf:

printf '%s\0' sb_* | xargs -0 rename -n 's/HBO_DPM/dpm/'

Ex.

rename -n 's/HBO_DPM/dpm/' sb_*
bash: /usr/bin/rename: Argument list too long

but

printf '%s\0' sb_* | xargs -0 rename -n 's/HBO_DPM/dpm/'
rename(sb_606_HBO_DPM_0089000, sb_606_dpm_0089000)
.
.
.
rename(sb_606_HBO_DPM_0163022, sb_606_dpm_0163022)
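
This works because printf is a shell builtin: the glob is expanded inside bash and no new process receives the whole list, so the limit never applies. xargs then splits the names into as many command lines as needed. The sketch below makes that splitting visible; the names f1 to f5 are placeholders rather than real files, and -n 2 is there only to force very small batches.

```shell
# -n 2 forces tiny batches purely for illustration; without it xargs packs
# as many names into each command line as the system limit allows.
batches=$(printf '%s\0' f1 f2 f3 f4 f5 |
  xargs -0 -n 2 sh -c 'echo "$# names: $*"' sh)
echo "$batches"
# prints:
# 2 names: f1 f2
# 2 names: f3 f4
# 1 names: f5
```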
steeldriver
  • 136,215
7
find . -type f -exec bash -c 'echo "$1" "${1/HBO_DPM/dpm}"' _ {} \;
./sb_606_HBO_DPM_0089000 ./sb_606_dpm_0089000

This finds every regular file (-type f) in and below the current directory (.) and, one file at a time (-exec ... \;), prints the file name $1 followed by the same name with HBO_DPM replaced by dpm.

Replace echo with mv to perform the rename.

Zanna
  • 70,465
αғsнιη
  • 35,660
6

You can do it file by file (it may take some time) with

sudo apt install util-linux  # if you don't have it already
for i in *; do rename.ul HBO_DPM dpm "$i"; done

Like the Perl rename used in other answers, rename.ul also has an option -n or --no-act for testing.

Zanna
  • 70,465
muclux
  • 5,154
  • I have edited out your comment about Zanna's answer, please edit Zanna's answer or leave a comment. – fosslinux May 24 '18 at 22:30
  • @ubashu that wasn't a comment on my answer - it was referring to the -n flag I used for testing and suggesting it can be used in rename.ul too. – Zanna Jun 03 '18 at 12:28
6

You could write a little python script, something like:

import os
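for name in os.listdir("."):
    # only touch names that contain the marker (this also skips rename.py itself)
    if "HBO_DPM" in name:
        os.rename(name, name.replace("HBO_DPM", "dpm"))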

Save that as a text file named rename.py in the folder containing the files, then, with a terminal open in that folder, run:

python rename.py
dessert
  • 39,982
3

I see that nobody has invited my best friend sed to the party :). The following for loop will accomplish your goal:

for i in sb_606_HBO_DPM*; do
  mv "$i" "$(echo $i | sed 's/HBO_DPM/dpm/')";
done

There are many tools for such a job; select the one that is most understandable for you. This one is simple and easily altered to suit this or other purposes...

andrew.46
  • 38,003
  • Granted, not very relevant in this specific case, but this will fail if any of the file names contain newlines. I mention this since most (all?) other answers are robust and can deal with arbitrary file names, or only work on the OP's file naming scheme. – terdon May 22 '18 at 10:26
  • ... newlines, spaces, wildcards, ... some of which can be avoided by quoting $i in the command substitution, but no easy way to handle a trailing newline in the filename. – muru May 22 '18 at 12:27
3

Since we're giving options, here's a Perl approach. cd into the target directory and run:

perl -e 'foreach(glob("sb_*")){rename $_, s/_HBO_DPM_/_dpm_/r}'

Explanation

  • perl -e : run the script given by -e.
  • foreach(glob){} : run whatever is in the { } on each result of the glob.
  • glob("sb_*") : return a list of all files and directories in the current directory whose names match the shell glob sb_*.
  • rename $_, s/_HBO_DPM_/_dpm_/r : perl magic. $_ is a special variable that holds each element we are iterating over (in the foreach). So here, it will be each file found. s/_HBO_DPM_/_dpm_/ replaces the first occurrence of _HBO_DPM_ with _dpm_. It runs on $_ by default, so it will run on each file name. The /r means "apply this replacement to a copy of the target string (the file name) and return the modified string". rename does what you'd expect: it renames files. So the whole thing will rename the current file name ($_) to itself with _HBO_DPM_ replaced by _dpm_.

You could write the same thing as an expanded (and more readable) script:

#! /usr/bin/env perl
use strict;
use warnings;

foreach my $fileName (glob("sb_*")){
  ## Copy the name to a new variable
  my $newName = $fileName;
  ## change the copy. $newName is now the changed version
  $newName =~ s/_HBO_DPM_/_dpm_/;
  ## rename
  rename $fileName, $newName;
}
terdon
  • 100,812
1

Depending on the kind of renaming you're envisioning, using vidir with multi-line editing may be satisfactory.
In your particular case you could select all lines in your text editor and remove the "_HBO" part of the filenames in a few keystrokes.

kraymer
  • 111