
I have loads of files with file names that I need to simplify. I need to keep everything before the first _ , and the _R1/R2 bit.

A10_S65_L001_R1_001.fastq  
A8_S49_L001_R2_001.fastq   
B7_S42_L001_R1_001.fastq   
C5_S27_L001_R2_001.fastq   
F4_S22_L001_R1_001.fastq   
G2_S7_L001_R2_001.fastq    
H1_S165_L001_R1_001.fastq
A10_S65_L001_R2_001.fastq  
A9_S57_L001_R1_001.fastq   
B7_S42_L001_R2_001.fastq   
C6_S35_L001_R1_001.fastq   
F4_S22_L001_R2_001.fastq   
G3_S15_L001_R1_001.fastq   
H1_S165_L001_R2_001.fastq

So the first example would be ----> A10_R1.fastq

I've been able to use rename 's/*L001//' *L001*.fastq to remove parts of it but it gets complicated as the length of characters before the first _ varies. I'd really appreciate some help!

Thank you!

PhönixGeist
Buffy

1 Answer

This can be done in many ways. Here are three of them.

First way

Using mmv:

mmv -n -- '*_*_*_*_*.fastq' '#1_#4.fastq'

'*_*_*_*_*.fastq' matches the files in the current working directory whose names end in .fastq and consist of five parts separated by the delimiter _ (each * stands for one part). The parts are then referenced by number in the new name (#1 is whatever the first * matched, #4 the fourth), so '#1_#4.fastq' keeps only the first and fourth parts. The -n option performs a dry run (a simulation in which nothing is actually renamed). Remove -n when you are satisfied with the output to do the actual renaming.
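
If mmv is not installed, the way the wildcards map to #1 and #4 can still be checked in plain bash. This is only a rough sketch of the idea, not how mmv works internally: mmv_like_name is a made-up helper, and it assumes names with exactly five underscore-separated parts like the ones in the question.

```shell
#!/bin/bash
# Hypothetical helper (not part of mmv): print the name that the pattern
# pair '*_*_*_*_*.fastq' -> '#1_#4.fastq' is expected to give for one file.
mmv_like_name() {
    # One capture group per "*" in the mmv pattern.
    local re='^([^_]+)_([^_]+)_([^_]+)_([^_]+)_([^_]+)\.fastq$'
    if [[ $1 =~ $re ]]; then
        # BASH_REMATCH[1] plays the role of #1, BASH_REMATCH[4] of #4.
        printf '%s_%s.fastq\n' "${BASH_REMATCH[1]}" "${BASH_REMATCH[4]}"
    else
        return 1   # the name does not have five parts
    fi
}

mmv_like_name 'A10_S65_L001_R1_001.fastq'   # prints A10_R1.fastq
```

Nothing is renamed here; the function only prints the target name.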

Second way

Using rename:

rename  -n 's/^([^.]+)\_([^.]+)\_([^.]+)\_([^.]+)\_([^.]+)\.fastq$/$1_$4.fastq/' *.fastq

Please see this answer (the explanation part) for how it works, and this link for a breakdown of the regular expression. The -n option performs a dry run (a simulation in which nothing is actually renamed). Remove -n when you are satisfied with the output to do the actual renaming.
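
To see what a substitution of this shape does to a name without touching any file, it can be previewed with sed. This is a sketch under a few assumptions: preview_rename is a made-up helper, the expression uses [^_]+ for each part (a slightly stricter variant of the [^.]+ used above), and sed must support -E (GNU and BSD sed both do).

```shell
#!/bin/bash
# Hypothetical helper: print the new name for each file name given as an
# argument, using a sed substitution with the same structure as the rename one.
preview_rename() {
    local f
    for f in "$@"; do
        # \1 is the part before the first "_", \2 is the fourth part (R1 or R2).
        printf '%s\n' "$f" |
            sed -E 's/^([^_]+)_[^_]+_[^_]+_([^_]+)_[^_]+\.fastq$/\1_\2.fastq/'
    done
}

preview_rename 'H1_S165_L001_R2_001.fastq'   # prints H1_R2.fastq
```

A name that does not match the pattern is printed unchanged, which makes unexpected file names easy to spot before running rename for real.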

Third way

Using mv with a bash array in a for loop, as a bash shell script file:

#!/bin/bash

for f in *.fastq; do # Work on files with the ".fastq" extension in the current working directory, assigning their names one at a time (one per loop run) to the variable "$f"
    IFS='_.' read -r -a array <<< "$f" # Split the filename into parts/elements on "_" and "." and read the elements into an array
    f1="${array[0]}_${array[3]}.${array[5]}" # Set the new filename in the variable "$f1" by selecting certain array elements and adding back "_" and "."
    echo mv -- "$f" "$f1" # Renaming dry run (simulation) ... Remove "echo" when satisfied with the output to do the actual renaming.
done

or in a bash command string:

bash -c 'for f in *.fastq; do IFS="_." read -r -a array <<< "$f"; f1="${array[0]}_${array[3]}.${array[5]}"; echo mv -- "$f" "$f1"; done'
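
To try the loop without risking real data, it can be run on a few empty files in a scratch directory first. The sketch below is only an illustration: demo_dir and the three sample names are made up for the test, and the part-count check and mv -n (do not overwrite) are extra safety measures that are not part of the script above.

```shell
#!/bin/bash
# Hypothetical scratch directory with empty sample files.
demo_dir=$(mktemp -d)
touch "$demo_dir/A10_S65_L001_R1_001.fastq" \
      "$demo_dir/A10_S65_L001_R2_001.fastq" \
      "$demo_dir/H1_S165_L001_R1_001.fastq"

(
    cd "$demo_dir" || exit 1
    for f in *.fastq; do
        # Split the name on "_" and "." into an array.
        IFS='_.' read -r -a array <<< "$f"
        # Skip names that do not split into exactly six parts.
        [ "${#array[@]}" -eq 6 ] || continue
        f1="${array[0]}_${array[3]}.${array[5]}"
        mv -n -- "$f" "$f1"   # -n: never overwrite an existing file
    done
)

ls "$demo_dir"
```

If the listing shows only the short names, the same loop should be safe to run in the real directory.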
Raffa
    Thank you so much! mmv seems very straight forward but I don't have it installed and don't want to wait for my administrator, rename worked very well. Thanks again. – Buffy Jun 30 '22 at 09:22