
I am trying to generate separate .md5 files for each .fastq file in a directory. I found solutions to generate a single .md5 file for many files, but this is not what I am looking for.

I have file1.fastq, file2.fastq, file3.fastq and want to generate file1.md5, file2.md5, file3.md5.

I assume that a FOR loop would do the trick but I am not a programmer and can't seem to find a solution to this problem.

I also tried the following code:

find . -type f -name "*.fastq.gz" -exec sh -c "md5sum < {} > {}.md5" \;

It correctly generates an .md5 file for each .fastq file, but the .md5 file contents are incorrect, i.e. I get 64399513b7d734ca90181b27a62134dc -

instead of 64399513b7d734ca90181b27a62134dc testfile.fastq

Can anyone help?

terdon
Chris_bio
  • The answers to http://askubuntu.com/questions/318530/generate-md5-checksum-for-all-files-in-a-directory only explain how to generate a single .md5 file for many files – Chris_bio Jan 29 '16 at 11:04
  • Hang on, are your files *fastq or *fastq.gz? – terdon Jan 29 '16 at 11:09

2 Answers


The simplest case is:

for file in *fastq; do md5sum "$file" > "$file".md5; done

That, however, will create file names like file1.fastq.md5. Extensions are basically irrelevant here, so that isn't a problem, but if you prefer file1.md5 do this instead:

for file in *fastq; do md5sum "$file" > "${file%.fastq}".md5; done
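
As a rough sketch of how the per-file checksums could be checked later, the snippet below builds two throwaway files (the names and contents are made up for illustration), writes one .md5 file per .fastq file, and then verifies each one with `md5sum -c`. It assumes GNU coreutils `md5sum` and `mktemp` are available.

```shell
# Work in a temporary directory so no real data is touched.
tmpdir=$(mktemp -d)
cd "$tmpdir" || exit 1

# Hypothetical sample files standing in for real sequencing data.
printf 'ACGT\n' > file1.fastq
printf 'TTGA\n' > file2.fastq

# One .md5 file per .fastq file; % strips only the trailing ".fastq".
for file in *.fastq; do
    md5sum "$file" > "${file%.fastq}.md5"
done

# Each .md5 file records the name of the file it belongs to,
# so md5sum -c can re-read that file and compare the hashes.
for sum in *.md5; do
    md5sum -c "$sum"
done
```

Because the file name is stored inside each .md5 file, the name of the .md5 file itself does not matter for verification.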
terdon

Your md5sum command doesn't print the filename because of the < redirection: the shell opens the file and feeds its contents to md5sum on standard input, so md5sum never sees the original filename and prints - in its place.
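
The difference can be seen side by side in a small sketch like the one below. The sample file and its contents are invented for the demonstration, and GNU coreutils `md5sum` is assumed.

```shell
# Work in a temporary directory so no real data is touched.
tmpdir=$(mktemp -d)
cd "$tmpdir" || exit 1
printf 'ACGT\n' > testfile.fastq   # hypothetical sample file

# Input redirected by the shell: md5sum only sees a stream, so it reports "-".
from_stdin=$(md5sum < testfile.fastq)

# File passed as an argument: md5sum opens it itself and records the name.
from_arg=$(md5sum testfile.fastq)

echo "$from_stdin"
echo "$from_arg"
```

Both lines carry the same hash; only the second field differs.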

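So if you want to process all the files under the current working directory recursively, pass the filename to md5sum as an argument instead. Handing the name to sh as a positional parameter, rather than embedding {} in the script text, also keeps names containing spaces or shell metacharacters safe:

find . -type f -name "*.fastq.gz" -exec sh -c 'md5sum "$1" > "$1.md5"' sh {} \;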
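
One thing to keep in mind with a recursive run is that md5sum records the path exactly as it was given, so a checksum file created from a parent directory stores a path relative to that directory. A possible variation, sketched below under the assumption that each .md5 file should be checkable from the directory it lives in, changes into the file's directory before hashing. The directory and file names are made up for illustration.

```shell
# Work in a temporary directory so no real data is touched.
tmpdir=$(mktemp -d)
cd "$tmpdir" || exit 1
mkdir -p run1
# Hypothetical sample; the content does not need to be real gzip data for hashing.
printf 'ACGT\n' > run1/sample.fastq.gz

# For every match, cd into its directory first so the .md5 file
# stores a bare file name instead of a path like ./run1/sample.fastq.gz.
find . -type f -name '*.fastq.gz' -exec sh -c '
    cd "$(dirname "$1")" || exit 1
    name=$(basename "$1")
    md5sum "$name" > "$name.md5"
' sh {} \;

cat run1/sample.fastq.gz.md5
```

With this layout, `md5sum -c sample.fastq.gz.md5` can be run from inside `run1` even after the directory has been moved elsewhere.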
kos