I bought a new 250GB Samsung 850 EVO SSD for my laptop, which I want to use as the primary storage device, together with the old but still functioning 250GB 7500 RPM HDD, which I moved to the former DVD bay with an adapter caddy.

Right now the HDD has only one big ext4 partition containing the OS, the applications and data files. I want to use the HDD for storing data, but I don't want to miss out on the opportunity to get the speed improvement of the SSD by doing so.

I want to combine, say, a 50GB or even smaller partition on the SSD with the partition on the HDD, so that the least modified of the most accessed files are automatically moved to the SSD.

I've looked at caches like EnhanceIO and Bcache, but they don't seem to be what I want, because (correct me if I'm wrong):

  • The space occupied by the cache partition is subtracted from the amount of space available.
  • The cache speeds up access to the most accessed files regardless of whether they're also the least often modified, which goes against the objective of not wanting to wear out the SSD.

Is the above correct, or could a cache (which one of those two?) help me reach my goal? If the above is correct, do you know of any other viable solution?

Would a union filesystem, like OverlayFS, be helpful here? If you monitored the HDD for the most accessed files (keeping track of their atime on a daily basis) and identified the least modified ones among them (keeping track of their mtime), in theory you could move those files to the SSD, freeing space on the HDD, while the union filesystem could make all that transparent to the user.

Would this work?
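
A minimal sketch of the selection step described above: walk a directory tree and pick regular files that were accessed recently (atime) but not modified for a long time (mtime). The function name `find_candidates` and both thresholds are assumptions made for illustration, not part of any existing tool.

```python
import os
import stat
import time
from pathlib import Path


def find_candidates(root, accessed_within_days=7, unmodified_for_days=90, now=None):
    """Return (path, size) pairs for files read recently but rarely modified.

    The thresholds are illustrative assumptions; tune them to your usage.
    """
    now = time.time() if now is None else now
    accessed_cutoff = now - accessed_within_days * 86400
    modified_cutoff = now - unmodified_for_days * 86400

    candidates = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = Path(dirpath) / name
            try:
                st = path.lstat()  # lstat: do not follow symlinks
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if not stat.S_ISREG(st.st_mode):
                continue  # ignore symlinks, sockets, devices, ...
            recently_read = st.st_atime >= accessed_cutoff
            long_unmodified = st.st_mtime <= modified_cutoff
            if recently_read and long_unmodified:
                candidates.append((path, st.st_size))
    return candidates
```

Note that this only works if the file system records access times at all: with the common `relatime` mount option atime is refreshed roughly once a day at most, and with `noatime` it is not updated, so the result is a coarse approximation of "most accessed".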

karel
  • 114,770
Fabio A.
  • 201
  • In theory your requirements are doable. The most feasible solution is to post a feature request on the bcache bug tracker asking for another strategy for storing files in bcache. – Adam Ryczkowski Dec 10 '15 at 08:50
  • ...And yes, the size of the bcache partition will be "eaten" away from the total available storage. It is by design, because bcache is file system agnostic. This design gives you the bonus of storing only the most frequently changed part of a file, which should be advantageous with e.g. database data. – Adam Ryczkowski Dec 10 '15 at 08:51
  • Also check out this project, which may be easily modified to suit your needs (or may even support it out of the box): https://romanrm.net/mhddfs – Adam Ryczkowski Dec 10 '15 at 09:05
  • @Fabio: no comment, no acceptance, but you've been on-line. Any problems with the below answer? – Fabby Dec 13 '15 at 19:53
  • @Fabby, I've only just now had the time to review the answers and comments; even though I was online, I couldn't spend time on this. I'll get to it now. – Fabio A. Dec 15 '15 at 09:07
  • @AdamRyczkowski, thanks for the pointer to mhddfs! It's actually a better working unionfs that could effectively be modified to do exactly what I need to do. If I get the needed time on my hands I could work on it. Thanks a lot. :) – Fabio A. Dec 15 '15 at 09:37
  • @FabioA. I look forward to it. IMO this type of file system should be the default for every Linux distribution installed on boxes with both an SSD and an HDD. – Adam Ryczkowski Dec 15 '15 at 11:55

1 Answer

You have a few options depending on what you're trying to accomplish:

  • Use bcache: You will be able to eat your cake, but not keep it.

    Yes, the space you reserve for caching works a bit like the opposite of a swap file: the amount you specify will be "taken away" from the total amount of disk space and used as a cache for the other hard drive.
    To influence which files stay cached, look at something like vmtouch; note that vmtouch acts on the kernel's in-RAM page cache rather than on bcache itself.

  • Use LVM: You will get to keep your cake, but not eat it.

    You can use the Logical Volume Manager to create a volume that spans both the SSD and the HDD, giving you one large /home volume with the space of both, but:

    1. You will have no control over which file goes on the SSD and which one on the HDD
    2. If you lose one of the two drives, you lose all data and will need to restore from back-up!!!
  • Use a manual system: You will be able to keep your cake, and eat it.

    Partition the drives into separate file systems: put / on the SSD and /home on the HDD. On top of this, put all the files you want to be fast into /media/FastData on the SSD and symlink the originals to the copies in /media/FastData, if and only if these files reside in your /home (otherwise they already reside on the SSD anyway).
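
The manual approach can be sketched as a small helper that moves one file from the HDD to the fast directory and leaves a symlink behind. This is only an illustration under stated assumptions: the function name `promote_to_fast` is hypothetical, and it assumes /home lives on the HDD and /media/FastData on the SSD.

```python
import os
import shutil
from pathlib import Path


def promote_to_fast(original, home_root="/home", fast_root="/media/FastData"):
    """Move `original` under `fast_root` and replace it with a symlink.

    The default paths follow the layout described above and are assumptions.
    """
    original = Path(original)
    home_root = Path(home_root)
    fast_root = Path(fast_root)

    if original.is_symlink() or not original.is_file():
        raise ValueError(f"{original} is not a regular file")

    # Only files under /home need moving; relative_to raises ValueError otherwise.
    relative = original.relative_to(home_root)
    target = fast_root / relative
    if target.exists():
        raise FileExistsError(f"{target} already exists")

    target.parent.mkdir(parents=True, exist_ok=True)
    # shutil.move copies and deletes when source and target are on different file systems.
    shutil.move(str(original), str(target))
    # Leave an absolute symlink at the old location so applications still find the file.
    os.symlink(target.resolve(), original)
    return target
```

For example, `promote_to_fast("/home/user/project/data.db")` would move the file to /media/FastData/user/project/data.db and symlink the old path to it. Mirroring the directory structure under the fast directory is a design choice that avoids name clashes between files from different folders.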

Note 1: I have a small SSD and a large HDD, so I use yet another system: / on the SSD and /home on the HDD and don't bother to optimise further...
Note 2: A union file system will not help you any more than the manual system...
Note 3: Here are some more tips not to wear out your SSD from bullet point 4 and onwards

Fabby
  • 34,259
  • As you've never accepted an answer on this site before: If this answer helped you, don't forget to click the grey ☑ at the left of this text, which means Yes, this answer is valid! ;-) – Fabby Dec 09 '15 at 17:44
  • Additional tips provided in an additional note! ;-) – Fabby Dec 10 '15 at 16:50
  • Thanks for the reply and the pointer to vmtouch, but - not to be rude - other than this I see no added info over what I've stated myself in the question, is there?

    The idea of using a union FS was that it would be transparent to the user where the files have been put - whether on the SSD or on the HDD - since the process of moving files from one medium to the other would be totally automatic.

    Thinking about it, the same could also be done with symlinks, which could prove easier to implement.

    – Fabio A. Dec 15 '15 at 09:16
  • All in all, it seems that there's no proper solution to my question, but I'll accept your reply as valid since you helped make it clear there's no proper solution as of yet.

    Thank you.

    – Fabio A. Dec 15 '15 at 09:16
  • @FabioA. Well, that depends on your definition of "proper". Symlinking does the trick "transparently" to the end user (once the administrator has set it up)... mhddfs has the same drawbacks as my LVM solution. Grazie mille for the acceptance though and favour returned: Q upvoted! ;-) – Fabby Dec 15 '15 at 12:32