143

We all know that SSDs have a limited, predetermined life span. So my question is: how do I check the current health status of my SSD in (Ubuntu) Linux? And, if possible, how do I get an estimate of how much longer it will last?

A graphical tool is preferred, but a command-line tool would also be fine.

I'm using Xubuntu 12.04 LTS.

Seth
  • 58,122
keiki
  • 1,927
  • 2
    Can you add the output of smartctl -i /dev/sda to your question? – Mitch Jul 27 '13 at 13:43
  • 2
    @dschinn1001 Not exactly, that only applies to recent SSDs. First and Second generation SSD are known to have limited lifespan according to the amount of write operations to the disk. – João André Jul 27 '13 at 15:05
  • 2
    Coming from old-school spinning drives, I used tools for testing HDs that wrote and read the whole disk a few times, which took a few hours. It seems that none of the tools mentioned use such an approach? Does such an approach not make sense for SSDs? Well, then it seems that the SSD logs its own experiences, and can then tell if it is ailing. Have I understood this correctly? – Mads Skjern May 27 '15 at 10:12
  • 2
    @MadsSkjern It's perfectly feasible to use a tool like badblocks to check the status of an SSD. There are, however, very good reasons NOT to do so. SSDs in my experience typically fail after exceeding a certain threshold of writes, so a destructive read-write test such as badblocks can perform can actually shorten the life of the drive. – Elder Geek Nov 06 '18 at 19:27

9 Answers

80

To check the health of an SSD

On Ubuntu, Mint, or other Debian-based distributions, install smartmontools:

# apt-get install smartmontools

The Media_Wearout_Indicator attribute is what you are looking for. A value of 100 means your SSD has 100% of its life left; the lower the number, the less life is left.

# smartctl -a /dev/sda | grep Media_Wearout_Indicator
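
If you only want the number rather than the whole line, something like the following may help. It is a minimal sketch, assuming the usual smartctl attribute table layout where the normalized VALUE is the fourth column; the function name smart_attr_value is made up for illustration, and the attribute name you pass in depends on what your drive actually reports.

```shell
# Print the normalized VALUE column (4th field) of one SMART attribute.
# Reads the attribute table (e.g. from `smartctl -A`) on stdin.
# The function name is hypothetical, not part of smartmontools.
smart_attr_value() {
    # "$4 + 0" turns a zero-padded value such as 098 into 98
    awk -v name="$1" '$2 == name { print $4 + 0 }'
}

# Usage (needs root and a real drive):
#   smartctl -A /dev/sda | smart_attr_value Media_Wearout_Indicator
```

If the attribute is not present on your drive, the function prints nothing.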

To show all of your SSD's information

# smartctl -a /dev/sda

You can read the complete article at Nam Huy Linux Blog - How to check SSD life left on linux

Russ
  • 574
56

Install Gnome Disk Utility and check SMART Data and Tests for wear-leveling-count or similar. The higher that number (%, from 1 to 100), the more "used up" your SSD is, which means you are more likely to have problems. But if you have a recent SSD, you need not worry about it.

Install it via

 sudo apt-get install gnome-disk-utility

Start it either via

menu->Settings->Disk utility

or via the command line

sudo gnome-disks
João André
  • 1,300
  • 3
  • 17
  • 23
  • Do you mean "Gnome Disk Utility" ? – dschinn1001 Jul 27 '13 at 20:41
  • 2
    Yes, I wasn't sure about the name because they changed it in 12.10 – João André Jul 29 '13 at 00:51
  • 8
    palimpsest is not recognized by Ubuntu 14.04, although gnome-disk-utility is installed. I also don't see a disk utility in settings (gear icon). palimpsest is an awful name, does the name vary with user language (e.g. english vs. something else). – Paul May 09 '15 at 17:59
  • 1
    @Paul "Palimpsest" is an English word referring to a page of text that has been scraped or wiped clean and written over. So it's an appropriate name, although a bit obscure. – augurar Aug 25 '15 at 23:18
  • 11
    As a note to readers: palimpsest was renamed to gnome-disks (as mentioned in http://askubuntu.com/a/623306/4580). – immeëmosol Oct 29 '15 at 17:18
  • 3
    Why do you say "If you have a recent SSD, you need not worry about it"? – jfa Mar 22 '16 at 20:22
  • @JFA, because current SSDs allow a very high amount of data to be written on it (around 60GB daily (!) for 8 years straight for a 500GB SSD, a 1TB SSD doubles that) before it dies. You should worry only if you benchmark it every day like crazy – kit Sep 13 '18 at 07:10
  • 3
    You got this backwards. The wear number starts at 100 and decreases with usage, so the higher the number, the LESS used is the SSD, see here: https://superuser.com/questions/1037644/samsung-ssd-wear-leveling-count-meaning – Logix Nov 29 '18 at 12:54
  • How can it be run without monitor (using ssh)? – ch271828n May 05 '20 at 09:39
55

If you don't have an Intel-brand SSD: READ THIS.

Watch out! I was blithely misled by smartmontools. I have a Samsung SSD, and the smartctl tool happily misreported attribute 233 (hex E9) as 'Media_Wearout_Indicator'. In fact, for Samsung (and other manufacturers) that attribute means something entirely different. This and other forum postings, Stack Exchange questions and answers, and power-user blogs I found seem to be Intel-focused, with only vague hints that 'it may vary' (and no suggestion that you need to watch out for wrong attribute labels from smartmontools).

As I was preparing to copy my SSD to a new hard drive I'd bought (because of what smartmontools had told me), I booted into Windows (I have a dual-boot system) to see what the Windows-only Samsung tool 'Samsung_Magician_v43.exe' had to say about my drive. It was shockingly uninformative.

After hours of digging, I was finally able to run two Windows-only tools, HDD Guardian and CrystalDiskInfo. Surprise! Both tools independently told me my Samsung SSD is 'just fine' (HDD Guardian says '5 stars' and CrystalDiskInfo says "98% OK"). By contrast, smartctl explicitly labeled attribute 233 (hex E9) as "Media Wearout Indicator" and told me its value was "1", or 1%, an indicator of (the risk of) pending failure. To be as sure as I could, I kept digging and was finally able to locate at least something official from Samsung: Samsung White Paper 07: Communicating With Your SSD [archive.org]

The document indeed implies that attribute 233 (hex E9) is not used by Samsung in the same way. (Samsung: I'm very disappointed. Please either fix your official software tool, or at least make it clear that you do not provide wear-out indication information!)

Further, if you have neither an Intel SSD nor a Samsung SSD, be warned: this information does seem to vary across manufacturers (e.g. see the attribute label chart on https://code.google.com/p/hddguardian/wiki/about_reliability for the only useful indication of the degree of variability that I found).

The so-what: if you don't have an Intel SSD, do not be misled by the false attribute name labels provided by smartmontools. Perhaps it will improve in the future, but the version installed by default for Ubuntu 12.04 LTS (April 2014) was a total failure. Instead of telling you it 'doesn't know', smartctl just mislabeled the attribute. I did not find another tool for Linux that made the 'correct' information transparent or clear.

mwfearnley
  • 3,317
  • 2
  • 22
  • 28
Matt S.
  • 773
  • 2
    Props for including the link to Samsungs documentation of their SMART attributes. I have no idea what those other applications you mentioned are or how useful they are, but I would strongly recommend you simply keep an eye on the Attribute #5 "Reallocated Sector Count" as this will be a good indicator of how close your SSD is to failure, as once it runs out of spare sectors it has to use to replace the ones that go bad then you'll be nearing EOL on your SSD – Maks Jul 18 '14 at 07:01
  • 4
    The PDF can now be found at http://www.samsung.com/global/business/semiconductor/minisite/SSD/M2M/download/07_Communicating_With_Your_SSD.pdf – Force May 22 '17 at 21:40
26

For (at least some) NVMe drives, you can do

smartctl -a /dev/nvme0

You can then look for a line like:

Percentage Used:                    5%

Here lower numbers are better and 100% means the drive is "worn out". Manufacturer documentation suggests that it is possible to get numbers above 100% if you keep using the drive beyond this point (example from Seagate, see page 12).

Note that if you use the namespace or partition devices, like /dev/nvme0n1 or /dev/nvme0n1p1, it won't work and you will instead get a message like Read NVMe SMART/Health Information failed: NVMe Status 0x4002.
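
To pull just that percentage out for scripting, here is a minimal sketch. It assumes the line looks exactly like the "Percentage Used:" example above; the function name nvme_percentage_used is made up for illustration.

```shell
# Print the number from the "Percentage Used:" line of NVMe SMART output.
# Reads `smartctl -a` output on stdin; the function name is hypothetical.
nvme_percentage_used() {
    # Split on ":" and strip spaces and the trailing "%" from the value
    awk -F: '/^Percentage Used/ { gsub(/[ %]/, "", $2); print $2 }'
}

# Usage (needs root and a real NVMe drive):
#   smartctl -a /dev/nvme0 | nvme_percentage_used
```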

Nate Eldredge
  • 844
  • 9
  • 20
  • Strange... I got "percentage used: 7%" however I also got the "Storage Device Problems" The storage device HS-SSD-C2000 (/dev/nvme0n1) is likely to fail soon! – erwin Jul 28 '21 at 12:20
17

For Kingston drives on Debian-based computers

As in this answer, install smartmontools:

# apt-get install smartmontools

However, when I ran the command to show the drive info, it looked like SMART was disabled:

# smartctl -a /dev/sda 
smartctl 6.2 2013-07-26 r3841 [x86_64-linux-3.13.0-45-generic] (local build)
[ ... ]
SMART support is: Available - device has SMART capability.
SMART support is: Disabled

You need to enable that by executing the following as root:

# smartctl -s on -a /dev/sda

You can then execute a self-test by doing either a short test (which took me about 1 minute):

# smartctl -t short -a /dev/sda

or a more thorough test (which took me about 1.5 hours):

# smartctl -t long -a /dev/sda

Note, in most circumstances you do not need to unmount the drive to execute these tests. If you do, see man smartctl.

Now, when you execute smartctl -a /dev/sda you should then see a self-assessment test result. This is probably all you really need to concern yourself with:

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

If you like details, you will also see a table like this:

ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x0032   095   095   050    Old_age   Always       -       0/178007034
  5 Retired_Block_Count     0x0033   100   100   003    Pre-fail  Always       -       0
  9 Power_On_Hours_and_Msec 0x0032   092   092   000    Old_age   Always       -       7626h+46m+45.580s
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       8
171 Program_Fail_Count      0x000a   100   100   000    Old_age   Always       -       0
172 Erase_Fail_Count        0x0032   100   100   000    Old_age   Always       -       0
174 Unexpect_Power_Loss_Ct  0x0030   000   000   000    Old_age   Offline      -       4
177 Wear_Range_Delta        0x0000   000   000   000    Old_age   Offline      -       1
181 Program_Fail_Count      0x000a   100   100   000    Old_age   Always       -       0
182 Erase_Fail_Count        0x0032   100   100   000    Old_age   Always       -       0
187 Reported_Uncorrect      0x0012   100   100   000    Old_age   Always       -       0
189 Airflow_Temperature_Cel 0x0000   030   035   000    Old_age   Offline      -       30 (Min/Max 24/35)
194 Temperature_Celsius     0x0022   030   035   000    Old_age   Always       -       30 (Min/Max 24/35)
195 ECC_Uncorr_Error_Count  0x001c   120   120   000    Old_age   Offline      -       0/178007034
196 Reallocated_Event_Count 0x0033   100   100   003    Pre-fail  Always       -       0
201 Unc_Soft_Read_Err_Rate  0x001c   120   120   000    Old_age   Offline      -       0/178007034
204 Soft_ECC_Correct_Rate   0x001c   120   120   000    Old_age   Offline      -       0/178007034
230 Life_Curve_Status       0x0013   100   100   000    Pre-fail  Always       -       100
231 SSD_Life_Left           0x0013   100   100   010    Pre-fail  Always       -       0
233 SandForce_Internal      0x0032   000   000   000    Old_age   Always       -       3498
234 SandForce_Internal      0x0032   000   000   000    Old_age   Always       -       2885
241 Lifetime_Writes_GiB     0x0032   000   000   000    Old_age   Always       -       2885
242 Lifetime_Reads_GiB      0x0032   000   000   000    Old_age   Always       -       868

If you are looking for what all of these values mean, see the Kingston documentation.
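
If the table is too much to read by eye, a rough filter can help. This is only a sketch: it relies on the rule from the smartctl man page that an attribute counts as failed when its normalized VALUE is less than or equal to THRESH, and it assumes (my assumption) that a THRESH of 000 means the attribute has no failure threshold and can be skipped. The function name smart_failing_attrs is made up.

```shell
# List attributes whose normalized VALUE (col 4) is at or below THRESH (col 6).
# Reads a smartctl attribute table on stdin; the function name is hypothetical.
smart_failing_attrs() {
    awk '
        # Only rows that start with a numeric attribute ID
        $1 ~ /^[0-9]+$/ &&
        # Assumption: THRESH 000 means "no threshold", so skip those rows
        $6 + 0 > 0 &&
        $4 + 0 <= $6 + 0 { print $2 }
    '
}

# Usage (needs root and a real drive):
#   smartctl -A /dev/sda | smart_failing_attrs
```

With the table shown above it prints nothing, since every attribute with a non-zero threshold is still well above it.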

Mike
  • 693
  • If you're going to downvote, at least leave a comment... – Mike Nov 24 '15 at 19:31
  • 2
    Not all Kingston SSDs support them all. Some that don't (like my UV400) seem to show random numbers in those fields they do not support. – otus Sep 01 '16 at 12:04
  • the answer is missing some more useful hints about the smart details. Just to make sure the reader is appropriately reading the resulting table of values. In short, it seems to me that the SSD_Life_Left value is the most straightforward indicator. If 100, brand new ssd, if 1, a walking dead ssd. – mh-cbon Dec 04 '18 at 10:33
  • @mh-cbon Since the answer is already relatively wordy, I simply linked to the Kingston documentation since there is a LOT of details there. However if you feel you can improve the answer, feel free to edit it. – Mike Dec 04 '18 at 19:02
7

Wear_Leveling_Count is the right attribute to track. However, like the other attributes, 100 is the BEST value and 0 is the WORST. Think of it as "percent life remaining".
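
Since the attribute name differs between drives, one option is to look for all the wear-related names mentioned in this thread at once. This is a sketch, not an exhaustive list: the four names are only the ones that appear in the answers here, the function name smart_wear_attrs is made up, and the label smartctl prints may not match what the manufacturer really stores in that attribute.

```shell
# Print "NAME VALUE" for any wear-related attribute named in this thread.
# Reads a smartctl attribute table on stdin; the function name is hypothetical.
smart_wear_attrs() {
    awk '
        $2 ~ /^(Media_Wearout_Indicator|Wear_Leveling_Count|SSD_Life_Left|Percent_Lifetime_Remain)$/ {
            # %d turns the zero-padded normalized value (e.g. 098) into 98
            printf "%s %d\n", $2, $4
        }'
}

# Usage (needs root and a real drive):
#   smartctl -A /dev/sda | smart_wear_attrs
```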

Jim Van Zandt
  • 99
  • 1
  • 3
5

The best way to check the health of an SSD is to follow the manufacturer's recommendations for doing so. As these vary from manufacturer to manufacturer and may change over time, it's a good idea to check with your drive's manufacturer if you have concerns. Based on MTBF ratings (the JEDEC JESD218A standard defines the method) provided by most manufacturers, an SSD should last well over a million hours without a problem.

I have several of these covering several manufacturers. I can guarantee that the SMART attributes vary between manufacturers. For comparison purposes here's an example from OCZ and smart data from a Corsair F40 unit along with a discussion regarding how unreliable this data is.

While SMART data can certainly have value, since all devices fail eventually, the important thing is that you back up your data regularly. This provides peace of mind that your data is safe while you wait (likely for several years) for your SSD to fail. As costs drop and capacities rise, it's more likely that you'll replace an SSD due to space constraints than due to failure (in my experience, 10x more likely). I would simply back up regularly and not worry about it.

Sources:

Experience, http://www.hardcoreware.net/mtbf-ssd-what-does-it-mean-for-you/

Elder Geek
  • 36,023
  • 25
  • 98
  • 183
1

For my SSD (hdparm prints Model Number: CT480BX500SSD1) the parameter name was Percent_Lifetime_Remain, i.e.

$ sudo smartctl -a /dev/sda | grep Percent_Lifetime_Remain

showed:

202 Percent_Lifetime_Remain 0x0030   098   098   001    Old_age   Offline      -       2

I've been using this system for ~4 months, quite actively (backend software development), and I've used up 2% of the lifetime so far. Maybe I should think about getting a better SSD.
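
For a rough idea of what that means in time, you can extrapolate linearly. This is a back-of-the-envelope sketch that assumes the drive keeps wearing at the same rate, which may well not hold; the function name months_left is made up.

```shell
# Estimate the remaining months from percent of life used and months in use.
# Usage: months_left USED_PERCENT MONTHS_IN_USE
# Assumes wear continues linearly; the function name is hypothetical.
months_left() {
    awk -v used="$1" -v months="$2" 'BEGIN {
        # No measurable wear yet, so there is no rate to extrapolate from
        if (used <= 0) { print "unknown"; exit }
        printf "%d\n", (100 - used) * months / used
    }'
}

# Example with the numbers above (2% used after 4 months):
#   months_left 2 4    # prints 196
```

That works out to 196 months, or a bit over 16 years, at the current rate.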

ivan.ukr
  • 391
  • 4
  • 12
0

Ubuntu 22.04 here, drive is KINGSTON SA400S37240G (S3E00101) (as reported by GnomeDisks).

The health of my SSD is shown by its temperature reading. When I had just installed it in my computer, the temperature was 100ºC; over time this temperature has been dropping (now it's 88ºC). In smartmon, this value is shown as the wear of the SSD.

anonymous2
  • 4,298