
One of our Samsung 2TB NVMe SSDs recently failed, so we swapped it for a new stick and have started paying careful attention to the SMART data.

Here is the output from a drive that was installed less than two weeks ago:

root@~ $ smartctl -a /dev/nvme0n1p1
smartctl 7.1 2019-12-30 r5022 [x86_64-linux-5.4.0-53-generic] (local build)
Copyright (C) 2002-19, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Number:                       Samsung SSD 970 EVO Plus 2TB
Serial Number:                      S59CNZFNA02015F
Firmware Version:                   2B2QEXM7
PCI Vendor/Subsystem ID:            0x144d
IEEE OUI Identifier:                0x002538
Total NVM Capacity:                 2,000,398,934,016 [2.00 TB]
Unallocated NVM Capacity:           0
Controller ID:                      4
Number of Namespaces:               1
Namespace 1 Size/Capacity:          2,000,398,934,016 [2.00 TB]
Namespace 1 Utilization:            129,469,706,240 [129 GB]
Namespace 1 Formatted LBA Size:     512
Namespace 1 IEEE EUI-64:            002538 5a019ed120
Local Time is:                      Sun Nov 22 22:11:40 2020 EST
Firmware Updates (0x16):            3 Slots, no Reset required
Optional Admin Commands (0x0017):   Security Format Frmw_DL Self_Test
Optional NVM Commands (0x005f):     Comp Wr_Unc DS_Mngmt Wr_Zero Sav/Sel_Feat Timestmp
Maximum Data Transfer Size:         512 Pages
Warning  Comp. Temp. Threshold:     85 Celsius
Critical Comp. Temp. Threshold:     85 Celsius

Supported Power States
St Op     Max   Active     Idle   RL RT WL WT  Ent_Lat  Ex_Lat
 0 +     7.50W       -        -    0  0  0  0        0       0
 1 +     5.90W       -        -    1  1  1  1        0       0
 2 +     3.60W       -        -    2  2  2  2        0       0
 3 -   0.0700W       -        -    3  3  3  3      210    1200
 4 -   0.0050W       -        -    4  4  4  4     2000    8000

Supported LBA Sizes (NSID 0x1)
Id Fmt  Data  Metadt  Rel_Perf
 0 +     512       0         0

=== START OF SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

SMART/Health Information (NVMe Log 0x02)
Critical Warning:                   0x00
Temperature:                        42 Celsius
Available Spare:                    100%
Available Spare Threshold:          10%
Percentage Used:                    0%
Data Units Read:                    14,723 [7.53 GB]
Data Units Written:                 4,508,008 [2.30 TB]
Host Read Commands:                 243,468
Host Write Commands:                176,596,876
Controller Busy Time:               1,060
Power Cycles:                       4
Power On Hours:                     205
Unsafe Shutdowns:                   3
Media and Data Integrity Errors:    0
Error Information Log Entries:      0
Warning  Comp. Temperature Time:    0
Critical Comp. Temperature Time:    0
Temperature Sensor 1:               42 Celsius
Temperature Sensor 2:               46 Celsius

Error Information (NVMe Log 0x01, max 64 entries)
No Errors Logged

The part that has us concerned is this:

Data Units Written:                 4,508,008 [2.30 TB]

The rated write endurance is 250 TB, so 2.3 TB written in under two weeks is insane and doesn't make any sense. (For reference, each NVMe data unit is 1,000 × 512-byte blocks, which is how 4,508,008 units works out to ~2.3 TB.)

How do we go about trying to figure out why this number is so high?
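For context, here is the quick check we have been running to watch the counter grow; a rough sketch, assuming the smartctl output format above and that /dev/nvme0 is the right device:

# Snapshot "Data Units Written", wait an hour, then diff (run as root).
# One NVMe data unit = 1,000 x 512 bytes = 512,000 bytes.
before=$(smartctl -a /dev/nvme0 | awk '/Data Units Written/ {gsub(",", "", $4); print $4}')
sleep 3600
after=$(smartctl -a /dev/nvme0 | awk '/Data Units Written/ {gsub(",", "", $4); print $4}')
echo "Written in the last hour: $(( (after - before) * 512000 / 1024 / 1024 )) MiB"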

Thanks!

========================

@heynnema thanks for following up! Here are the responses to your comments (FYI, I disabled Ubuntu swap after installing the new SSD):

root@~ $ free -h
              total        used        free      shared  buff/cache   available
Mem:          251Gi        42Gi       153Gi       3.0Mi        56Gi       208Gi
Swap:            0B          0B          0B

root@~ $ sysctl vm.swappiness
vm.swappiness = 60

root@~ $ grep -i swap /etc/fstab
#/swap.img none swap sw 0 0
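For the record, turning swap off amounted to the following; a sketch, assuming the stock Ubuntu swap.img setup:

# Deactivate all swap immediately; the commented-out /etc/fstab line
# above keeps it disabled across reboots.
sudo swapoff -a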

========================

Additional info:

I ran iotop as follows:

iotop -ao

and have this after running for a while:

Total DISK READ :       0.00 B/s | Total DISK WRITE :     147.34 K/s
Actual DISK READ:       0.00 B/s | Actual DISK WRITE:     357.38 K/s
  TID  PRIO  USER     DISK READ DISK WRITE>  SWAPIN      IO    COMMAND
29546 be/4 999           0.00 B    212.62 M  0.00 %  0.01 % mongod --auth --bind_ip_all [WTCheck.tThread]
  855 be/3 root          0.00 B    101.82 M  0.00 %  1.65 % [jbd2/nvme1n1p1-]
 1841 be/4 root          0.00 B     33.69 M  0.00 %  0.00 % python /opt/conda/bin/supervisord -c /etc/supervisor/supervisord.conf

It looks like the culprits are mongod and jbd2. How do I figure out what jbd2 is doing? Thanks, everyone, for your help!
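In case it helps others, here is what I am trying next to see what jbd2 flushes; a sketch, assuming the blktrace package is installed and that jbd2/nvme1n1p1 is journaling the filesystem on /dev/nvme1n1:

# Trace block-layer I/O on the device and filter for the journal thread.
# jbd2 is ext4's journaling daemon, so heavy jbd2 traffic usually mirrors
# frequent small commits/fsyncs from some other process (mongod here).
sudo blktrace -d /dev/nvme1n1 -o - | blkparse -i - | grep jbd2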

  • Are you using noatime in your mounts in fstab? cat /etc/fstab – oldfred Nov 23 '20 at 03:56
  • As mentioned below, iotop might help. However, edit your question and show me free -h and sysctl vm.swappiness and grep -i swap /etc/fstab. Start comments to me with @heynnema or I'll miss them. – heynnema Nov 23 '20 at 15:22

2 Answers


You can check it with iotop. It won't show you the total lifetime writes to a drive, but it will let you see whether apps are writing to the drive(s) a lot.

sudo apt install iotop

Then run it with elevated permissions:

sudo iotop

You should see something like the following:

Total DISK READ:         0.00 B/s | Total DISK WRITE:       248.20 K/s
Current DISK READ:       0.00 B/s | Current DISK WRITE:       0.00 B/s
    TID  PRIO  USER     DISK READ  DISK WRITE  SWAPIN     IO>    COMMAND        
1780425 be/4 root        0.00 B/s    0.00 B/s  0.00 %  0.37 % [kworker~e_power_]
   5170 be/4 terrance    0.00 B/s  248.20 K/s  0.00 %  0.00 % firefox ~orage #3]
      1 be/4 root        0.00 B/s    0.00 B/s  0.00 %  0.00 % init nosplash
      2 be/4 root        0.00 B/s    0.00 B/s  0.00 %  0.00 % [kthreadd]
      3 be/0 root        0.00 B/s    0.00 B/s  0.00 %  0.00 % [rcu_gp]
      4 be/0 root        0.00 B/s    0.00 B/s  0.00 %  0.00 % [rcu_par_gp]
      6 be/0 root        0.00 B/s    0.00 B/s  0.00 %  0.00 % [kworker~-kblockd]
      8 be/0 root        0.00 B/s    0.00 B/s  0.00 %  0.00 % [mm_percpu_wq]
      9 be/4 root        0.00 B/s    0.00 B/s  0.00 %  0.00 % [ksoftirqd/0]
     10 be/4 root        0.00 B/s    0.00 B/s  0.00 %  0.00 % [rcu_sched]
     11 rt/4 root        0.00 B/s    0.00 B/s  0.00 %  0.00 % [migration/0]
     12 rt/4 root        0.00 B/s    0.00 B/s  0.00 %  0.00 % [idle_inject/0]
     14 be/4 root        0.00 B/s    0.00 B/s  0.00 %  0.00 % [cpuhp/0]
     15 be/4 root        0.00 B/s    0.00 B/s  0.00 %  0.00 % [cpuhp/1]
     16 rt/4 root        0.00 B/s    0.00 B/s  0.00 %  0.00 % [idle_inject/1]
     17 rt/4 root        0.00 B/s    0.00 B/s  0.00 %  0.00 % [migration/1]
     18 be/4 root        0.00 B/s    0.00 B/s  0.00 %  0.00 % [ksoftirqd/1]
     20 be/0 root        0.00 B/s    0.00 B/s  0.00 %  0.00 % [kworker~-kblockd]
     21 be/4 root        0.00 B/s    0.00 B/s  0.00 %  0.00 % [cpuhp/2]
  keys:  any: refresh  q: quit  i: ionice  o: active  p: procs  a: accum        
  sort:  r: asc  left: SWAPIN  right: COMMAND  home: TID  end: COMMAND          
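If you want cumulative totals per process rather than the instantaneous rates shown above, accumulated mode may help (flags per iotop's man page: -a accumulates I/O since iotop started, -o shows only active tasks, -P groups threads by process):

sudo iotop -aoP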

Hope this helps!

– Terrance

========================

The important variable is Percentage Used, which is currently 0%. When it reaches 1%, multiply the number of months the drive has been in service by 100 to estimate its total life span in months.
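As a rough sketch of that arithmetic (assuming wear accrues linearly; the device name is an example):

# Pull the wear counter and the drive's age (run as root).
smartctl -a /dev/nvme0 | grep -E 'Percentage Used|Power On Hours'

# Example: 1% used after 205 power-on hours projects to roughly
# 100 * 205 = 20,500 hours (~2.3 years) of total life.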

See: How do I check system health?

  • thanks - we checked one of our other machines, which runs a similar config, and Percentage Used was 7%. That SSD is ~3 months old. We are trying to see whether the Docker containers are doing excessive writes, whether it's mongodb, or something else. – vgoklani Nov 23 '20 at 16:59