I am running Ubuntu 22.04 on my main laptop, with a 4TB TEAMGROUP MP34 NVMe as the main drive. The file system is ext4.
Yesterday (Nov 16), while I was downloading some large files (about 300 files, 600GB total), my laptop suddenly started acting weirdly. Everything became very slow and the system crashed. I was able to repair it with a bootable USB and fsck. However, the laptop was still very slow and the NVMe SSD was getting very hot, about 75 degrees Celsius (usually it's less than 35 degrees). The disk was only about 35% full. I ran a benchmark on the disk and the speeds were inconsistent and very slow. After several minutes of work the disk went into read-only mode.
Initially, I thought there was a hardware problem. I opened the laptop and cleaned the contacts with isopropyl alcohol. I swapped the NVMe for another one and the laptop worked normally. I reinstalled my original NVMe and the laptop was very slow again. At some point I decided to run sudo fstrim -av. It took about 5-6 minutes (trimmed about 2.9TB) and after that the laptop started working like new. I have been using it without any problems for more than 5 days now. I did some stress tests and benchmarks, and everything works normally.
The output of the manual sudo fstrim -av I did on Nov 16:
/boot/efi: 504.9 MiB (529436672 bytes) trimmed on /dev/nvme0n1p1
/: 2.9 TiB (3138692276224 bytes) trimmed on /dev/nvme0n1p2
It looks like fstrim.service was working fine:
cat /var/log/syslog | grep -a fstrim
Nov 13 01:43:37 dev fstrim[98095]: /boot/efi: 504.9 MiB (529436672 bytes) trimmed on /dev/nvme0n1p1
Nov 13 01:43:37 dev fstrim[98095]: /: 2.9 TiB (3140636598272 bytes) trimmed on /dev/nvme0n1p2
Nov 13 01:43:37 dev systemd[1]: fstrim.service: Deactivated successfully.
The last TRIM looks more normal:
cat /var/log/syslog | grep -a fstrim
Nov 20 01:26:54 dev fstrim[109477]: /boot/efi: 504.9 MiB (529436672 bytes) trimmed on /dev/nvme0n1p1
Nov 20 01:26:54 dev fstrim[109477]: /: 31.5 GiB (33783455744 bytes) trimmed on /dev/nvme0n1p2
Nov 20 01:26:54 dev systemd[1]: fstrim.service: Deactivated successfully.
The NVMe is pretty new and in good condition:
sudo smartctl -a /dev/nvme0
smartctl 7.2 2020-12-30 r5155 [x86_64-linux-6.2.0-36-generic] (local build)
Copyright (C) 2002-20, Bruce Allen, Christian Franke, www.smartmontools.org
=== START OF INFORMATION SECTION ===
Model Number: TEAM TM8FP4004T
Serial Number: xxxxxxxxxxxxxxxxxxxxx
Firmware Version: VB421D65
PCI Vendor/Subsystem ID: 0x10ec
IEEE OUI Identifier: 0x00e04c
Controller ID: 1
NVMe Version: 1.3
Number of Namespaces: 1
Namespace 1 Size/Capacity: 4,096,805,658,624 [4.09 TB]
Namespace 1 Formatted LBA Size: 512
Local Time is: Fri Nov 17 12:57:17 2023 EET
Firmware Updates (0x02): 1 Slot
Optional Admin Commands (0x0017): Security Format Frmw_DL Self_Test
Optional NVM Commands (0x0014): DS_Mngmt Sav/Sel_Feat
Log Page Attributes (0x02): Cmd_Eff_Lg
Maximum Data Transfer Size: 32 Pages
Warning Comp. Temp. Threshold: 100 Celsius
Critical Comp. Temp. Threshold: 110 Celsius
Supported Power States
St Op Max Active Idle RL RT WL WT Ent_Lat Ex_Lat
0 + 8.00W - - 0 0 0 0 230000 50000
1 + 4.00W - - 1 1 1 1 4000 50000
2 + 3.00W - - 2 2 2 2 4000 250000
3 - 0.50W - - 3 3 3 3 4000 8000
4 - 0.0090W - - 4 4 4 4 8000 30000
Supported LBA Sizes (NSID 0x1)
Id Fmt Data Metadt Rel_Perf
0 + 512 0 0
=== START OF SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
SMART/Health Information (NVMe Log 0x02)
Critical Warning: 0x00
Temperature: 35 Celsius
Available Spare: 100%
Available Spare Threshold: 32%
Percentage Used: 0%
Data Units Read: 4,447,105 [2.27 TB]
Data Units Written: 8,885,998 [4.54 TB]
Host Read Commands: 48,182,921
Host Write Commands: 112,476,615
Controller Busy Time: 0
Power Cycles: 34
Power On Hours: 2,423
Unsafe Shutdowns: 11
Media and Data Integrity Errors: 0
Error Information Log Entries: 0
Warning Comp. Temperature Time: 0
Critical Comp. Temperature Time: 0
Error Information (NVMe Log 0x01, 8 of 8 entries)
No Errors Logged
Output of journalctl | grep "fstrim.*/:":
Jul 03 00:21:43 dev fstrim[27756]: /: 3.6 TiB (4009258434560 bytes) trimmed on /dev/nvme0n1p2
Jul 10 00:54:49 dev fstrim[1244594]: /: 3.6 TiB (4001406066688 bytes) trimmed on /dev/nvme0n1p2
Jul 17 00:32:58 dev fstrim[4040993]: /: 54.6 GiB (58677125120 bytes) trimmed on /dev/nvme0n1p2
Jul 24 00:29:14 dev fstrim[1600660]: /: 138.8 GiB (149000179712 bytes) trimmed on /dev/nvme0n1p2
Jul 31 00:35:13 dev fstrim[620323]: /: 135.8 GiB (145785393152 bytes) trimmed on /dev/nvme0n1p2
Aug 07 00:13:04 dev fstrim[35853]: /: 2.9 TiB (3226885373952 bytes) trimmed on /dev/nvme0n1p2
Aug 14 00:29:27 dev fstrim[125210]: /: 2.9 TiB (3230223196160 bytes) trimmed on /dev/nvme0n1p2
Aug 21 01:32:45 dev fstrim[332311]: /: 56.8 GiB (61013270528 bytes) trimmed on /dev/nvme0n1p2
Aug 28 00:11:05 dev fstrim[586592]: /: 90.3 GiB (96974286848 bytes) trimmed on /dev/nvme0n1p2
Sep 04 01:28:47 dev fstrim[16608]: /: 3 TiB (3257704198144 bytes) trimmed on /dev/nvme0n1p2
Sep 11 00:22:26 dev fstrim[21637]: /: 2.9 TiB (3238865485824 bytes) trimmed on /dev/nvme0n1p2
Sep 18 01:14:48 dev fstrim[126317]: /: 2.9 TiB (3240947859456 bytes) trimmed on /dev/nvme0n1p2
Sep 25 00:22:54 dev fstrim[410142]: /: 36.2 GiB (38895230976 bytes) trimmed on /dev/nvme0n1p2
Oct 02 00:31:31 dev fstrim[90432]: /: 3 TiB (3249296408576 bytes) trimmed on /dev/nvme0n1p2
Oct 09 00:48:51 dev fstrim[319128]: /: 54.2 GiB (58184278016 bytes) trimmed on /dev/nvme0n1p2
Oct 16 01:11:15 dev fstrim[29502]: /: 2.8 TiB (3103039946752 bytes) trimmed on /dev/nvme0n1p2
Oct 23 00:31:40 dev fstrim[85578]: /: 2.9 TiB (3152333541376 bytes) trimmed on /dev/nvme0n1p2
Oct 30 01:16:53 dev fstrim[212523]: /: 2.9 TiB (3140076969984 bytes) trimmed on /dev/nvme0n1p2
Nov 06 01:11:08 dev fstrim[38462]: /: 2.9 TiB (3138336178176 bytes) trimmed on /dev/nvme0n1p2
Nov 13 01:43:37 dev fstrim[98095]: /: 2.9 TiB (3140636598272 bytes) trimmed on /dev/nvme0n1p2
Nov 20 01:26:54 dev fstrim[109477]: /: 31.5 GiB (33783455744 bytes) trimmed on /dev/nvme0n1p2
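The weekly pattern in the log above is easier to see when each run is reduced to a date, a size, and a flag. Here is a minimal sketch, assuming the log lines keep the "(N bytes)" field that fstrim prints; summarize_trims is a hypothetical helper name and the 1024 GiB default threshold is an arbitrary choice, neither comes from fstrim itself.

```shell
# Hypothetical helper (not part of fstrim or util-linux).
# Reads fstrim log lines on stdin and prints "<month> <day> <GiB> <flag>".
# $1: threshold in GiB above which a run is flagged LARGE (default: 1024).
# Lines without a "(N bytes)" field are skipped.
summarize_trims() {
    LC_ALL=C awk -v limit="${1:-1024}" '
        match($0, /\([0-9]+ bytes\)/) {
            # Keep only the digits between "(" and " bytes)".
            bytes = substr($0, RSTART + 1, RLENGTH - 8)
            gib = bytes / (1024 * 1024 * 1024)
            printf "%s %s %.1f %s\n", $1, $2, gib, (gib > limit ? "LARGE" : "ok")
        }'
}
```

It could be fed from the same pipeline used above, for example journalctl | grep "fstrim.*/:" | summarize_trims, which makes the alternation between runs of a few tens of GiB and runs of roughly 3 TiB stand out.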
Although it is an old question, this one is related to the numbers above: "Large amount of data trimmed after running fstrim". I don't restart my laptop very often, and it's normal for me to have a few weeks of uptime.
I have been using SSDs for years and this is the first time I have experienced a problem like this. It is also the first time I have had to run fstrim manually, so I am a bit puzzled. What could have caused this behavior? Is it normal? Is there a way to know if my NVMe SSD needs TRIM?
[...] smartctl [...] I have only written about 4.5TB to the disk. I have been using this NVMe since June, so it's about 5 months old. I am pretty sure I didn't write 2.9TB between November 6th and November 13th. Also, the problem I am describing happened about 3 days after this big TRIM. On November 16th I did a manual sudo fstrim -av, and it trimmed another 2.9TB. btw I just updated my question with my last log from yesterday, it looks more normal. I am open to any ideas about what could have caused this. – sotirov Nov 21 '23 at 09:04

[...] fstrim.service suddenly trims an extremely large chunk of data, this could be an indication that something is amiss. I've never experienced exactly what you describe here, but I assume it must be some used blocks that haven't been properly cleaned up. Maybe an idea would be to make an alias that runs a manual trim, which you can run after downloading files of a certain size (say 25% of the disk). – Artur Meinild Nov 21 '23 at 11:19

[...] sudo dd if=/dev/zero of=benchmark.img bs=1G count=5 status=progress. I'd say if it's less than half of disk specs, or lower than 250 MB/s, then do a trim. – Artur Meinild Nov 21 '23 at 11:25

[...] sudo fstrim -av just now, only 73 GB was trimmed. – muru Nov 21 '23 at 11:38

[...] trim tag for the past 5 years. The current read/write speeds are very close to what the manufacturer says they should be: 3500MB/sec read and 2900MB/sec write. But when I had the problem, both read and write speeds were very inconsistent, sometimes dropping to 1/10 of what they should be. – sotirov Nov 21 '23 at 11:55

[...] ext4, not sure how relevant this is, but adding it to the question. – sotirov Nov 21 '23 at 12:01

[...] fstrim command output looks like nonsense to me. The fstrim service runs once per week on my system and *reports almost the same value each time*: ~250 GB for /home, ~200 GB for /, and 500 MB for /boot/efi. Obviously my / filesystem doesn't change so much (and it is almost 34GB full = 16%), my /home filesystem is also not so busy (and it is at most 240 MB full = 50%), and the EFI filesystem is almost static (and contains no more than 7MB of data = 2% full!) – FedKad Nov 21 '23 at 16:05
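The rule of thumb from Artur Meinild's comment (trim when the measured write speed is below half of the drive's rated speed, or below 250 MB/s) can be written as a small check. This is only a sketch of that heuristic: should_trim is a hypothetical helper name, and both speeds are assumed to be whole numbers in MB/s that you supply yourself, for example by reading the throughput off the dd run and the rated figure off the spec sheet.

```shell
# Hypothetical helper: succeeds (exit status 0) when the measured write
# speed suggests a manual trim, following the rule of thumb in the comments.
# $1: measured write speed in MB/s (integer)
# $2: rated write speed from the drive's spec sheet in MB/s (integer)
should_trim() {
    measured=$1
    rated=$2
    # True if below half the rated speed, or below 250 MB/s regardless.
    [ $((measured * 2)) -lt "$rated" ] || [ "$measured" -lt 250 ]
}
```

With the figures mentioned in this thread, should_trim 290 2900 && sudo fstrim -av would run the trim (290 MB/s is about 1/10 of the rated 2900 MB/s write speed), while should_trim 2800 2900 would not. A slow benchmark can have other causes, such as thermal throttling, so this is a prompt to investigate rather than a diagnosis.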