Brand new, fully optioned Alienware M15 R7 with dual SSDs: Windows 11 Home on disk0, Ubuntu LTS on disk1 (a Samsung 980 Pro SSD). LUKS and LVM are configured in tandem on Ubuntu (disk1). Everything has worked great since the initial system install. HOWEVER, on Ubuntu (and on MX Linux before it) my drive intermittently locks up, about once a week, as if the LUKS key had vanished, and the system can no longer read from or write to the SSD. I receive errors like "cannot read superblock 10111001" or "disk not writable", that sort of thing. Snap application processes, such as Firefox, die. The system becomes unusable and my only option seems to be a hard shutdown by pressing the physical power button, pouring myself a drink, and returning a bit later to turn it back on. Things work normally for a week or so, and then the issue returns. Kind of annoying, especially since everything else works so well.
About the drive: Samsung 980 Pro, installed carefully by myself. Samsung provides a software utility for the drive called Magician, but it was made only for Windows. Magician is supposed to help you tweak settings, verify functionality, troubleshoot problems with the disk, etc. Since it does not run on Linux, I enabled both disks in my BIOS, booted into Windows 11, and ran Magician from there. It confirmed that the firmware was the latest version, and every "passive" test was marked as "passed". Since the drive was encrypted with LUKS, none of the read or write tests could run, obviously. One thing I just thought of: perhaps I could try running Magician on Ubuntu with Wine or some equivalent, to see if the utility can take a peek into the internals of the SSD while the drive is unlocked.
I did install GSmartControl on Ubuntu, and I noticed I was not able to enable SMART or run any tests. The following output was given for my device:
smartctl 7.2 2020-12-30 r5155 [x86_64-linux-5.15.0-46-generic] (local build)
Copyright (C) 2002-20, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Number:                       Samsung SSD 980 PRO 2TB
Serial Number:                      S6B0NS0T405635B
Firmware Version:                   5B2QGXA7
PCI Vendor/Subsystem ID:            0x144d
IEEE OUI Identifier:                0x002538
Total NVM Capacity:                 2,000,398,934,016 [2.00 TB]
Unallocated NVM Capacity:           0
Controller ID:                      6
NVMe Version:                       1.3
Number of Namespaces:               1
Namespace 1 Size/Capacity:          2,000,398,934,016 [2.00 TB]
Namespace 1 Utilization:            62,475,145,216 [62.4 GB]
Namespace 1 Formatted LBA Size:     512
Namespace 1 IEEE EUI-64:            002538 b4214024ee
Local Time is:                      Fri Aug 12 23:36:03 2022 MST
Firmware Updates (0x16):            3 Slots, no Reset required
Optional Admin Commands (0x0017):   Security Format Frmw_DL Self_Test
Optional NVM Commands (0x0057):     Comp Wr_Unc DS_Mngmt Sav/Sel_Feat Timestmp
Log Page Attributes (0x0f):         S/H_per_NS Cmd_Eff_Lg Ext_Get_Lg Telmtry_Lg
Maximum Data Transfer Size:         128 Pages
Warning  Comp. Temp. Threshold:     82 Celsius
Critical Comp. Temp. Threshold:     85 Celsius

Supported Power States
St Op     Max   Active     Idle   RL RT WL WT  Ent_Lat  Ex_Lat
 0 +     8.49W       -        -    0  0  0  0        0       0
 1 +     4.48W       -        -    1  1  1  1        0     200
 2 +     3.18W       -        -    2  2  2  2        0    1000
 3 -   0.0400W       -        -    3  3  3  3     2000    1200
 4 -   0.0050W       -        -    4  4  4  4      500    9500

Supported LBA Sizes (NSID 0x1)
Id Fmt  Data  Metadt  Rel_Perf
 0 +     512       0         0

=== START OF SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
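In case it helps anyone compare this kind of output between a normal boot and the next lock-up, here is a minimal Python sketch that pulls the "Key: value" fields out of smartctl's text output. The function name `parse_smartctl_fields` and the shortened sample text are my own illustration, not part of smartmontools, and the parsing simply assumes the text layout shown above.

```python
def parse_smartctl_fields(text):
    """Collect 'Key: value' lines from smartctl text output into a dict."""
    fields = {}
    for line in text.splitlines():
        # Split on the first colon only; values such as timestamps contain colons.
        key, sep, value = line.partition(":")
        if sep and value.strip():
            fields[key.strip()] = value.strip()
    return fields


# Shortened sample copied from the output above (hypothetical test input).
sample = """\
=== START OF INFORMATION SECTION ===
Model Number:                       Samsung SSD 980 PRO 2TB
Firmware Version:                   5B2QGXA7
Local Time is:                      Fri Aug 12 23:36:03 2022 MST
=== START OF SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
"""

info = parse_smartctl_fields(sample)
print(info["Firmware Version"])
print(info["SMART overall-health self-assessment test result"])
```

Saving the parsed fields after each incident would make it easy to spot whether the firmware revision or the health result changes over time.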
So that's everything I have done. I'm not sure what the issue is, exactly. Any help would be appreciated. Thank you.
sudo nvme list or udisksctl status to see current revision. – oldfred Aug 13 '22 at 13:19
sudo e2fsck -f /dev/ubuntu-vg/root – oldfred Aug 14 '22 at 15:19
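The two read-only commands from the first comment could be wrapped in a small script so their output can be captured the same way every time. This is only a sketch under assumptions: the helper name `run_checks` and the `CHECKS` list are hypothetical, `sudo` may prompt for a password, and `e2fsck` is deliberately left out because it should not be run against a mounted filesystem.

```python
import subprocess

# Read-only commands taken from the comment above; sudo may ask for a password.
CHECKS = [
    ["sudo", "nvme", "list"],
    ["udisksctl", "status"],
]


def run_checks(commands):
    """Run each command and return {command string: output or error note}."""
    results = {}
    for cmd in commands:
        label = " ".join(cmd)
        try:
            proc = subprocess.run(
                cmd, capture_output=True, text=True, timeout=30
            )
            # Keep stdout on success, otherwise keep the error text.
            results[label] = proc.stdout if proc.returncode == 0 else proc.stderr
        except FileNotFoundError:
            results[label] = "command not found"
        except subprocess.TimeoutExpired:
            results[label] = "timed out"
    return results
```

Calling `run_checks(CHECKS)` and printing each entry gives one block of text per command, which can be pasted into the question or saved alongside the date of each lock-up.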