0

My Ubuntu system freezes and then shows a black screen with looping error messages:

systemd-journald: failed to write entry, ignoring: read-only file system

ext4-fs error device nvme0n1p2 ext4_find_entry:1454: inode #22152700: comm gdm-session-wor: reading directory lblock 0

What is this about and how can I solve this?

EDIT: I also had the error Buffer I/O error on device nvme0n1p2

EDIT 2: I was able to boot once and check the disk's health with smartmontools. The result is PASSED (but the problem still persists):

Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Number:                       KXG5AZNV512G TOSHIBA
Serial Number:                      385S1046T31Q
Firmware Version:                   5106AALA
PCI Vendor/Subsystem ID:            0x1179
IEEE OUI Identifier:                0x00080d
Total NVM Capacity:                 512.110.190.592 [512 GB]
Unallocated NVM Capacity:           0
Controller ID:                      0
Number of Namespaces:               1
Namespace 1 Size/Capacity:          512.110.190.592 [512 GB]
Namespace 1 Formatted LBA Size:     512
Local Time is:                      Fri Jan 31 14:18:35 2020 CET
Firmware Updates (0x14):            2 Slots, no Reset required
Optional Admin Commands (0x0017):   Security Format Frmw_DL *Other*
Optional NVM Commands (0x005f):     Comp Wr_Unc DS_Mngmt Wr_Zero Sav/Sel_Feat *Other*
Maximum Data Transfer Size:         512 Pages
Warning  Comp. Temp. Threshold:     78 Celsius
Critical Comp. Temp. Threshold:     82 Celsius
Namespace 1 Features (0x02):        NA_Fields

Supported Power States
St Op     Max   Active     Idle   RL RT WL WT  Ent_Lat  Ex_Lat
 0 +     8.00W       -        -    0  0  0  0        0       0
 1 +     3.90W       -        -    1  1  1  1        0       0
 2 +     2.00W       -        -    2  2  2  2        0       0
 3 -   0.0500W       -        -    3  3  3  3     1500    1500
 4 -   0.0050W       -        -    4  4  4  4     6000   14000
 5 -   0.0030W       -        -    5  5  5  5    50000   80000

Supported LBA Sizes (NSID 0x1)
Id Fmt  Data  Metadt  Rel_Perf
 0 +     512       0         2
 1 -    4096       0         1

=== START OF SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

SMART/Health Information (NVMe Log 0x02, NSID 0xffffffff)
Critical Warning:                   0x00
Temperature:                        30 Celsius
Available Spare:                    100%
Available Spare Threshold:          10%
Percentage Used:                    3%
Data Units Read:                    5.734.991 [2,93 TB]
Data Units Written:                 6.433.509 [3,29 TB]
Host Read Commands:                 205.819.666
Host Write Commands:                76.731.654
Controller Busy Time:               437
Power Cycles:                       1.362
Power On Hours:                     2.989
Unsafe Shutdowns:                   274
Media and Data Integrity Errors:    0
Error Information Log Entries:      0
Warning  Comp. Temperature Time:    0
Critical Comp. Temperature Time:    0
Temperature Sensor 1:               30 Celsius

Error Information (NVMe Log 0x01, max 128 entries)
No Errors Logged
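
For anyone comparing against their own report: the lines I would look at first are Critical Warning, Media and Data Integrity Errors, and Unsafe Shutdowns. Below is a minimal sketch for pulling single fields out of a saved report. It assumes the "Label: value" layout shown above; smart_field and smart.txt are made-up names for illustration, not part of smartmontools.

```shell
# smart_field is a hypothetical helper (not a smartmontools command).
# It prints the value of one "Label:   value" line from a saved report.
smart_field() {
    # $1 = label exactly as smartctl prints it, $2 = path to the saved report
    awk -F': *' -v key="$1" '$1 == key { print $2; exit }' "$2"
}

# Example on a real machine (commented out because it needs root and an NVMe drive):
#   sudo smartctl -a /dev/nvme0 > smart.txt
#   smart_field "Unsafe Shutdowns" smart.txt
#   smart_field "Media and Data Integrity Errors" smart.txt
```

Note that values containing a colon (such as the Local Time line) are cut at the first colon by this simple split, so it is only suited to the plain counters.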

EDIT 3: It seems like my SSD has some defect. I got in contact with Lenovo. They will send me a new SSD in exchange for my broken one.

Hard to believe this can happen just like that, as smartctl showed me the disk has a Percentage Used: 3% and the laptop isn't even 2 years old. Is there anything I can do to keep the disk healthy in the future?

EDIT 4: I managed to boot once (out of 50 attempts) and was able to timeshift back to an older stable state. Since then there have been no more errors (so far, at least), and the machine is running like a newborn! I successfully updated everything, with no errors there either. I reset my NVMe controller and ran sudo fsck -f /dev/nvme0n1p2, in which all tests passed (thanks @xenoid and @heynnema). I found this link, which described the same symptoms that I had; the solution there was to replace the SSD and motherboard. Not sure yet if this applies to me too.

EDIT 5: New updates: first I looked into temporarily installing Windows, but I want to keep this as my last resort, since I'd have to rebuild my whole Linux system. So I thought I could run Windows from a live USB, but no, that's not possible: Windows only ever allows a full install (ignoring difficult workarounds). Then I thought I could maybe run the utility software offered by Lenovo using Wine, but that didn't work as expected either. Using FreeDOS (as suggested in the YouTube video) might work, but I haven't tried it yet, and I'm also not sure where to find just the plain ISO file of the Toshiba firmware that I would need. Funnily, I couldn't find my NVMe model on the Toshiba firmware website. Then I came across fwupd. What a great tool, that is how I like it! And Lenovo even added support for my ThinkPad model, the T480s, to LVFS! Great! But not much firmware has been uploaded for my model yet. My Toshiba SSD is listed in LVFS, but the new firmware (the one suggested on the Dell website) hasn't been uploaded yet. I got in contact with Lenovo about this to speed things up. I also contacted Richard Hughes (creator of LVFS) to ask for his help. Since my laptop isn't suffering from the bug at the moment, I'll wait a little while; perhaps new developments will come up. So, as you can see, it's been an odyssey for me and it's still going :) I'm very grateful for all the help from the community, and please let me know if you have more ideas and thoughts!
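
For reference, this is roughly how I checked what LVFS offers, as a sketch rather than a recipe. The fwupdmgr subcommands are the standard ones; has_fixed_firmware is a made-up helper that only compares revision strings, and the "fixed" revisions are the ones I mention in the comments (5110AALA/51H0AALB), which I have not verified myself.

```shell
# has_fixed_firmware is a hypothetical helper, not part of fwupd.
# It succeeds if the installed revision equals one of the given revisions.
has_fixed_firmware() {
    # $1 = installed revision, remaining args = revisions assumed to contain the fix
    installed="$1"
    shift
    for wanted in "$@"; do
        [ "$installed" = "$wanted" ] && return 0
    done
    return 1
}

# Querying LVFS (commented out because it needs network access and real hardware):
#   fwupdmgr refresh        # download the current LVFS metadata
#   fwupdmgr get-devices    # list devices fwupd can see, with their firmware versions
#   fwupdmgr get-updates    # list updates LVFS currently offers for those devices
#   fwupdmgr update         # apply the offered updates
#
# Comparing my installed revision (from the smartctl output) with the assumed fixed ones:
#   has_fixed_firmware 5106AALA 5110AALA 51H0AALB || echo "still on old firmware"
```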

EDIT 6: I tried using a FreeDOS live USB to run the firmware .exe files that I found on the Lenovo and Dell websites. But both of them gave me an error message saying something like "cannot be executed in DOS". This is probably because these .exe files are the utility software that Lenovo and Dell offer, with a GUI and all. So to run these files, I would actually need to install Windows temporarily on my laptop.

EDIT 7: Lenovo sent me a new SSD, this time a Samsung. I swapped it in for my faulty SSD, installed Windows, and performed firmware updates using Lenovo Vantage (just in case). I wanted all firmware up to date before installing Ubuntu 19.10, which runs really superbly! Especially the kernel-built-in Nvidia drivers are just a blessing; the error messages I used to get on Ubuntu 18.04 have all vanished.

  • Thanks @heynnema for this helpful edit! I checked it out, indeed this is a new driver for my SSD that might fix my problem. But how can I install it? The SSD Utility software of Lenovo only seems to work on Windows machines. – MJimitater Feb 04 '20 at 22:22
  • You might try this... https://www.youtube.com/watch?v=WqDUCfU-e-A, or you may have to temporarily install Windows long enough to do the firmware update. Use the Lenovo version, not the Dell version. Read the docs that probably come with the download, or from the web site. They may have other options too. The symptom fix just sounded too much like your symptom. – heynnema Feb 04 '20 at 22:43
  • Status please... – heynnema Feb 05 '20 at 20:29
  • Thank you @heynnema for your help! That new firmware release is exactly my bug! Good catch! Please see my recent EDIT 5 for recent developments. – MJimitater Feb 05 '20 at 20:55
  • You've been busy! Didn't the lenovo link in my Update #2 provide the NVMe firmware that you require? Did you use/try fwupd? – heynnema Feb 05 '20 at 21:06
  • fwupdmgr is already built-in to Ubuntu. – heynnema Feb 05 '20 at 21:13
  • The Lenovo link in your Update #2 is a .exe file that you run on Windows; it detects which SSD you have (Samsung, Intel, Toshiba etc.) and updates the corresponding firmware. Sorry for my naive question: can I simply put this .exe file onto a bootable stick and launch the program by booting off it? – MJimitater Feb 05 '20 at 21:20
  • Yes, I went from fwupd to fwupdmgr – MJimitater Feb 05 '20 at 21:20
  • I don't think you can just drop the .exe on to a flash and use it. You might try at least a portion of the YT video, to create a freedos bootable, and see if that even boots on your computer (http://www.freedos.org/). If it does boot, then download the Lenovo .exe, put it on the flash, and see if you can run the .exe when booted to the flash. Report back... as I'm now learning on this one with you :-) – heynnema Feb 05 '20 at 21:31
  • Yes, even though this bug has caused me a lot of hassle, I did learn a lot, thanks to this great community! Since I'm not able to find the driver that I need (5110AALA/51H0AALB for Toshiba KXG5AZNV512G) in a nice .exe or .iso file, I will spare myself any further experiments as long as my laptop is up and running - unfortunately Lenovo doesn't provide this driver on its own, only utility software with a GUI and all. – MJimitater Feb 06 '20 at 10:17
  • For fun, you might install this CLI tool... sudo apt-get update and sudo apt-get install nvme-cli. man nvme and nvme help. And see https://github.com/linux-nvme/nvme-cli – heynnema Feb 06 '20 at 14:38
  • Oh yes, great tool! Together with smartmontools, it was one of my first debugging trials. Thankfully, since that hwe kernel update (that @xenoid suggested), I haven't experienced that bug again. I will stick to LVFS (i.e. gnome-firmware) for future firmware updates. – MJimitater Feb 06 '20 at 15:13
  • @heynnema FYI, I actually did try booting FreeDOS (as in the YouTube video) with the .exe file from Lenovo. But it gave me an error message saying "cannot be executed in DOS" or something like that. Then I tried the .exe file from Dell, but that yielded the same error message. – MJimitater Feb 11 '20 at 20:32
  • You'll have to either temporarily install Windows into a fresh partition, or take out the drive and connect it to a Windows desktop where you can do the update. – heynnema Feb 11 '20 at 20:48

2 Answers

2

It looks like your SSD is failing. The smartmontools package will help you read the SMART data for the disk, possibly using a live CD (because until proven not guilty, better not write on the SSD until it is fully backed up).

Edit: smartmontools didn't show any problem with the SSD. Eyes turned to the controller, and then to the support of the controller by an Ubuntu release that was about the same age. OP tried an HWE (Hardware Enablement) kernel (a variant of the kernel that receives updates for new hardware) and it seems to have fixed this (and improved the general experience).
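
For later readers, a sketch of how an HWE kernel is usually pulled in on an LTS release. It assumes the usual metapackage naming (linux-generic-hwe-<release>); hwe_package is a made-up helper that only builds that name, and the exact package set may differ between desktop and server installs.

```shell
# hwe_package is a hypothetical helper; it only builds the metapackage name
# from an LTS release number and does not install anything.
hwe_package() {
    # $1 = Ubuntu LTS release, e.g. 18.04
    printf 'linux-generic-hwe-%s\n' "$1"
}

# On Ubuntu 18.04 (commented out because it needs root and network access):
#   sudo apt-get update
#   sudo apt-get install --install-recommends "$(hwe_package 18.04)"
#   sudo reboot
#   uname -r    # after the reboot, shows which kernel is actually running
```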

xenoid
  • Thanks @xenoid! I did what you said and booted from a live USB. I tried using smartmontools, but my SSD wasn't even recognized! Once I was able to boot my original Ubuntu, I quickly backed up everything, and running smartmontools gave me no errors (see my edit). Weird... – MJimitater Feb 01 '20 at 11:44
  • What do you see in the system logs (dmesg, /var/log/syslog...)? Does find / -inum 22152700 point to a possible culprit file? – xenoid Feb 01 '20 at 13:22
  • Thank you. If I'm lucky enough to boot successfully once, I'll let you know about that, but right now I have no chance of running any command – MJimitater Feb 01 '20 at 13:50
  • Now the story gets even weirder... I was lucky to boot once (in 50 boot attempts), and I immediately tried to timeshift back to an older, stable state from half a year ago, since I hadn't tried this yet. And boom - everything back to normal! At least so far.. – MJimitater Feb 02 '20 at 15:54
  • Doesn't look good. If it's not your SSD then it's possibly some motherboard component (NVMe controller?) or just a bad connector? Tried to re-seat the NVMe board? – xenoid Feb 02 '20 at 20:12
  • Thanks for your genius @xenoid! I just reset the nvme controller using sudo nvme reset /dev/nvme0. Not sure what this does though. I found this link describing the same symptoms that I had. The problem was solved in the end by swapping SSD and motherboard. Do you think I need to do that too? – MJimitater Feb 02 '20 at 20:45
  • Too early to tell. Btw, which Ubuntu version and kernel? How recent is the machine model? I would first try to upgrade to a recent kernel if you have recent hardware. – xenoid Feb 02 '20 at 21:06
  • Ubuntu version: 18.04.3; Kernel version: 4.15.0-76-generic (most recent for my OS); Machine model: ThinkPad T480s, not even 2 years old – MJimitater Feb 02 '20 at 21:09
  • Your machine is about as recent as your OS (early 2018) so I wouldn't rule out some compatibility problem. Definitely worth trying a HWE kernel. Unless a recent kernel update is the root of your problems... – xenoid Feb 02 '20 at 22:05
  • Thanks @xenoid! I just updated to 5.3.0-28-generic hwe. Nice! – MJimitater Feb 02 '20 at 22:16
  • Good. Tell me if it works better in a couple of days. – xenoid Feb 02 '20 at 22:19
  • Hi @xenoid, the HWE kernel works like a charm! You literally turned my whole laptop experience into a brand new one: Ubuntu feels snappier, is faster with everything, no bugs, no errors, saving and reading almost seem faster, and shutting down and waking from hibernation/stand-by works more smoothly! Once in a while I see a short flicker of the screen, literally 1 ms, which I didn't have before, but that's totally fine for me, since it's not causing any problems and my system seems to work better than ever! Thanks for this tip! – MJimitater Feb 04 '20 at 22:26
  • BTW, the occasional screen flickering still persists. I found that it occurs when I'm in Intel mode (power saving mode), while it doesn't occur when I'm in Nvidia mode (performance mode). – MJimitater Feb 12 '20 at 07:39
1

Let's first check/repair your file system...

  • boot to an Ubuntu Live DVD/USB
  • open a terminal window by pressing Ctrl+Alt+T
  • type sudo fdisk -l
  • identify the device name of your "Linux Filesystem" partition (/dev/sdXX on SATA drives, /dev/nvme0n1pX on NVMe drives)
  • type sudo fsck -f /dev/sdXX, replacing sdXX with the device name you found earlier
    • in your case: sudo fsck -f /dev/nvme0n1p2
  • repeat the fsck command if there were errors
  • type reboot
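
The steps above can be sketched as a small script. This is only an illustration under assumptions: is_mounted and check_unmounted_fs are made-up helper names, and /dev/nvme0n1p2 is the partition from the question. The extra guard is there because fsck should not be run on a mounted filesystem.

```shell
# is_mounted is a hypothetical helper: it succeeds if the device appears in
# the first column of a mounts table (defaults to /proc/mounts).
is_mounted() {
    # $1 = device, $2 = optional path to a mounts table
    awk -v dev="$1" '$1 == dev { found = 1 } END { exit !found }' "${2:-/proc/mounts}"
}

# check_unmounted_fs is a hypothetical wrapper: it refuses to run fsck on a
# device that is currently mounted, otherwise forces a full check.
check_unmounted_fs() {
    dev="$1"
    if is_mounted "$dev"; then
        echo "refusing: $dev is mounted" >&2
        return 1
    fi
    sudo fsck -f "$dev"
}

# From the live session (commented out because it needs root and the real drive):
#   sudo fdisk -l
#   check_unmounted_fs /dev/nvme0n1p2
```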

Second, make sure to properly shut down your system using the shutdown menu, NOT by holding down the power button.

Update #1:

NVMe not seen in BIOS. Suspect defective NVMe drive, or cable/connection, or firmware. Recommend warranty repairs.

Update #2:

There's a Toshiba firmware update for your NVMe which fixes this problem. Also check the Lenovo site for a Lenovo-specific NVMe firmware update, based on your model number T480s.

See Dell web site and Lenovo web site.

This package contains the firmware for Toshiba KXG5AZNV256G 256GB, KXG5AZNV512G 512GB, and KXG5AZNV1T02 1TB SED M.2 2280, Revision AADA5105. Storage firmware is a microcode that is embedded on storage devices such as hard drives or solid-state drives. The firmware manages the functionality of the devices. It fixes the issue where an error message occurs when the drive is not detected and improves the performance of the solid-state drive (SSD).
Fixes & Enhancements

Fixes:

- Fixed the issue where an error message occurs when the drive is not detected.

Enhancements:

- Improved the performance of the solid-state drive (SSD).

Version: AADA5105, A00
Category: Serial ATA
Release date: 06 Aug 2019
Last updated: 01 Jan 2020

and...

enter image description here

heynnema