I'm running Kubuntu 20.04. I recently cloned my system from a 2.5" SSD to a new M.2 2280 NVMe SSD using dd, so it's an exact replica of the previous install. Everything runs smoothly, but I've noticed that I sometimes see the following during bootup or shutdown:

[  125.110891] pcieport 0000:00:1d.0: AER: Corrected error received: 0000:04:00.0
[  125.110895] nvme 0000:04:00.0: AER: PCIe Bus Error: severity=Corrected, type=Physical Layer, (Receiver ID)
[  125.110898] nvme 0000:04:00.0: AER:   device [10ec:5762] error status/mask=00000001/00006000
[  125.110899] nvme 0000:04:00.0: AER:    [ 0] RxErr                 
[  125.118946] pcieport 0000:00:1d.0: AER: Corrected error received: 0000:04:00.0
[  125.118950] nvme 0000:04:00.0: AER: PCIe Bus Error: severity=Corrected, type=Physical Layer, (Receiver ID)
[  125.118952] nvme 0000:04:00.0: AER:   device [10ec:5762] error status/mask=00000001/00006000
[  125.118954] nvme 0000:04:00.0: AER:    [ 0] RxErr
...repeating
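
For anyone reading the `error status/mask=00000001/00006000` field above: it is a pair of hexadecimal bitmasks. Here is a minimal sketch for decoding them, assuming the bit layout of the PCIe AER Correctable Error Status register. The function name `decode_aer_correctable` is made up for illustration, and the short labels are modelled on the ones the kernel prints (such as `RxErr`).

```shell
# Decode a PCIe AER correctable error status or mask value (hex) into bit names.
# Bit positions assume the PCIe AER Correctable Error Status register layout.
decode_aer_correctable() {
    value=$(( 0x${1#0x} ))   # accept the value with or without a 0x prefix
    out=""
    for entry in 0:RxErr 6:BadTLP 7:BadDLLP 8:Rollover \
                 12:Timeout 13:NonFatalErr 14:CorrIntErr 15:HeaderOF; do
        bit=${entry%%:*}     # part before the colon: bit position
        name=${entry#*:}     # part after the colon: short label
        if [ $(( ($value >> $bit) & 1 )) -ne 0 ]; then
            out="${out:+$out }$name"
        fi
    done
    printf '%s\n' "$out"
}

decode_aer_correctable 00000001   # the status value from the log
decode_aer_correctable 00006000   # the mask value from the log
```

Under those assumptions, the status decodes to bit 0 (a receiver error, matching the `[ 0] RxErr` line), and the mask covers bits 13 and 14, which are error types the device is not reporting at all.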

It only happens in maybe 1 out of 10 reboots, and I've never seen any actual behavioral issues (e.g. crashes). As a small side note, the only other difference I've noticed since moving to the new SSD is that system audio is very subtly choppy (as described here).

I've found some other posts that suggest getting rid of the "PCIe Bus Error" by adding pci=nomsi and pci=noaer to the kernel command line in /etc/default/grub, but those all seem to be addressing other issues (e.g. Ubuntu can't install, or other behavioral problems). Some posts suggest that the OS or kernel may just be too out-of-date for the particular hardware, so since I've been wanting to switch to Neon anyway, I tried a fresh install of Neon 5.24 (on a different partition). Unfortunately, the behavior was the same: a 100% functional OS, with pages of the above message very occasionally shown during bootup or shutdown.

  1. Is there a chance that the actual physical SSD is faulty? It was purchased new.
  2. If not, is the above of concern?
  3. Is there a good solution? My understanding is that pci=noaer just tells the kernel to disable Advanced Error Reporting, which hides the messages rather than fixing their cause and so doesn't really seem like the best solution.

The system is a Dell Latitude 5490, and the BIOS is up to date. The SSD is a Teamgroup MP34 4TB (if that's relevant).

J23

2 Answers

The solution was adding pci=nommconf to the kernel boot parameters. This disables memory-mapped PCI configuration space access (MMCONFIG) and reverts to the traditional method of accessing configuration space.
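
In case it helps, here is a rough sketch of one way to add the parameter on an Ubuntu-style GRUB setup. The helper name `add_kernel_param` is hypothetical, and it assumes the file contains a double-quoted `GRUB_CMDLINE_LINUX_DEFAULT="..."` line and that GNU sed is available (for `sed -i`).

```shell
# Append a kernel parameter to GRUB_CMDLINE_LINUX_DEFAULT in a GRUB defaults file.
# Hypothetical helper: assumes a double-quoted value and GNU sed.
add_kernel_param() {
    file=$1
    param=$2
    # Skip if the parameter already appears on the line (simple substring check).
    if grep -q "^GRUB_CMDLINE_LINUX_DEFAULT=.*$param" "$file"; then
        return 0
    fi
    # Re-emit the quoted value with the new parameter appended at the end.
    sed -i "s/^GRUB_CMDLINE_LINUX_DEFAULT=\"\(.*\)\"/GRUB_CMDLINE_LINUX_DEFAULT=\"\1 $param\"/" "$file"
}

# Usage sketch (run as root, and keep a backup of the original file):
#   cp /etc/default/grub /etc/default/grub.bak
#   add_kernel_param /etc/default/grub pci=nommconf
#   update-grub            # regenerate the GRUB configuration
#   (reboot, then confirm with: cat /proc/cmdline)
```

Editing /etc/default/grub by hand and then running `sudo update-grub` achieves the same thing. The helper only saves retyping and avoids adding the parameter twice.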

Solution was found here.

J23

For me:

pcie_aspm=off

This fixed the error for me, although in my case it was linked to a GTX 660, not an SSD. I've heard that it also works for SSDs; the suggestion comes from: https://forums.unraid.net/topic/118286-nvme-drives-throwing-errors-filling-logs-instantly-how-to-resolve/?do=findComment&comment=1165150

From the above link:

Could you try pcie_aspm=off? This seems to disable the power management mode which is throwing the error. I've put it in my config for next time I reboot.
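
To check whether the parameter actually made a difference, one option is to count the corrected-error reports in the kernel log for a given boot. A small sketch follows; the function name `count_aer_corrected` is made up, and it simply matches the "AER: Corrected error received" text seen in the question's log.

```shell
# Count AER corrected-error reports in kernel log text read from stdin.
count_aer_corrected() {
    # grep -c prints 0 but exits non-zero when nothing matches, so ignore the status.
    grep -c 'AER: Corrected error received' || true
}

# Usage sketch:
#   journalctl -k -b 0 | count_aer_corrected   # kernel messages from the current boot
#   dmesg | count_aer_corrected                # alternative, may need root
#   cat /proc/cmdline                          # confirm pcie_aspm=off was applied
```

Since the errors reportedly show up in only some boots, a single clean boot proves little. Comparing the counts over several reboots gives a better picture.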

stumblebee
paulduf