Background:
Last year I got a new HP computer, an HP Spectre x360 14-ea0775ng.
I installed Ubuntu on it:
$ uname -a
Linux my-HPPC 5.13.0-40-generic #45~20.04.1-Ubuntu SMP Mon Apr 4 09:38:31 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
and I noticed that every now and then (maybe once every two days) it "crashes". The crash is gradual: first a single window freezes, then, when I switch to another window, that one freezes too, until eventually the mouse freezes as well. When this first started, I pressed Ctrl+Alt+F1
and saw the following error:
# For future search engines, here is one of them:
[91325.503864] pcieport 10000:e0:1d.0: PCIe Bus Error: severity=Uncorrected (Non-Fatal), type=Transaction Layer, (Receiver ID)
[91325.503920] pcieport 10000:e0:1d.0: device [8086:a0b0] error status/mask=00200000/00000000
[91325.503964] pcieport 10000:e0:1d.0: [21] ACSViol (First)
It keeps printing this forever, and eventually even the keyboard becomes unresponsive. Still, the camera button and the keyboard light keep working, as does the screen (which keeps printing this ominous message). I'm currently planning to have another computer repeatedly connect to an SSH server running on this machine, to check whether the network stops as well.
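The check I have in mind could look roughly like the sketch below, run from the second computer. It is only a rough idea, assuming the SSH server listens on the default port 22; the function names `is_reachable` and `watch`, and the host name in the usage comment, are made up for illustration.

```python
import socket
import time
from datetime import datetime


def is_reachable(host, port=22, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False


def watch(host, port=22, interval=10.0, rounds=None):
    """Print one timestamped line per probe; rounds=None runs until interrupted."""
    done = 0
    while rounds is None or done < rounds:
        state = "up" if is_reachable(host, port) else "DOWN"
        stamp = datetime.now().isoformat(timespec="seconds")
        print(f"{stamp} {host}:{port} {state}", flush=True)
        done += 1
        if rounds is None or done < rounds:
            time.sleep(interval)


# Hypothetical usage (the host name is a placeholder, not my real setup):
# watch("my-HPPC.local", interval=10.0)
```

One caveat I am aware of: a successful TCP connect only shows that the kernel is still accepting connections on that port, not that the SSH daemon itself is still responsive, so a "DOWN" line is more informative than an "up" line.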
By now, I have seen similar errors all around the internet, but none that looks exactly like mine, and none of them seems to deal with "ACSViol".
What I have found so far is that ACSViol means "ACS Violation", where ACS (Access Control Services) is, as far as I understand, a PCIe feature that controls how transactions may be routed between devices. Also, device 8086:a0b0 seems to be PCI Express Root Port #9. Here is what I believe to be the relevant part of lspci -v:
10000:e0:1d.0 PCI bridge: Intel Corporation Device a0b0 (rev 20) (prog-if 00 [Normal decode])
Flags: bus master, fast devsel, latency 0, IRQ 147
Bus: primary=00, secondary=e1, subordinate=e1, sec-latency=0
I/O behind bridge: [disabled]
Memory behind bridge: 6a000000-6a0fffff [size=1M]
Prefetchable memory behind bridge: [disabled]
Capabilities: <access denied>
Kernel driver in use: pcieport
However, I must say almost none of this makes sense to me...
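The one part I could decode is the status value in the log. The following is a minimal sketch, assuming the bit layout of the AER Uncorrectable Error Status register as defined in the Linux kernel's pci_regs.h (the short names mirror what the kernel prints). The function names `parse_status_mask` and `decode_uncorrectable` are my own, made up for illustration.

```python
import re

# Bit positions in the PCIe AER Uncorrectable Error Status register.
# Assumption: layout as in the Linux kernel's pci_regs.h (PCI_ERR_UNC_*).
UNCORRECTABLE_BITS = {
    4: "DLP (Data Link Protocol Error)",
    5: "SDES (Surprise Down Error)",
    12: "TLP (Poisoned TLP)",
    13: "FCP (Flow Control Protocol Error)",
    14: "CmpltTO (Completion Timeout)",
    15: "CmpltAbrt (Completer Abort)",
    16: "UnxCmplt (Unexpected Completion)",
    17: "RxOF (Receiver Overflow)",
    18: "MalfTLP (Malformed TLP)",
    19: "ECRC (ECRC Error)",
    20: "UnsupReq (Unsupported Request)",
    21: "ACSViol (ACS Violation)",
}


def parse_status_mask(line):
    """Extract (status, mask) as integers from an 'error status/mask=' log line."""
    m = re.search(r"error status/mask=([0-9a-fA-F]{8})/([0-9a-fA-F]{8})", line)
    if m is None:
        raise ValueError("no status/mask field found")
    return int(m.group(1), 16), int(m.group(2), 16)


def decode_uncorrectable(status):
    """Return (bit, name) for every bit set in the status value."""
    return [
        (bit, UNCORRECTABLE_BITS.get(bit, "unknown/reserved"))
        for bit in range(32)
        if status & (1 << bit)
    ]


line = ("[91325.503920] pcieport 10000:e0:1d.0: device [8086:a0b0] "
        "error status/mask=00200000/00000000")
status, mask = parse_status_mask(line)
print(decode_uncorrectable(status))  # [(21, 'ACSViol (ACS Violation)')]
```

So 00200000 is just bit 21 set, which matches the "[21] ACSViol" line, and a mask of 00000000 means none of these errors are masked. That tells me what is being reported, but not why.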
Question:
So my main concern is: is this a hardware problem, and how can I figure that out? Anything in that direction would be useful, since I seem to be stuck here.
Or is this just a software bug (something about the way Linux handles the PCIe bus) that will eventually vanish if I keep updating Ubuntu, because I can trust that some well-meaning people will fix it?