I recently upgraded to Ubuntu 22.04 and my Dell laptop started shutting down unexpectedly. It seems like my CPUs are spiking to 100%, then shutting down.

In many cases I can’t do even the simplest task, such as opening VS Code; in others it shuts down seconds after I log in. Sometimes I see an error saying it shut down due to temperature, but the machine is always very cool to the touch.

This never happened on Ubuntu 20.04.

Laptop Specs:

  • Dell Inc. Inspiron 15 3510
  • 16 GB Memory
  • CPU: Intel® Pentium(R) Silver N5030 CPU @ 1.10GHz × 4
  • Graphics: Mesa Intel® UHD Graphics 605 (GLK 3)
  • Disk: 512 GB
  • Roughly 3 months old.

Here are some useful logs:

  • Hardware Logs:

    16:07:45 kernel: thermal thermal_zone0: acpitz: critical temperature reached, shutting down
    16:07:22 kernel: iwlwifi 0000:00:0c.0: Conflict between TLV & NVM regarding enabling LAR (TLV = enabled NVM =disabled)
    16:07:21 kernel: usb 1-5: Found UVC 1.00 device Integrated_Webcam_HD (0c45:6d1a)
    16:07:20 kernel: snd_hda_codec_realtek hdaudioC0D0:      Internal Mic=0x12
    16:07:20 kernel: iwlwifi 0000:00:0c.0 wlo2: renamed from wlan0
    16:07:20 kernel: mei_hdcp 0000:00:0f.0-b638ab7e-94e2-4ea2-a552-d1c54b627f04: bound 0000:00:02.0 (ops i915_hdcp_component_ops [i915])
    16:07:20 kernel: snd_hda_intel 0000:00:0e.0: bound 0000:00:02.0 (ops i915_audio_component_bind_ops [i915])
    16:07:20 kernel: i915 0000:00:02.0: [drm] fb0: i915drmfb frame buffer device
    16:07:20 kernel: ieee80211 phy0: Selected rate control algorithm 'iwl-mvm-rs'
    16:07:20 kernel: hid-multitouch 0018:27C6:0D43.0001: input,hidraw0: I2C HID v1.00 Mouse [DELL0AAF:00 27C6:0D43] on i2c-DELL0AAF:00
    16:07:20 kernel: iwlwifi 0000:00:0c.0: base HW address: 20:1e:88:4e:c4:ce
    16:07:20 kernel: thermal thermal_zone7: failed to read out thermal zone (-61)
    16:07:20 kernel: iwlwifi 0000:00:0c.0: Detected Intel(R) Wireless-AC 9462, REV=0x318
    16:07:20 kernel: dell-smbios A80593CE-A997-11DA-B012-B622A1EF5492: WMI SMBIOS userspace interface not supported(0), try upgrading to a newer BIOS
    16:07:20 kernel: iwlwifi 0000:00:0c.0: loaded firmware version 46.fae53a8b.0 9000-pu-b0-jf-b0-46.ucode op_mode iwlmvm
    16:07:19 kernel: ee1004 1-0050: 512 byte EE1004-compliant SPD EEPROM, read-only
    16:07:19 kernel: iwlwifi 0000:00:0c.0: enabling device (0000 -> 0002)
    16:07:19 kernel: intel-hid INT33D5:00: platform supports 5 button array
    16:07:19 kernel: evdi evdi.3: [drm] Cannot find any crtc or sizes
    16:07:19 kernel: hid-generic 0003:1532:009C.0006: input,hiddev1,hidraw5: USB HID v1.11 Device [Razer Razer DeathAdder V2 X HyperSpeed] on usb-0000:00:15.0-1/input3
    16:07:19 kernel: usb 1-9: New USB device strings: Mfr=0, Product=0, SerialNumber=0
    16:07:19 kernel: hid-generic 0018:27C6:0D43.0001: input,hidraw0: I2C HID v1.00 Mouse [DELL0AAF:00 27C6:0D43] on i2c-DELL0AAF:00
    16:07:19 kernel: sd 0:0:0:0: [sda] Attached SCSI disk
    16:07:19 kernel: scsi 0:0:0:0: Direct-Access     ATA      SSD SATA3 512GB  0A0  PQ: 0 ANSI: 5
    16:07:19 kernel: usb 1-1: SerialNumber: 000000000000
    16:07:19 kernel: idma64 idma64.7: Found Intel integrated DMA 64-bit
    16:07:19 kernel: i2c i2c-1: Successfully instantiated SPD at 0x50
    16:07:19 kernel: idma64 idma64.2: Found Intel integrated DMA 64-bit
    16:07:19 kernel: i2c i2c-1: 1/1 memory slots populated (from DMI)
    16:07:19 kernel: scsi host1: ahci
    16:07:19 kernel: ahci 0000:00:12.0: flags: 64bit ncq sntf pm clo only pmp pio slum part deso sadm sds apst 
    16:07:19 kernel: hub 2-0:1.0: 7 ports detected
    16:07:19 kernel: usb usb2: SerialNumber: 0000:00:15.0
    16:07:19 kernel: xhci_hcd 0000:00:15.0: Host supports USB 3.0 SuperSpeed
    16:07:19 kernel: hub 1-0:1.0: 9 ports detected
    16:07:19 kernel: usb usb1: SerialNumber: 0000:00:15.0
    16:07:19 kernel: xhci_hcd 0000:00:15.0: hcc params 0x200077c1 hci version 0x100 quirks 0x0000000000009810
    16:07:19 kernel: ahci 0000:00:12.0: version 3.0
    16:07:19 kernel: i801_smbus 0000:00:1f.1: SMBus using PCI interrupt
    16:07:19 kernel: idma64 idma64.1: Found Intel integrated DMA 64-bit
    16:07:19 kernel: acpi PNP0C14:01: duplicate WMI GUID 05901221-D566-11D1-B2F0-00A0C9062910 (first instance was on PNP0C14:00)
    16:07:19 kernel: wmi_bus wmi_bus-PNP0C14:00: WQBC data block query control method not found
    16:07:19 kernel: platform eisa.0: EISA: Detected 0 cards
    16:07:19 kernel: rtc_cmos 00:04: alarms up to one month, y3k, 242 bytes nvram, hpet irqs
    16:07:19 kernel: thermal LNXTHERM:00: registered as thermal_zone0
    16:07:19 kernel: pcieport 0000:00:14.1: PME: Signaling with IRQ 124
    16:07:19 kernel: pci_bus 0000:00: resource 19 [mem 0xfed80000-0xfedbffff window]
    16:07:19 kernel: pci 0000:00:14.1: PCI bridge to [bus 03]
    16:07:19 kernel: system 00:01: [mem 0xfee00000-0xfeefffff] could not be reserved
    16:07:19 kernel: pci 0000:00:02.0: vgaarb: bridge control possible
    16:07:19 kernel: pci_bus 0000:00: root bus resource [mem 0xfed80000-0xfedbffff window]
    
  • Important logs:

    16:07:45 canonical-livep: daemon shutting down
    16:07:45 gdm3: Gdm: Failed to contact accountsservice: Error calling StartServiceByName for org.freedesktop.Accounts: Transaction for accounts-daemon.service/start is destructive (dev-disk-by\x2dpath-pci\x2d0000:00:12.0\x2data\x2d1.0\x2dpart5.swap has 'stop' job queued, but 'start' is included in transaction).
    16:07:45 systemd-logind: Failed to start autovt@tty2.service: Transaction for getty@tty2.service/start is destructive (poweroff.target has 'start' job queued, but 'stop' is included in transaction).
    16:07:45 kernel: reboot: HARDWARE PROTECTION shutdown (Temperature too high)
    16:07:45 kernel: thermal thermal_zone0: acpitz: critical temperature reached, shutting down
    16:07:40 systemd: Failed to start Application launched by gnome-session-binary.
    16:07:38 gdm-session-wor: gkr-pam: unable to locate daemon control file
    16:07:22 canonical-livep: Task "refresh" returned an error: livepatch check failed: POST request to "https://livepatch.canonical.com/v1/client/eee7feecac2a487db8eed9aef9ab1d79/updates" failed, retrying in 30s.
    16:07:22 gnome-session-b: GLib-GIO-CRITICAL: g_bus_get_sync: assertion 'error == NULL || *error == NULL' failed
    16:07:19 kernel: x86/cpu: SGX disabled by BIOS.
    
  • Well, what do your logs say? Keep in mind that Ubuntu-caused shutdowns are always logged, so you can discover the reason. If nothing is logged, then you have a hardware problem that is coincidentally occurring after the release-upgrade. Warnings about temperature fall under hardware (not Ubuntu) faults. – user535733 May 04 '22 at 23:22
  • First, make sure all your sensors are working. Install the lm-sensors package by running sudo apt update and then sudo apt install lm-sensors. Then run the following command to detect your sensors (answer y or yes when prompted): sudo sensors-detect – mchid May 04 '22 at 23:29
  • The machine being "cool to the touch" doesn't mean the processor hasn't exceeded its maximum temperature. That can all happen on a millisecond time scale. See this, where I have measured a processor temperature increase rate of 800 degrees per second (which obviously would slow down as it gets higher). – Doug Smythies May 04 '22 at 23:32
  • @user535733 I have just added the logs. – Rafael Zasas May 04 '22 at 23:34
  • @mchid I installed and ran the package. There were a lot of prompts for probing (all of which I said yes to). This was the result:
      * Chip 'Intel digital thermal sensor' (confidence: 9)
    
    – Rafael Zasas May 04 '22 at 23:35
  • @mchid I have just added the Hardware and Important logs in code blocks as requested. I can add the system, application, or all logs too if need be. – Rafael Zasas May 04 '22 at 23:44
  • How's your fan? Rotating, unblocked? How dusty is the inside of the computer (dust is an insulator; it keeps the heat in)? – waltinator May 04 '22 at 23:44
  • @waltinator The fan is quiet. I used compressed air to remove any dust if there was any. PC is roughly 3 months old. I have added PC Specs to the question. – Rafael Zasas May 04 '22 at 23:47
  • Please be precise with details: 20 & 22 are different Ubuntu products from 20.04 & 22.04. The year format is used for snap-only releases, whereas the year.month format is used for deb-based products that can also use snap packages. Your question & tag mix two different products that require a re-install & don't upgrade (if a 20 upgrades to 22, no user packages will change, unlike when a 20.04 upgrades to 22.04, which requires the whole system to upgrade; they differ). – guiverc May 05 '22 at 01:15
  • @guiverc I believe I was fairly explicit in stating that I am using Ubuntu 22.04 (Jammy Jellyfish), not ubuntu 20. I recently upgraded, and then started experiencing these issues. – Rafael Zasas May 05 '22 at 15:09
  • I suggest a thermal monitoring/throttling daemon such as thermald (which might already be running, I don't know), with a trip point low enough to prevent temperature overshoot to the shutdown point. Your processor's TDP (Thermal Design Power) is very low. – Doug Smythies May 05 '22 at 15:28
  • @RafaelZasas you state Ubuntu 22 a number of times, implying a snap only product of Ubuntu; Ubuntu Core 22 is based on 22.04 but is a different product; with the 22 or year format highlighting the different product when compared to year.month format used for deb based products of Ubuntu. 22 is explicitly different to 22.04, just as 20 is explicitly different to 20.04. – guiverc May 05 '22 at 22:38
  • Same problem here; could it be related to this bug? https://bugs.launchpad.net/ubuntu/+source/gnome-session/+bug/1968907 – bcode Jul 06 '22 at 06:59
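
Since the shutdown is triggered by thermal_zone0 (acpitz) and the log also shows thermal_zone7 failing to read, it may help to see what each zone reports before the machine powers off. The following is only a rough sketch, not something taken from the comments above: it assumes the standard sysfs layout under /sys/class/thermal (type, temp and trip_point_*_type / trip_point_*_temp files, with values in millidegrees Celsius). The function name list_thermal_zones and its optional directory argument are made up for illustration.

```shell
#!/bin/sh
# Print one line per thermal zone: name, type, current temperature and
# critical trip point (if the zone exposes one).
# The optional first argument overrides the sysfs root; it exists only so
# the function can be tried against a copy or a fake directory.
list_thermal_zones() {
    root="${1:-/sys/class/thermal}"
    for zone in "$root"/thermal_zone*; do
        [ -d "$zone" ] || continue
        name=${zone##*/}
        ztype=$(cat "$zone/type" 2>/dev/null) || ztype=unknown

        # temp is in millidegrees Celsius; the read can fail outright,
        # as with thermal_zone7 in the log above.
        raw=$(cat "$zone/temp" 2>/dev/null) || raw=""
        case "$raw" in
            ''|*[!0-9-]*) temp=unreadable ;;
            *) temp="$((raw / 1000))C" ;;
        esac

        # Look for a trip point whose type is "critical".
        crit=n/a
        for t in "$zone"/trip_point_*_type; do
            [ -f "$t" ] || continue
            if [ "$(cat "$t" 2>/dev/null)" = critical ]; then
                c=$(cat "${t%_type}_temp" 2>/dev/null) || c=""
                case "$c" in
                    ''|*[!0-9-]*) ;;
                    *) crit="$((c / 1000))C" ;;
                esac
            fi
        done

        printf '%s %s temp=%s crit=%s\n' "$name" "$ztype" "$temp" "$crit"
    done
}

list_thermal_zones
```

If the acpitz zone shows a temperature close to or above its critical trip point while the coretemp readings from sensors stay moderate, that would suggest the ACPI zone is reporting a bogus value rather than the CPU actually overheating.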

1 Answer

Try loading the coretemp module at an earlier stage:

 printf "# BUGFIX: pre load coretemp to fix Laptop Thermal Shutdown Bug\ncoretemp\n" | sudo tee /etc/modules-load.d/bugfix-coretemp.conf

Reboot and see if the problem gets fixed.
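
After rebooting, it can be worth confirming that the file was written as intended and that the module really is loaded. This is a minimal sketch under a few assumptions: that systemd-modules-load ignores lines whose first non-whitespace character is # or ;, and that coretemp is built as a loadable module so it shows up in /proc/modules (if it were built into the kernel it would not appear there). The helper names module_listed and module_loaded are hypothetical.

```shell
#!/bin/sh
# Succeeds if the given modules-load.d style file names the module on a
# non-comment line.
module_listed() {
    conf="$1"
    module="$2"
    grep -v '^[[:space:]]*[#;]' "$conf" 2>/dev/null |
        grep -qx "[[:space:]]*$module[[:space:]]*"
}

# Succeeds if the module appears in the loaded-module table.
# The optional second argument overrides /proc/modules (for testing only).
module_loaded() {
    grep -q "^$1 " "${2:-/proc/modules}" 2>/dev/null
}

conf=/etc/modules-load.d/bugfix-coretemp.conf
if module_listed "$conf" coretemp; then
    echo "coretemp is listed in $conf"
else
    echo "coretemp is NOT listed in $conf"
fi

if module_loaded coretemp; then
    echo "coretemp is currently loaded"
else
    echo "coretemp is NOT currently loaded"
fi
```

If the module is listed but not loaded, the output of journalctl -b -u systemd-modules-load.service may show why the load failed.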