
I upgraded my working Ubuntu 21.10 installation to 22.04 (on a MacBookAir6,1). On reboot, the default kernel 5.15.0-25 ran for a few seconds, logged a series of DMAR faults, reported "gave up waiting for root file system device," and dropped into a BusyBox shell. Notably, dmesg reported something like this:

ata1.00: failed to IDENTIFY

Ubuntu 21.10 (and earlier versions going back 5 years) had worked fine on this machine. Thankfully, the 22.04 upgrade left kernel 5.13.0-40 available as a GRUB advanced boot option, and that kernel boots normally and completely (so it can apparently see and boot from the internal hard disk). The dmesg from 5.13.0-40 had no DMAR faults and no ata1 failures. It showed successful discovery of the internal hard disk:

[1.188421] ata1: SATA max UDMA/133 abar m512@0xb0700000 port 0xb0700100 irq 53
[1.495830] ata1: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[1.497677] ata1.00: unexpected _GTF length (8)
[1.498794] ata1.00: ATA-8: APPLE SSD TS0128F, 109R0219, max UDMA/100
[1.499435] ata1.00: 236978176 sectors, multi 0: LBA48 NCQ (depth 32)
[1.500647] ata1.00: unexpected _GTF length (8)
[1.501444] ata1.00: configured for UDMA/100

I can also boot Ubuntu 22.04 normally from a USB disk (made with the Rufus utility), but, strangely, that live system could not see the internal hard disk either (in fdisk, GNOME Disk Utility, or GParted). It could see the USB disk it booted from. Here are some excerpts from its 5.15.0-25 kernel dmesg:

[1.449514] ata1: SATA max UDMA/133 abar m512@0xb0700000 port 0xb0700100 irq 54
[1.766097] ata1: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[1.766382] DMAR: DRHD: handling fault status reg 3
[1.766427] DMAR: [DMA Write NO_PASID] Request device [04:00.1] fault addr 0xfffe0000 [fault reason 0x02] Present bit in context entry is clear
[1.766525] ata1.00: failed to IDENTIFY (INIT_DEV_PARAMS failed, err_mask=0x80)

and later:

[ 7.032539] DMAR: DRHD: handling fault status reg 2
[ 7.032595] DMAR: [DMA Write NO_PASID] Request device [04:00.1] fault addr 0xfffe0000 [fault reason 0x02] Present bit in context entry is clear
[ 7.314168] ata1: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[ 7.314403] DMAR: DRHD: handling fault status reg 3
[ 7.314445] DMAR: [DMA Write NO_PASID] Request device [04:00.1] fault addr 0xfffe0000 [fault reason 0x02] Present bit in context entry is clear
[ 7.314533] ata1.00: failed to IDENTIFY (INIT_DEV_PARAMS failed, err_mask=0x80)
[ 7.314538] ata1: limiting SATA link speed to 3.0 Gbps
[12.408499] DMAR: DRHD: handling fault status reg 2
[12.408543] DMAR: [DMA Write NO_PASID] Request device [04:00.1] fault addr 0xfffe0000 [fault reason 0x02] Present bit in context entry is clear
[12.690278] ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 320)
[12.690566] DMAR: DRHD: handling fault status reg 3
[12.690597] ata1.00: failed to IDENTIFY (INIT_DEV_PARAMS failed, err_mask=0x80)

I am not sure whether the DMAR faults are related, since the new kernel also logs some additional DMAR entries:

[0.269655] DMAR: No ATSR found
[0.269657] DMAR: No SATC found
[0.269660] DMAR: IOMMU feature pgsel_inv inconsistent
[0.269664] DMAR: IOMMU feature sc_support inconsistent
[0.269666] DMAR: IOMMU feature pass_through inconsistent
[0.269669] DMAR: dmar0: Using Queued invalidation
[0.269682] DMAR: dmar1: Using Queued invalidation

I am stuck.

Bill

2 Answers


After a more detailed comparison of the dmesg output from the two kernel versions, it became apparent that PCI device 04:00.1 is referenced only by the problematic kernel 5.15.0-25, where it shows up in the DMAR faults (on the MacBookAir6,1). The PCI device 04:00.0 shown by lspci is:

04:00.0 SATA controller: Toshiba Corporation Device 010b (rev 14)
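If it helps anyone repeat the comparison, here is a rough sketch that counts DMAR fault lines per requesting device in a saved dmesg dump. The regular expression is my own and only assumes the line format quoted in the question; other kernel versions may word these lines differently, and the function name is just for illustration.

```python
import re
from collections import Counter

# Assumed format, taken from the 5.15.0-25 excerpts in the question:
# DMAR: [DMA Write NO_PASID] Request device [04:00.1] fault addr 0xfffe0000 ...
FAULT_RE = re.compile(
    r"DMAR: \[DMA (?:Read|Write)[^\]]*\] "
    r"Request device \[(?P<dev>[0-9a-fA-F:.]+)\] fault addr"
)


def count_dmar_faults(dmesg_text):
    """Return a Counter mapping PCI address -> number of DMAR fault lines."""
    counts = Counter()
    for line in dmesg_text.splitlines():
        match = FAULT_RE.search(line)
        if match:
            counts[match.group("dev")] += 1
    return counts


# Sample lines copied from the question's dmesg excerpt.
sample = """\
[1.766382] DMAR: DRHD: handling fault status reg 3
[1.766427] DMAR: [DMA Write NO_PASID] Request device [04:00.1] fault addr 0xfffe0000 [fault reason 0x02] Present bit in context entry is clear
[1.766525] ata1.00: failed to IDENTIFY (INIT_DEV_PARAMS failed, err_mask=0x80)
"""

print(dict(count_dmar_faults(sample)))
```

Running the same function over the dmesg text of a working kernel and a failing one makes it easy to see which device only appears in the faults.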

The problematic 04:00.1, function 1 of the SATA controller, caused the DMAR faults. Searching for the DMAR fault address and reason turned up an old discussion about the intel_iommu kernel option becoming enabled by default in newer kernels, which caused problems on some machines. To fix it on the MacBookAir6,1, the kernel option

intel_iommu=off

added to the GRUB config let my system boot up completely like before.
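For anyone who prefers to script the change, below is a minimal sketch of adding the option to the text of /etc/default/grub. It is only an illustration: the helper name is made up, it assumes the simple one-line KEY="..." form, and it defaults to GRUB_CMDLINE_LINUX (GRUB_CMDLINE_LINUX_DEFAULT is another common place for such options). It works on a string and does not touch the real file.

```python
import re


def add_kernel_param(grub_text, param, key="GRUB_CMDLINE_LINUX"):
    """Return grub_text with param appended to the quoted value of key.

    Only the simple one-line form KEY="..." is handled. The text is
    returned unchanged if the key is missing or param is already there.
    """
    pattern = re.compile(r'^(%s=")([^"]*)(")' % re.escape(key), re.MULTILINE)

    def repl(match):
        params = match.group(2).split()
        if param not in params:
            params.append(param)
        return match.group(1) + " ".join(params) + match.group(3)

    return pattern.sub(repl, grub_text, count=1)


# Hypothetical file contents, for illustration only.
before = (
    'GRUB_DEFAULT=0\n'
    'GRUB_CMDLINE_LINUX_DEFAULT="quiet splash"\n'
    'GRUB_CMDLINE_LINUX=""\n'
)
after = add_kernel_param(before, "intel_iommu=off")
print(after)
```

After changing the real /etc/default/grub (by hand or otherwise), run sudo update-grub so the new kernel command line is written into the generated GRUB configuration.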

Bill
  • same bug reported by Matt - https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1970559 – Bill Apr 28 '22 at 14:46
  • the 2017 archlinux thread on the intel_iommu issues - https://bbs.archlinux.org/viewtopic.php?pid=1741388#p1741388 – Bill Apr 28 '22 at 14:52

Bill's answer solved my problem and saved me a lot of time. Thanks.

For a live USB system (e.g. for fresh installs), select the GRUB entry you want to boot (Install Ubuntu), press 'e', and add the temporary parameter 'intel_iommu=off' after the three dashes on the kernel line. Then press one of the boot keys. See this post for more details: https://askubuntu.com/a/19487

The parameter was then applied automatically to my installation and inserted into /etc/default/grub, which now has the entry

GRUB_CMDLINE_LINUX="intel_iommu=off"
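To double-check after a reboot that the running kernel really got the option, something like the following sketch can be used. It assumes the kernel command line is readable from /proc/cmdline (standard on Linux); the function names and the example string are my own and purely illustrative.

```python
def has_kernel_param(cmdline, param):
    """Return True if param appears as a whole word in the command line."""
    return param in cmdline.split()


def running_cmdline(path="/proc/cmdline"):
    """Read the command line the running kernel was booted with."""
    with open(path) as f:
        return f.read().strip()


# Hypothetical command line, for illustration only.
example = "BOOT_IMAGE=/boot/vmlinuz-5.15.0-25-generic ro quiet splash intel_iommu=off"
print(has_kernel_param(example, "intel_iommu=off"))
```

On the installed system, has_kernel_param(running_cmdline(), "intel_iommu=off") should return True once the GRUB entry is in effect.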

Cheers, M.