1

Similar to the issue discussed here, I am having trouble configuring amdgpu-dkms. Installation of the optional amdgpu driver hangs on that step. Here is the output of dpkg --configure amdgpu-dkms:


Setting up amdgpu-dkms (1:5.4.7.53-1048554) ...
Removing old amdgpu-5.4.7.53-1048554 DKMS files...

Deleting module version: 5.4.7.53-1048554 completely from the DKMS tree.

Done.
Loading new amdgpu-5.4.7.53-1048554 DKMS files...
Building for 5.4.0-96-generic
Building for architecture x86_64
Building initial module for 5.4.0-96-generic
ERROR: Cannot create report: [Errno 17] File exists: '/var/crash/amdgpu-dkms-firmware.0.crash'
Error! Bad return status for module build on kernel: 5.4.0-96-generic (x86_64)
Consult /var/lib/dkms/amdgpu/5.4.7.53-1048554/build/make.log for more information.
dpkg: error processing package amdgpu-dkms (--configure):
 installed amdgpu-dkms package post-installation script subprocess returned error exit status 10
Errors were encountered while processing:
 amdgpu-dkms

The log file referenced contains the following text:

DKMS make.log for amdgpu-5.4.7.53-1048554 for kernel 5.4.0-96-generic (x86_64)
Sat 29 Jan 2022 06:43:23 AM CST
make: Entering directory '/usr/src/linux-headers-5.4.0-96-generic'
  CC [M]  /var/lib/dkms/amdgpu/5.4.7.53-1048554/build/amd/amdkcl/symbols.o
  CC [M]  /var/lib/dkms/amdgpu/5.4.7.53-1048554/build/amd/amdkcl/kcl_mn.o
  CC [M]  /var/lib/dkms/amdgpu/5.4.7.53-1048554/build/amd/amdgpu/amdgpu_drv.o
  CC [M]  /var/lib/dkms/amdgpu/5.4.7.53-1048554/build/amd/amdkcl/main.o
  CC [M]  /var/lib/dkms/amdgpu/5.4.7.53-1048554/build/amd/amdkcl/kcl_device_cgroup.o
  CC [M]  /var/lib/dkms/amdgpu/5.4.7.53-1048554/build/amd/amdgpu/amdgpu_device.o
  CC [M]  /var/lib/dkms/amdgpu/5.4.7.53-1048554/build/amd/amdkcl/kcl_drm_cache.o
  CC [M]  /var/lib/dkms/amdgpu/5.4.7.53-1048554/build/amd/amdkcl/kcl_drm.o
  CC [M]  /var/lib/dkms/amdgpu/5.4.7.53-1048554/build/amd/amdgpu/amdgpu_kms.o
  CC [M]  /var/lib/dkms/amdgpu/5.4.7.53-1048554/build/amd/amdkcl/kcl_fence_array.o
  CC [M]  /var/lib/dkms/amdgpu/5.4.7.53-1048554/build/amd/amdkcl/kcl_fence.o
/var/lib/dkms/amdgpu/5.4.7.53-1048554/build/amd/amdkcl/kcl_fence.c:29:1: warning: ‘dma_fence_test_signaled_any’ defined but not used [-Wunused-function]
   29 | dma_fence_test_signaled_any(struct dma_fence **fences, uint32_t count,
      | ^~~~~~~~~~~~~~~~~~~~~~~~~~~
  CC [M]  /var/lib/dkms/amdgpu/5.4.7.53-1048554/build/amd/amdkcl/kcl_io.o
  CC [M]  /var/lib/dkms/amdgpu/5.4.7.53-1048554/build/amd/amdgpu/amdgpu_atombios.o
  CC [M]  /var/lib/dkms/amdgpu/5.4.7.53-1048554/build/amd/amdkcl/kcl_kthread.o
  CC [M]  /var/lib/dkms/amdgpu/5.4.7.53-1048554/build/amd/amdkcl/kcl_mm.o
  CC [M]  /var/lib/dkms/amdgpu/5.4.7.53-1048554/build/amd/amdgpu/atombios_crtc.o
  CC [M]  /var/lib/dkms/amdgpu/5.4.7.53-1048554/build/amd/amdkcl/kcl_pci.o
  CC [M]  /var/lib/dkms/amdgpu/5.4.7.53-1048554/build/amd/amdgpu/amdgpu_connectors.o
/var/lib/dkms/amdgpu/5.4.7.53-1048554/build/amd/amdkcl/kcl_pci.c: In function ‘amdkcl_pci_init’:
/var/lib/dkms/amdgpu/5.4.7.53-1048554/build/amd/amdkcl/kcl_pci.c:102:84: warning: passing argument 2 of ‘amdkcl_fp_setup’ discards ‘const’ qualifier from pointer target type [-Wdiscarded-qualifiers]
  102 |  _kcl_pcie_link_speed = (const unsigned char *) amdkcl_fp_setup("pcie_link_speed", _kcl_pcie_link_speed_stub);
      |                                                                                    ^~~~~~~~~~~~~~~~~~~~~~~~~
In file included from /var/lib/dkms/amdgpu/5.4.7.53-1048554/build/amd/amdkcl/kcl_pci.c:3:
/var/lib/dkms/amdgpu/5.4.7.53-1048554/build/amd/amdkcl/kcl_common.h:23:63: note: expected ‘void *’ but argument is of type ‘const unsigned char *’
   23 | static inline void *amdkcl_fp_setup(const char *symbol, void *fp_stup)
      |                                                         ~~~~~~^~~~~~~
  CC [M]  /var/lib/dkms/amdgpu/5.4.7.53-1048554/build/amd/amdkcl/kcl_perf_event.o
  CC [M]  /var/lib/dkms/amdgpu/5.4.7.53-1048554/build/amd/amdkcl/kcl_reservation.o
  CC [M]  /var/lib/dkms/amdgpu/5.4.7.53-1048554/build/amd/amdgpu/atom.o
/var/lib/dkms/amdgpu/5.4.7.53-1048554/build/amd/amdkcl/kcl_reservation.c: In function ‘amdkcl_reservation_init’:
/var/lib/dkms/amdgpu/5.4.7.53-1048554/build/amd/amdkcl/kcl_reservation.c:58:10: warning: passing argument 2 of ‘amdkcl_fp_setup’ discards ‘const’ qualifier from pointer target type [-Wdiscarded-array-qualifiers]
   58 |          &_kcl_reservation_seqcount_string_stub);
      |          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In file included from /var/lib/dkms/amdgpu/5.4.7.53-1048554/build/amd/amdkcl/kcl_reservation.c:32:
/var/lib/dkms/amdgpu/5.4.7.53-1048554/build/amd/amdkcl/kcl_common.h:23:63: note: expected ‘void *’ but argument is of type ‘const char (*)[21]’
   23 | static inline void *amdkcl_fp_setup(const char *symbol, void *fp_stup)
      |                                                         ~~~~~~^~~~~~~
  CC [M]  /var/lib/dkms/amdgpu/5.4.7.53-1048554/build/amd/amdkcl/dma-resv.o
  CC [M]  /var/lib/dkms/amdgpu/5.4.7.53-1048554/build/amd/amdkcl/kcl_suspend.o
  CC [M]  /var/lib/dkms/amdgpu/5.4.7.53-1048554/build/amd/amdgpu/amdgpu_fence.o
  CC [M]  /var/lib/dkms/amdgpu/5.4.7.53-1048554/build/amd/amdkcl/kcl_workqueue.o
  CC [M]  /var/lib/dkms/amdgpu/5.4.7.53-1048554/build/amd/amdkcl/kcl_seq_file.o
  CC [M]  /var/lib/dkms/amdgpu/5.4.7.53-1048554/build/amd/amdkcl/kcl_connector.o
  CC [M]  /var/lib/dkms/amdgpu/5.4.7.53-1048554/build/amd/amdgpu/amdgpu_ttm.o
  LD [M]  /var/lib/dkms/amdgpu/5.4.7.53-1048554/build/amd/amdkcl/amdkcl.o
  CC [M]  /var/lib/dkms/amdgpu/5.4.7.53-1048554/build/scheduler/sched_main.o
  CC [M]  /var/lib/dkms/amdgpu/5.4.7.53-1048554/build/amd/amdgpu/amdgpu_object.o
  CC [M]  /var/lib/dkms/amdgpu/5.4.7.53-1048554/build/scheduler/sched_fence.o
  CC [M]  /var/lib/dkms/amdgpu/5.4.7.53-1048554/build/amd/amdgpu/amdgpu_gart.o
  CC [M]  /var/lib/dkms/amdgpu/5.4.7.53-1048554/build/scheduler/sched_entity.o
  CC [M]  /var/lib/dkms/amdgpu/5.4.7.53-1048554/build/amd/amdgpu/amdgpu_encoders.o
  LD [M]  /var/lib/dkms/amdgpu/5.4.7.53-1048554/build/scheduler/amd-sched.o
  CC [M]  /var/lib/dkms/amdgpu/5.4.7.53-1048554/build/ttm/ttm_memory.o
  CC [M]  /var/lib/dkms/amdgpu/5.4.7.53-1048554/build/amd/amdgpu/amdgpu_display.o
  CC [M]  /var/lib/dkms/amdgpu/5.4.7.53-1048554/build/ttm/ttm_tt.o
  CC [M]  /var/lib/dkms/amdgpu/5.4.7.53-1048554/build/amd/amdgpu/amdgpu_i2c.o
  CC [M]  /var/lib/dkms/amdgpu/5.4.7.53-1048554/build/amd/amdgpu/amdgpu_fb.o
  CC [M]  /var/lib/dkms/amdgpu/5.4.7.53-1048554/build/ttm/ttm_bo.o
  CC [M]  /var/lib/dkms/amdgpu/5.4.7.53-1048554/build/amd/amdgpu/amdgpu_gem.o
  CC [M]  /var/lib/dkms/amdgpu/5.4.7.53-1048554/build/ttm/ttm_bo_util.o
  CC [M]  /var/lib/dkms/amdgpu/5.4.7.53-1048554/build/amd/amdgpu/amdgpu_ring.o
  CC [M]  /var/lib/dkms/amdgpu/5.4.7.53-1048554/build/amd/amdgpu/amdgpu_cs.o
  CC [M]  /var/lib/dkms/amdgpu/5.4.7.53-1048554/build/ttm/ttm_bo_vm.o
  CC [M]  /var/lib/dkms/amdgpu/5.4.7.53-1048554/build/amd/amdgpu/amdgpu_bios.o
  CC [M]  /var/lib/dkms/amdgpu/5.4.7.53-1048554/build/ttm/ttm_module.o
  CC [M]  /var/lib/dkms/amdgpu/5.4.7.53-1048554/build/ttm/ttm_execbuf_util.o
/var/lib/dkms/amdgpu/5.4.7.53-1048554/build/amd/amdgpu/amdgpu_bios.c: In function ‘amdgpu_read_platform_bios’:
/var/lib/dkms/amdgpu/5.4.7.53-1048554/build/amd/amdgpu/amdgpu_bios.c:200:9: error: implicit declaration of function ‘pci_platform_rom’ [-Werror=implicit-function-declaration]
  200 |  bios = pci_platform_rom(adev->pdev, &size);
      |         ^~~~~~~~~~~~~~~~
/var/lib/dkms/amdgpu/5.4.7.53-1048554/build/amd/amdgpu/amdgpu_bios.c:200:7: warning: assignment to ‘uint8_t *’ {aka ‘unsigned char *’} from ‘int’ makes pointer from integer without a cast [-Wint-conversion]
  200 |  bios = pci_platform_rom(adev->pdev, &size);
      |       ^
cc1: some warnings being treated as errors
make[2]: *** [scripts/Makefile.build:270: /var/lib/dkms/amdgpu/5.4.7.53-1048554/build/amd/amdgpu/amdgpu_bios.o] Error 1
make[1]: *** [scripts/Makefile.build:519: /var/lib/dkms/amdgpu/5.4.7.53-1048554/build/amd/amdgpu] Error 2
make[1]: *** Waiting for unfinished jobs....
  CC [M]  /var/lib/dkms/amdgpu/5.4.7.53-1048554/build/ttm/ttm_page_alloc.o
  CC [M]  /var/lib/dkms/amdgpu/5.4.7.53-1048554/build/ttm/ttm_bo_manager.o
  CC [M]  /var/lib/dkms/amdgpu/5.4.7.53-1048554/build/ttm/ttm_agp_backend.o
  CC [M]  /var/lib/dkms/amdgpu/5.4.7.53-1048554/build/ttm/ttm_page_alloc_dma.o
  LD [M]  /var/lib/dkms/amdgpu/5.4.7.53-1048554/build/ttm/amdttm.o
make: *** [Makefile:1762: /var/lib/dkms/amdgpu/5.4.7.53-1048554/build] Error 2
make: Leaving directory '/usr/src/linux-headers-5.4.0-96-generic'
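
For anyone skimming the log above: most of it is ordinary compiler progress and warnings. The build actually stops at the `error:` line for `pci_platform_rom` in amdgpu_bios.c. A minimal sketch for pulling only the fatal lines out of a DKMS make.log follows; the function name `dkms_log_errors` is made up for illustration, and the pattern only assumes the usual gcc and make message formats seen in this log.

```shell
#!/bin/sh
# Print only the lines of a DKMS make.log that mark a hard failure:
# compiler "error:" diagnostics and make's "*** ... Error N" lines.
# Warnings and "CC [M]" progress lines are skipped on purpose.
# (dkms_log_errors is a hypothetical helper name, not part of DKMS.)
dkms_log_errors() {
    log_file=$1
    grep -n -E '(: error: |\*\*\* .* Error [0-9]+)' "$log_file"
}
```

Called as `dkms_log_errors /var/lib/dkms/amdgpu/5.4.7.53-1048554/build/make.log`, it should print the implicit-declaration error and the three `make` failure lines, each prefixed with its line number in the log.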

After attempting to install and configure amdgpu, the default driver, radeon, appears to have broken. It is still installed, but no longer controls my GPU. Running systemctl status gpu-manager.service yields:

Gpu-manager.service - detect the available gpus and deal with any system changes
   Loaded: loaded (/lib/systemd/system/gpu-manager.service; enabled; vendor preset: enabled)
   Active: inactive (dead)
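
A more direct check of which driver is bound to the card is `lspci -k`, which prints a "Kernel driver in use:" line under each device that has one. As far as I know, gpu-manager runs once at boot and then exits, so "inactive (dead)" on its own may not indicate a fault. The sketch below filters `lspci -k` output down to the VGA devices; `vga_drivers` is a hypothetical helper name, and it assumes the standard `lspci -k` layout (device line at column 0, indented detail lines).

```shell
#!/bin/sh
# Read `lspci -k` output on stdin and print "<slot> <driver>" for each
# VGA device; print "(none)" when no kernel driver is bound to it.
# (vga_drivers is a hypothetical helper name.)
vga_drivers() {
    awk '
        function emit() { if (slot != "") print slot, (drv == "" ? "(none)" : drv) }
        # Every unindented line starts a new device: report the previous one.
        /^[0-9a-f]/ { emit(); slot = ""; drv = "" }
        /^[0-9a-f].*VGA compatible controller/ { slot = $1 }
        /Kernel driver in use:/ && slot != "" { drv = $NF }
        END { emit() }
    '
}
```

Usage would be `lspci -k | vga_drivers`. If the line for 01:00.0 shows "(none)", neither radeon nor amdgpu has claimed the card.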

Based on this post and other research (just google the issue), it seems that amdgpu is only compatible with some kernels. I have already attempted purging the packages using apt purge amdgpu. I do not know how to fix radeon, the default driver. Should I just give up trying to use this newer driver and focus on reverting to the default? Ideally, I would like to have both drivers available as options so that I can revert to the default if/when amdgpu breaks.

Here are my hardware specs. I will edit to add additional information as necessary.

Processor: Intel(R) Pentium(R) CPU G3258 @ 3.20GHz
Memory: 8041MB (1523MB used)
Graphics: Radeon R7 240 (2GB)
Machine Type: Desktop
Operating System: Ubuntu 20.04.3 LTS

-SCSI Disks-
ATA KINGSTON SA400S3 (SSD, boot drive)
ATA Samsung SSD 860 (SSD)
WDC WD25 00BEVT-60A23T0 (HDD)

Nah Tan
  • What output do you get when running lspci | grep -i VGA? – Nate T Jan 31 '22 at 22:15
  • Did you ever run uname -r? If I know your actual kernel name, I can tell you which version of the app / pkg you need. Since you've already installed via the pkg mgr, it is a matter of swapping a few files and telling dpkg to hold the configuration, and you'll have a working and managed version. It won't update, but it also won't delete or change any of the dependencies. And with the dkms pkg, you may be able to skip the hold. Simply adding the correct version to start may be all that's needed to get the dkms feature working properly... – Nate T Feb 08 '22 at 01:49
  • I have downloaded and installed two kernel versions since making my original post. 5.4.0-97 was installed automatically when I ran apt-get full-upgrade. I downloaded and installed 5.16.0-051600-generic manually. (I would have to reference the guide I used if you want to know the details of that.) When I run uname -r, the kernel version that results corresponds to the entry I chose in GRUB. Each kernel version only works if I boot up without graphics (i.e., in recovery mode). – Nah Tan Feb 08 '22 at 14:27
  • lspci | grep -i VGA yields 00:02.0 VGA compatible controller: Intel Corporation Xeon E3-1200 v3/4th Gen Core Processor Integrated Graphics Controller (rev 06) 01:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Oland PRO [Radeon R7 240/340] (rev 87) – Nah Tan Feb 08 '22 at 14:27

4 Answers

2

TL;DR: Report it! Link below.

From your output:

Building for 5.4.0-96-generic

Assuming that you have stock 20.04.4, the 5.4 kernel is NOT what you are using. The module is not detecting the kernel version properly.

It is deleting the old version and then just rebuilding the exact same one.

The kernel in the output is either dead on, or at least very close to, the first version to be released with Ubuntu 20.04.1.
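
One way to check whether the kernel named in the output really differs from the one you are running is to compare the "Building for ..." line against `uname -r`. The following is only a sketch: both function names are made up for illustration, and it assumes the DKMS output contains a "Building for <release>" line like the one quoted above.

```shell
#!/bin/sh
# Extract the kernel release DKMS reported in a line such as
# "Building for 5.4.0-96-generic" (reads dpkg/DKMS output on stdin).
# (Both helper names below are hypothetical.)
dkms_build_kernel() {
    sed -n 's/.*Building for \([0-9][^ ]*\).*/\1/p' | head -n 1
}

# Compare that release with the running kernel (second argument,
# normally "$(uname -r)") and report whether they match.
check_build_kernel() {
    built=$1
    running=$2
    if [ "$built" = "$running" ]; then
        echo "match: DKMS built for the running kernel ($running)"
    else
        echo "mismatch: DKMS built for $built, running $running"
    fi
}
```

For example, save the dpkg output to a file (say dpkg.log, a placeholder name) and run `check_build_kernel "$(dkms_build_kernel < dpkg.log)" "$(uname -r)"`. A "match" result would suggest the module is picking up the running kernel after all, and the failure lies in the build itself.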

This is definitely a bug. It seems to be an issue with the module's dkms implementation. It should be reported here.

It is important to report bugs like this. User feedback via issue trackers like the one linked is the developers' primary source of information for finding them. In most cases, especially in open source, a developer's only tools for finding edge-case bugs are the issue trackers and dumb luck during testing.

In other words, if you haven't reported it, don't count on it getting fixed any time soon. It has obviously been around since 20.04.1, and everything I've seen online points to AMDGPU having a relatively fast development cycle (i.e. a lot of fixes, a lot of version upgrades), so chances are that they haven't seen it. It is possibly specific to your graphics card.

Nate T
  • Even if it has already been reported, you should still report it, so they can gauge the severity, among other reasons. See this answer. – Nate T Jan 31 '22 at 23:19
  • I believe it is 20.04.3, based on the output from hardinfo, but yeah that's an old kernel. – Nah Tan Feb 01 '22 at 04:31
1

Just in case you are trying this on 20.04.4: the latest Ubuntu version that amdgpu 21.40.2 works with is 20.04.3. So you either need 21.40.2 with Ubuntu 20.04.3, or you need to wait a little longer for a driver that supports 20.04.4.

Questi
1

I ran into a very similar issue after upgrading from kernel version 5.11.0-44 to 5.13.0-27. I downgraded to kernel version 5.11.0-44, and the driver continues to build and install fine there.

To get rid of the dkms build errors, I had to remove all kernel versions for which the driver build was broken.

Perhaps you can try kernel 5.11.0?

Tim Harper
0

Your mileage may vary, but for me the AMD installer, in its wisdom, created:

/etc/modprobe.d/blacklist-amdgpu.conf containing:

blacklist amdgpu

Which is why that module didn't get loaded after reboot, which in turn caused gdm (and Xorg) to fail to start. I removed that file, and all is well again.
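
To check whether the same thing happened on your system, you can search the modprobe configuration directories for blacklist entries. This is a rough sketch: `find_blacklist` is a made-up helper name, and the default directory list is an assumption about where such files usually live on Ubuntu.

```shell
#!/bin/sh
# List every modprobe config line that blacklists the given module.
# Usage: find_blacklist MODULE [DIR...]
# With no DIR arguments it searches the usual modprobe.d locations
# (an assumed default list; find_blacklist is a hypothetical helper name).
find_blacklist() {
    module=$1
    shift
    [ "$#" -gt 0 ] || set -- /etc/modprobe.d /lib/modprobe.d /run/modprobe.d
    # Match "blacklist <module>" at the start of a line, ignoring comments
    # and longer module names that merely begin with <module>.
    grep -rnE "^[[:space:]]*blacklist[[:space:]]+${module}([[:space:]]|\$)" "$@" 2>/dev/null
}
```

Running `find_blacklist amdgpu` prints each matching file, line number, and line, and `find_blacklist radeon` does the same for the default driver. After removing such a file, regenerating the initramfs (`sudo update-initramfs -u`) may also be needed before the change takes effect at boot.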