I successfully installed Ubuntu 18.04 LTS on an Alienware M17 with an RTX 2080 Max-Q, and it worked great until performance suddenly dropped. I had installed the 418.56 driver and CUDA 10.1, following the instructions by Terrance here. At first, everything worked fine: I was getting the expected performance with TensorFlow, about half that of the desktop RTX 2080 (the Max-Q runs at half the clock). Then I let that TensorFlow benchmark run all night. In the morning, performance was down to 30% of what it had been, and it has stayed that low ever since. For example, I now get only 9.3 fps in the Heaven Benchmark, where it should be at least around 100 fps. The only possible cause I can think of is that I unplugged the power cord for a minute while the benchmark was running.
What I've found so far is that the GPU clock is now far too low: nvidia-smi shows it between 75 and 247 MHz even under load (instead of the expected 300–2100 MHz). I can set the GPU clock manually with nvidia-smi, but the resulting clock is only about a quarter of the value I set. For example, nvidia-smi --lock-gpu-clocks=300,300 results in a reported GPU clock of 75 MHz, and with higher values the clock maxes out at around 350 MHz. All other output from nvidia-smi looks normal: the performance state is P0, and all "Clocks Throttle Reasons" are "Not Active".
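For anyone who wants to reproduce the measurement, this is roughly how I watch the clock while the benchmark runs. A minimal sketch, assuming nvidia-smi is on the PATH and a single GPU (the query flags are standard nvidia-smi options; the polling loop and helper names are just my own):

```python
import subprocess
import time

def parse_mhz(text: str) -> int:
    # nvidia-smi --query-gpu=clocks.sm --format=csv,noheader,nounits
    # prints one integer (MHz) per GPU, e.g. "247"; take the first GPU.
    return int(text.strip().splitlines()[0])

def current_sm_clock_mhz() -> int:
    out = subprocess.check_output(
        ["nvidia-smi", "--query-gpu=clocks.sm",
         "--format=csv,noheader,nounits"],
        text=True)
    return parse_mhz(out)

if __name__ == "__main__":
    # Poll once per second while a load (e.g. the Heaven Benchmark) runs.
    # A healthy RTX 2080 Max-Q should boost well above 1000 MHz under
    # load, not sit in the 75-250 MHz range I am seeing.
    for _ in range(10):
        print(current_sm_clock_mhz(), "MHz")
        time.sleep(1)
```

Running this while the benchmark is active is how I know the card never leaves the 75–247 MHz range.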
The mystery for me is why it worked fine at first and then suddenly slowed down, possibly at the moment I unplugged the power cord. I have tried changing settings, swapping drivers, and everything else I could think of, and the slowdown persists through reboots and even a complete re-install of 18.04 from scratch, following the same instructions that worked the first time. I would suspect a hardware failure, but everything runs fine when I boot into Windows 10.
Any help would be much appreciated.