My webserver runs Ubuntu Server 20.04.4 (kernel 5.4.0-124) on an Intel Core i7-7700 CPU @ 3.60GHz. It's a dedicated, physical Supermicro server hosted in a remote datacenter.

# lshw

....
  *-core
       description: Motherboard
       product: X11SSD-F
       vendor: Supermicro
...
     *-cpu
          description: CPU
          product: Intel(R) Core(TM) i7-7700 CPU @ 3.60GHz
          vendor: Intel Corp.
          physical id: 12
          bus info: cpu@0
          version: Intel(R) Core(TM) i7-7700 CPU @ 3.60GHz
          serial: To Be Filled By O.E.M.
          slot: CPU
          size: 800MHz
          capacity: 4200MHz
          width: 64 bits
          clock: 100MHz
          capabilities: lm fpu fpu_exception wp vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp x86-64 constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch cpuid_fault epb invpcid_single pti ssbd ibrs ibpb stibp tpr_shadow vnmi flexpriority ept vpid ept_ad fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm mpx rdseed adx smap clflushopt intel_pt xsaveopt xsavec xgetbv1 xsaves dtherm ida arat pln pts hwp hwp_notify hwp_act_window hwp_epp md_clear flush_l1d arch_capabilities cpufreq
          configuration: cores=4 enabledcores=4 threads=8

https://askubuntu.com/questions/916382/ubuntu-get-actual-current-cpu-clock-speed

lscpu | grep MHz

CPU MHz:      800.010
CPU max MHz:  4200.0000
CPU min MHz:  800.0000

But, even under 100% load, the max freq is capped at 800 MHz:

# watch -n1 "cat /proc/cpuinfo | grep -i Hz"

model name : Intel(R) Core(TM) i7-7700 CPU @ 3.60GHz
cpu MHz : 800.016
model name : Intel(R) Core(TM) i7-7700 CPU @ 3.60GHz
cpu MHz : 800.023
model name : Intel(R) Core(TM) i7-7700 CPU @ 3.60GHz
cpu MHz : 800.032
model name : Intel(R) Core(TM) i7-7700 CPU @ 3.60GHz
cpu MHz : 800.024
model name : Intel(R) Core(TM) i7-7700 CPU @ 3.60GHz
cpu MHz : 800.011
model name : Intel(R) Core(TM) i7-7700 CPU @ 3.60GHz
cpu MHz : 800.021
model name : Intel(R) Core(TM) i7-7700 CPU @ 3.60GHz
cpu MHz : 800.010
model name : Intel(R) Core(TM) i7-7700 CPU @ 3.60GHz
cpu MHz : 800.038

sudo apt install i7z -y

sudo i7z

Cpu speed from cpuinfo 3600.00Mhz
cpuinfo might be wrong if cpufreq is enabled. To guess correctly try estimating via tsc
Linux's inbuilt cpu_khz code emulated now
True Frequency (without accounting Turbo) 3599 MHz
  CPU Multiplier 36x || Bus clock frequency (BCLK) 99.97 MHz

Socket [0] - [physical cores=4, logical cores=8, max online cores ever=4]
  TURBO ENABLED on 4 Cores, Hyper Threading ON
  Max Frequency without considering Turbo 3698.97 MHz (99.97 x [37])
  Max TURBO Multiplier (if Enabled) with 1/2/3/4 Cores is 36x/36x/36x/36x
  Real Current Frequency 799.78 MHz [99.97 x 8.00] (Max of below)
  Core [core-id] :Actual Freq (Mult.)  C0%   Halt(C1)%  C3 %  C6 %  Temp  VCore
  Core 1 [0]:     799.78 (8.00x)       13.3  97         0     0     27    0.6636
  Core 2 [1]:     799.78 (8.00x)       25    90.1       1.4   2.91  27    0.6636
  Core 3 [2]:     799.77 (8.00x)       3.61  95.7       1     2.55  27    0.6639
  Core 4 [3]:     799.78 (8.00x)       100   77.8       0     0     29    0.6644

Via cpufreq-info I found out that the governor was set to powersave, so I immediately switched to performance (and rebooted):

https://askubuntu.com/a/1049313/181869
sudo apt install cpufrequtils -y
echo 'GOVERNOR="performance"' | sudo tee /etc/default/cpufrequtils
sudo systemctl disable ondemand

cpufreq-set -g performance -r

cpufreq-info

analyzing CPU 0:
  driver: intel_pstate
  CPUs which run at the same hardware frequency: 0
  CPUs which need to have their frequency coordinated by software: 0
  maximum transition latency: 4294.55 ms.
  hardware limits: 800 MHz - 4.20 GHz
  available cpufreq governors: performance, powersave
  current policy: frequency should be within 800 MHz and 4.20 GHz.
                  The governor "performance" may decide which speed to use
                  within this range.
  current CPU frequency is 800 MHz.

<repeat for each other core>

cat /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor

performance
performance
performance
performance
performance
performance
performance
performance

Disabling ondemand was paramount: if I didn't do it, ondemand switched the governor back to powersave.

Still, it didn't make any difference MHz-wise: the cores are still capped at 800 MHz, no matter how much load there is.
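To take the same reading straight from sysfs for every core at once, something like the following can be used. This is only a minimal sketch, assuming the standard cpufreq layout under /sys/devices/system/cpu; the function name `cpu_freq_mhz` and its optional base-directory argument are made up for illustration.

```shell
# cpu_freq_mhz [BASE_DIR]
# Hypothetical helper: prints "<cpu>: <MHz> MHz (<governor>)" for every CPU
# that exposes cpufreq. BASE_DIR defaults to /sys/devices/system/cpu and is
# only a parameter so the function can be pointed at a copy of the tree.
cpu_freq_mhz() {
    base="${1:-/sys/devices/system/cpu}"
    for dir in "$base"/cpu[0-9]*; do
        # Skip entries without a readable frequency file
        [ -r "$dir/cpufreq/scaling_cur_freq" ] || continue
        khz=$(cat "$dir/cpufreq/scaling_cur_freq")
        gov=$(cat "$dir/cpufreq/scaling_governor" 2>/dev/null)
        # scaling_cur_freq is reported in kHz; integer MHz is enough here
        echo "${dir##*/}: $((khz / 1000)) MHz (${gov:-unknown})"
    done
}
```

Calling `cpu_freq_mhz` with no argument reads the live system. Note that with some drivers this file reflects what was requested rather than what the hardware delivers, so it is worth comparing against a second tool.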

The temps are super-low, so it's not a thermal-throttle:

# sudo apt install lm-sensors -y && sudo sensors-detect
...

sudo sensors

power_meter-acpi-0
Adapter: ACPI interface
power1:       23.00 W  (interval = 4294967.29 s)

coretemp-isa-0000
Adapter: ISA adapter
Package id 0:  +28.0°C  (high = +80.0°C, crit = +100.0°C)
Core 0:        +25.0°C  (high = +80.0°C, crit = +100.0°C)
Core 1:        +27.0°C  (high = +80.0°C, crit = +100.0°C)
Core 2:        +26.0°C  (high = +80.0°C, crit = +100.0°C)
Core 3:        +28.0°C  (high = +80.0°C, crit = +100.0°C)

acpitz-acpi-0
Adapter: ACPI interface
temp1:        +27.0°C  (crit = +119.0°C)
temp2:        +27.0°C  (crit = +119.0°C)

pch_skylake-virtual-0
Adapter: Virtual device
temp1:        +32.5°C

i350bb-pci-0200
Adapter: PCI adapter
loc1:         +54.0°C  (high = +120.0°C, crit = +110.0°C)

max_perf_pct is not limited:

# cd /sys/devices/system/cpu/intel_pstate && grep -r .
no_turbo:0
num_pstates:35
status:active
turbo_pct:18
max_perf_pct:100
hwp_dynamic_boost:0
min_perf_pct:100

Note: I manually changed min_perf_pct above to 100 via nano.

I also tried to set GRUB_CMDLINE_LINUX_DEFAULT="intel_pstate=disable" in grub. It made one small difference: the active driver is now acpi-cpufreq, and cpufreq-info now claims that the CPU is running at 3.60 GHz (as it should). But this is just wrong: every other command I ran confirms that the CPU is still at 800 MHz.

# cpufreq-info
cpufrequtils 008: cpufreq-info (C) Dominik Brodowski 2004-2009
Report errors and bugs to cpufreq@vger.kernel.org, please.
analyzing CPU 0:
  driver: acpi-cpufreq
  CPUs which run at the same hardware frequency: 0
  CPUs which need to have their frequency coordinated by software: 0
  maximum transition latency: 10.0 us.
  hardware limits: 800 MHz - 3.60 GHz
  available frequency steps: 3.60 GHz, 3.60 GHz, 3.40 GHz, 3.20 GHz, 3.00 GHz, 2.80 GHz, 2.60 GHz, 2.40 GHz, 2.20 GHz, 2.00 GHz, 1.80 GHz, 1.60 GHz, 1.40 GHz, 1.20 GHz, 1000 MHz, 800 MHz
  available cpufreq governors: conservative, ondemand, userspace, powersave, performance, schedutil
  current policy: frequency should be within 800 MHz and 3.60 GHz.
                  The governor "performance" may decide which speed to use
                  within this range.
  current CPU frequency is 3.60 GHz (asserted by call to hardware).
  cpufreq stats: 3.60 GHz:100.00%, 3.60 GHz:0.00%, 3.40 GHz:0.00%, 3.20 GHz:0.00%, 3.00 GHz:0.00%, 2.80 GHz:0.00%, 2.60 GHz:0.00%, 2.40 GHz:0.00%, 2.20 GHz:0.00%, 2.00 GHz:0.00%, 1.80 GHz:0.00%, 1.60 GHz:0.00%, 1.40 GHz:0.00%, 1.20 GHz:0.00%, 1000 MHz:0.00%, 800 MHz:0.00%  (1)

....

As suggested by Doug in the comments:

# sudo apt install linux-tools-common linux-tools-generic -y
# sudo turbostat --Summary --show Busy%,Bzy_MHz,IRQ,PkgWatt,PkgTmp,RAMWatt,GFXWatt,CorWatt --interval 15
turbostat version 19.08.31 - Len Brown <lenb@kernel.org>
CPUID(0): GenuineIntel 0x16 CPUID levels; 0x80000008 xlevels; family:model:stepping 0x6:9e:9 (6:158:9)
CPUID(1): SSE3 MONITOR SMX EIST TM2 TSC MSR ACPI-TM HT TM
CPUID(6): APERF, TURBO, DTS, PTM, HWP, HWPnotify, HWPwindow, HWPepp, No-HWPpkg, EPB
cpu6: MSR_IA32_MISC_ENABLE: 0x00850089 (TCC EIST MWAIT PREFETCH TURBO)
CPUID(7): SGX
cpu6: MSR_IA32_FEATURE_CONTROL: 0x00000005 (Locked )
CPUID(0x15): eax_crystal: 2 ebx_tsc: 300 ecx_crystal_hz: 0
TSC: 3600 MHz (24000000 Hz * 300 / 2 / 1000000)
CPUID(0x16): base_mhz: 3600 max_mhz: 3600 bus_mhz: 100
cpu6: MSR_MISC_PWR_MGMT: 0x00401cc0 (ENable-EIST_Coordination DISable-EPB DISable-OOB)
RAPL: 4033 sec. Joule Counter Range, at 65 Watts
cpu6: MSR_PLATFORM_INFO: 0x88080838f1012400
8 * 100.0 = 800.0 MHz max efficiency frequency
36 * 100.0 = 3600.0 MHz base frequency
cpu6: MSR_IA32_POWER_CTL: 0x403c005d (C1E auto-promotion: DISabled)
cpu6: MSR_TURBO_RATIO_LIMIT: 0x24242424
36 * 100.0 = 3600.0 MHz max turbo 4 active cores
36 * 100.0 = 3600.0 MHz max turbo 3 active cores
36 * 100.0 = 3600.0 MHz max turbo 2 active cores
36 * 100.0 = 3600.0 MHz max turbo 1 active cores
cpu6: MSR_CONFIG_TDP_NOMINAL: 0x00000024 (base_ratio=36)
cpu6: MSR_CONFIG_TDP_LEVEL_1: 0x00000000 ()
cpu6: MSR_CONFIG_TDP_LEVEL_2: 0x00000000 ()
cpu6: MSR_CONFIG_TDP_CONTROL: 0x80000000 ( lock=1)
cpu6: MSR_TURBO_ACTIVATION_RATIO: 0x00000000 (MAX_NON_TURBO_RATIO=0 lock=0)
cpu6: MSR_PKG_CST_CONFIG_CONTROL: 0x7e008006 (UNdemote-C3, UNdemote-C1, demote-C3, demote-C1, locked, pkg-cstate-limit=6 (pc8))
cpu6: cpufreq driver: intel_pstate
cpu6: cpufreq governor: performance
cpufreq intel_pstate no_turbo: 0
cpu6: MSR_MISC_FEATURE_CONTROL: 0x00000000 (L2-Prefetch L2-Prefetch-pair L1-Prefetch L1-IP-Prefetch)
cpu0: MSR_PM_ENABLE: 0x00000001 (HWP)
cpu0: MSR_HWP_CAPABILITIES: 0x0108242a (high 42 guar 36 eff 8 low 1)
cpu0: MSR_HWP_REQUEST: 0x00002a2a (min 42 max 42 des 0 epp 0x0 window 0x0 pkg 0x0)
cpu0: MSR_HWP_INTERRUPT: 0x00000000 (Dis_Guaranteed_Perf_Change, Dis_Excursion_Min)
cpu0: MSR_HWP_STATUS: 0x00000004 (No-Guaranteed_Perf_Change, No-Excursion_Min)
cpu0: MSR_IA32_ENERGY_PERF_BIAS: 0x00000006 (balanced)
cpu0: MSR_RAPL_POWER_UNIT: 0x000a0e03 (0.125000 Watts, 0.000061 Joules, 0.000977 sec.)
cpu0: MSR_PKG_POWER_INFO: 0x00000208 (65 W TDP, RAPL 0 - 0 W, 0.000000 sec.)
cpu0: MSR_PKG_POWER_LIMIT: 0x8042028a001b8208 (locked)
cpu0: PKG Limit #1: ENabled (65.000000 Watts, 8.000000 sec, clamp ENabled)
cpu0: PKG Limit #2: DISabled (81.250000 Watts, 0.002441* sec, clamp DISabled)
cpu0: MSR_DRAM_POWER_LIMIT: 0x805400de00000000 (UNlocked)
cpu0: DRAM Limit: DISabled (0.000000 Watts, 0.000977 sec, clamp DISabled)
cpu0: MSR_PP0_POLICY: 0
cpu0: MSR_PP0_POWER_LIMIT: 0x00000000 (UNlocked)
cpu0: Cores Limit: DISabled (0.000000 Watts, 0.000977 sec, clamp DISabled)
cpu0: MSR_PP1_POLICY: 0
cpu0: MSR_PP1_POWER_LIMIT: 0x00000000 (UNlocked)
cpu0: GFX Limit: DISabled (0.000000 Watts, 0.000977 sec, clamp DISabled)
cpu0: MSR_IA32_TEMPERATURE_TARGET: 0x00641400 (100 C)
cpu0: MSR_IA32_PACKAGE_THERM_STATUS: 0x884a010c (26 C)
cpu0: MSR_IA32_PACKAGE_THERM_INTERRUPT: 0x00d5c100 (15 C, 35 C)
cpu6: MSR_PKGC3_IRTL: 0x0000884e (valid, 79872 ns)
cpu6: MSR_PKGC6_IRTL: 0x00008876 (valid, 120832 ns)
cpu6: MSR_PKGC7_IRTL: 0x00008894 (valid, 151552 ns)
cpu6: MSR_PKGC8_IRTL: 0x000088fa (valid, 256000 ns)
cpu6: MSR_PKGC9_IRTL: 0x0000894c (valid, 339968 ns)
cpu6: MSR_PKGC10_IRTL: 0x00008bf2 (valid, 1034240 ns)
Busy%   Bzy_MHz IRQ     PkgTmp  PkgWatt CorWatt GFXWatt RAMWatt
4.48    800     16059   26      2.08    0.34    0.00    1.56
5.81    800     17468   26      2.12    0.38    0.00    1.57
5.78    800     15871   26      2.12    0.38    0.00    1.57
4.55    800     13702   26      2.06    0.32    0.00    1.56
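As a cross-check on the header above, the byte-wide fields turbostat prints can be recovered by hand. This is a minimal sketch, assuming the low 32 bits of these MSRs are four 8-bit fields (which matches the decoded values turbostat shows next to each raw value); `msr_low_bytes` is a made-up helper name.

```shell
# msr_low_bytes VALUE
# Hypothetical helper: prints the four low bytes of VALUE in decimal,
# lowest byte first. VALUE may be given in hex (0x...) or decimal.
msr_low_bytes() {
    v=$(( $1 ))
    echo "$(( v & 0xff )) $(( (v >> 8) & 0xff )) $(( (v >> 16) & 0xff )) $(( (v >> 24) & 0xff ))"
}
```

For MSR_HWP_REQUEST (0x00002a2a) this prints 42 42 0 0 (min, max, desired, EPP), for MSR_HWP_CAPABILITIES (0x0108242a) it prints 42 36 8 1 (high, guaranteed, efficient, low), and for MSR_TURBO_RATIO_LIMIT (0x24242424) it prints 36 36 36 36. So the performance request sits at 42 while Bzy_MHz still reports 800.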

Some more stuff I tried:

https://www.reddit.com/r/GarudaLinux/comments/l73vfz/autocpufreq_stuck_at_800_mhz/
# auto-cpufreq --log
auto-cpufreq: command not found

ls -l /sys/class/power_supply/

total 0

https://askubuntu.com/questions/1307773/in-ubuntu-20-10-cpu-clock-fixed-at-800mhz

cat /sys/devices/system/cpu/cpu*/cpufreq/bios_limit

cat: '/sys/devices/system/cpu/cpu*/cpufreq/bios_limit': No such file or directory

service thermald status

Unit thermald.service could not be found.

:/sys/devices/system/cpu/cpu0/cpufreq# grep -r .
energy_performance_available_preferences:default performance balance_performance balance_power power
scaling_min_freq:800000
scaling_available_governors:performance powersave
base_frequency:3600000
scaling_governor:performance
cpuinfo_max_freq:4200000
related_cpus:0
scaling_cur_freq:800010
scaling_setspeed:<unsupported>
affected_cpus:0
scaling_max_freq:4200000
cpuinfo_transition_latency:0
energy_performance_preference:performance
scaling_driver:intel_pstate
cpuinfo_min_freq:800000

I've read A LOT of other Q&As here, plus forum and Reddit threads, all reporting problems similar to mine, but nothing seems to work.

  • The acpi-cpufreq CPU frequency scaling driver tells you what it is asking for not what you are getting, which is why it says 3.6GHz. Go back to the intel_pstate CPU frequency scaling driver and try disabling hwp, intel_pstate=no_hwp on the grub command line. Note that turbostat (linux-tools-common package, I think) is a good monitoring tool for this type of work. And this command is suggested: sudo turbostat --Summary --quiet --show Busy%,Bzy_MHz,IRQ,PkgWatt,PkgTmp,RAMWatt,GFXWatt,CorWatt --interval 15. Deleting the --quiet option can give insight as to a cause for any CPU throttling. – Doug Smythies Aug 24 '22 at 18:33
  • Hi @DougSmythies, thanks for your suggestion. I added the output of turbostat to my question. At a glance, nothing wrong jumps out at me. Do you see something I missed? I also tried intel_pstate=no_hwp: no change – Dr. Gianluigi Zane Zanettini Aug 25 '22 at 07:41

1 Answer

From the turbostat header information, these two lines are revealing:

cpu0: MSR_IA32_PACKAGE_THERM_STATUS: 0x884a010c (26 C)
cpu0: MSR_IA32_PACKAGE_THERM_INTERRUPT: 0x00d5c100 (15 C, 35 C)

For unknown reasons your thermal interrupt register (IA32_PACKAGE_THERM_INTERRUPT, 0x1B2) has been set to cause a level 2 over-temperature declaration if the processor package temperature is above 15 degrees C and to cause a level 1 over-temperature declaration if the processor package temperature is above 35 degrees C. The two thresholds are enabled. The offsets are relative to TCC (100 degrees C) and are 85 degrees C and 65 degrees C, so my best guess is that the desired configuration was level 2 = 85 and level 1 = 65. In my opinion 65 is too low, and I would suggest 75. Reverse encoding these desired numbers gives 0X8F9300 instead of 0XD5C100.
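The decoding above can be reproduced with a little shell arithmetic. This is a rough sketch, assuming the layout used in this answer: threshold #1 offset in bits 14:8 with its enable in bit 15, threshold #2 offset in bits 22:16 with its enable in bit 23, and offsets counted down from TCC. The helper names `therm_thresholds` and `therm_encode` are invented for illustration, and the encoder leaves every other bit at zero.

```shell
# therm_thresholds VALUE [TCC]
# Hypothetical helper: prints "<threshold #2> <threshold #1>" in degrees C,
# the same order turbostat uses. TCC defaults to 100.
therm_thresholds() {
    v=$(( $1 )); tcc=${2:-100}
    echo "$(( tcc - ((v >> 16) & 0x7f) )) $(( tcc - ((v >> 8) & 0x7f) ))"
}

# therm_encode T2_C T1_C [TCC]
# Hypothetical helper: builds a value with both thresholds enabled
# (bits 23 and 15 set) and all other bits left at zero.
therm_encode() {
    tcc=${3:-100}
    printf '0x%06x\n' $(( ((0x80 | (tcc - $1)) << 16) | ((0x80 | (tcc - $2)) << 8) ))
}
```

`therm_thresholds 0x00d5c100` prints 15 35, matching turbostat. Under the same layout, `therm_thresholds 0x8f9300` prints 85 81, while `therm_encode 85 75` gives 0x8f9900, so it is worth decoding whatever value you intend to write and checking it against the Software Developer's Manual first.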

You can try to find the configuration mistake and fix it, or as a test, and with all due caution, modify the MSR directly. To modify MSRs (Model-Specific Registers) two things are required:

  • The msr module needs to be loaded, which it will be if turbostat has been run. Otherwise sudo modprobe msr will load it.
  • If your kernel is new enough, you need to enable MSR writes, either via kernel command line during boot, msr.allow_writes=on, or on the fly, echo on | sudo tee /sys/module/msr/parameters/allow_writes. Based on your version of turbostat, your kernel might be older.

The command would be sudo wrmsr 0x1b2 0x8f9300 and to check it use turbostat or sudo rdmsr 0x1b2.

Now, the thermal status register (IA32_PACKAGE_THERM_STATUS, 0x1B1) is indicating that the PROCHOT bit is asserted and there is a level 2 thermal condition. I am assuming the PROCHOT bit is due to the above configuration issue, but I might be incorrect.
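The status bits mentioned above can be picked out of the raw value in the same way. This is a sketch only, assuming the package thermal status layout I am reading from: PROCHOT event in bit 2, threshold #1 status in bit 6, threshold #2 status in bit 8, and the digital readout (offset below TCC) in bits 22:16. `pkg_therm_status` is a made-up name.

```shell
# pkg_therm_status VALUE [TCC]
# Hypothetical helper: prints the PROCHOT and threshold status bits plus the
# package temperature derived from the digital readout. TCC defaults to 100.
pkg_therm_status() {
    v=$(( $1 )); tcc=${2:-100}
    echo "prochot=$(( (v >> 2) & 1 )) thr1=$(( (v >> 6) & 1 )) thr2=$(( (v >> 8) & 1 )) temp=$(( tcc - ((v >> 16) & 0x7f) ))C"
}
```

`pkg_therm_status 0x884a010c` prints prochot=1 thr1=0 thr2=1 temp=26C, which is consistent with a 26 degree package sitting above the 15 degree threshold and below the 35 degree one.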

Note: I was unable to test this with my processor (I tried), because there are hardware dependencies for proper operation (in my case, my processor did not throttle for 0x1b2 = 0xd5C100).

Reference: Intel® 64 and IA-32 Architectures Software Developer’s Manual

Doug Smythies
  • Thanks Doug! I changed a few settings in my firmware: Disable PROCHOT# Output, Bi-directional PROCHOT#, Intel SpeedStep and CPU C-States were Enabled; I set them to Disabled and tadaaa, 3.6 GHz https://i.postimg.cc/CxG4cGN2/z-Shot-1661457138.png ! I also ran a few benchmarks, and the difference in performance is night and day! I'm a bit worried about the temps, so I'm generating some load: it climbed quickly to 60°, but then it stabilized there. I think we are done! Thanks again! Should you ever come to Ferrara (Italy), don't forget to stop by for a glass of "lambrusco" – Dr. Gianluigi Zane Zanettini Aug 25 '22 at 20:18
  • I would recommend that you re-enable C-States (basically idle) and Intel SpeedStep (CPU frequency scaling). Your processor will run cooler and yet still be responsive to load demands as required. If using the intel_pstate CPU frequency scaling driver, then the powersave frequency scaling governor should also be good enough. Note that intel_pstate/powersave is not the same as acpi-cpufreq/powersave. You might also want to try to fix your original setup, or at least test that you have some thermal protection against real over-temperature. – Doug Smythies Aug 25 '22 at 20:27