How to lock all CPUs to C-State 0 with the cpu-partitioning profile in Red Hat Enterprise Linux and Red Hat OpenStack Platform?
Environment
Red Hat Enterprise Linux 7
Red Hat OpenStack Platform 10
Issue
How to lock all CPUs to C-State 0 with the cpu-partitioning profile in Red Hat Enterprise Linux and Red Hat OpenStack Platform?
When using cpu-partitioning, CPUs still go into C1:
[root@overcloud-compute-0 ~]# tuned-adm profile cpu-partitioning
[root@overcloud-compute-0 ~]# turbostat sleep 5
5.001639 sec
Core CPU Avg_MHz Busy% Bzy_MHz TSC_MHz IRQ SMI CPU%c1 CPU%c3 CPU%c6 CPU%c7 CoreTmp PkgTmp PkgWatt RAMWatt PKG_% RAM_%
- - 242 10.04 2400 2209 0 0 89.96 0.00 0.00 0.00 61 65 64.87 9.67 0.00 0.00
0 0 2430 100.00 2400 2228 0 0 0.00 0.00 0.00 0.00 59 65 35.04 3.92 0.00 0.00
0 20 2422 100.00 2400 2220 0 0 0.00
1 2 26 1.06 2400 2220 0 0 98.94 0.00 0.00 0.00 57
1 22 0 0.00 2401 2220 0 0 100.00
2 4 0 0.00 2400 2220 0 0 100.00 0.00 0.00 0.00 57
2 24 0 0.00 2400 2220 0 0 100.00
3 6 0 0.00 2401 2220 0 0 100.00 0.00 0.00 0.00 58
3 26 0 0.00 2402 2220 0 0 100.00
4 8 0 0.00 2402 2220 0 0 100.00 0.00 0.00 0.00 57
4 28 0 0.00 2400 2220 0 0 100.00
8 10 0 0.00 2401 2220 0 0 100.00 0.00 0.00 0.00 57
8 30 0 0.00 2403 2220 0 0 100.00
9 12 0 0.00 2402 2220 0 0 100.00 0.00 0.00 0.00 57
9 32 0 0.00 2403 2220 0 0 100.00
10 14 0 0.00 2401 2220 0 0 100.00 0.00 0.00 0.00 56
10 34 0 0.00 2401 2220 0 0 100.00
Resolution
Note: This article shows what can be achieved with a customized tuned profile. It neither endorses nor recommends the following settings. Locking all CPUs to C0 has negative consequences, including higher power consumption and increased heat generation. Locking all CPUs to C0 might also void the CPU warranty. Check with the respective hardware vendor.
Based on How to create a customized tuned profile (https://access.redhat.com/solutions/1305833), you can create a customized cpu-partitioning profile that locks all CPUs to C0.
Create the profile:
mkdir /etc/tuned/cpu-partitioning-disable-cstates/
cat <<'EOF'>/etc/tuned/cpu-partitioning-disable-cstates/tuned.conf
[main]
include=cpu-partitioning
[cpu]
force_latency=0
EOF
Enable the profile:
tuned-adm profile cpu-partitioning-disable-cstates
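To confirm that the profile took effect, one option is to look at the PM QoS latency target in addition to the active profile. The following is a minimal sketch, assuming that the kernel exposes /dev/cpu_dma_latency and that reading it returns the current target as a native-endian 32-bit integer in microseconds. read_dma_latency is a hypothetical helper name and is not part of tuned.

```shell
# Hypothetical helper: print the PM QoS CPU latency target in microseconds.
# Assumption: the file holds one native-endian 32-bit signed integer.
# Usage: read_dma_latency [FILE]   (FILE defaults to /dev/cpu_dma_latency)
read_dma_latency() {
    file=${1:-/dev/cpu_dma_latency}
    # Decode the first 4 bytes as a decimal integer and strip od's padding.
    od -An -td4 -N4 "$file" | tr -d ' \n'
    echo
}
```

Run as root, tuned-adm active should name cpu-partitioning-disable-cstates, and read_dma_latency would be expected to print 0 with force_latency=0 if the assumptions above hold on the system in question.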
Note: If you get the error Cannot load profile 'cpu-partitioning-disable-cstates': Cannot find profile 'cpu-partitioning' in '['/etc/tuned', '/usr/lib/tuned']', install the tuned-profiles-cpu-partitioning package:
yum install tuned-profiles-cpu-partitioning
Note 2: If you also get the error Cannot load profile 'cpu-partitioning-disable-cstates': Assertion 'isolated_cores are set' failed., check /etc/tuned/cpu-partitioning-variables.conf and set isolated_cores to the appropriate cores:
grep proc /proc/cpuinfo
vim /etc/tuned/cpu-partitioning-variables.conf
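Before re-enabling the profile, it can help to confirm that isolated_cores is set on an uncommented line. The following is a minimal sketch, assuming the variables file uses plain key=value lines with # for comments. get_isolated_cores is a hypothetical helper name.

```shell
# Hypothetical helper: print the value of the last uncommented
# isolated_cores= line, or nothing if the variable is not set.
# Usage: get_isolated_cores [CONF]
get_isolated_cores() {
    conf=${1:-/etc/tuned/cpu-partitioning-variables.conf}
    # Lines starting with '#' do not match, so commented examples are ignored.
    sed -n 's/^[[:space:]]*isolated_cores[[:space:]]*=[[:space:]]*//p' "$conf" | tail -n 1
}
```

An empty result suggests that the assertion 'isolated_cores are set' will still fail when the profile is loaded.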
Root Cause
Tuned uses the PM QoS interface to lock C-states, which behaves differently from the parameters passed on the kernel command line. The Tuned configuration parameter force_latency specifies the maximum latency required, in microseconds (µs). The *_idle driver takes this parameter into account: if the latency of a C-state is higher than the Tuned force_latency setting, the idle driver does not allow the CPU to transition to that C-state.
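The comparison described above can be illustrated against the latencies that the kernel reports for each C-state. The following is a minimal sketch, assuming the cpuidle sysfs layout (stateN/name and stateN/latency under /sys/devices/system/cpu/cpu0/cpuidle). classify_cstates is a hypothetical helper name, and the script only mirrors the rule stated in this article rather than the idle driver's actual code.

```shell
# Hypothetical helper: for each C-state, print "name latency verdict", where
# the verdict is "blocked" if the latency exceeds the given limit.
# Usage: classify_cstates MAX_LATENCY_US [CPUIDLE_DIR]
classify_cstates() {
    limit=$1
    dir=${2:-/sys/devices/system/cpu/cpu0/cpuidle}
    for state in "$dir"/state*; do
        # Skip silently if the directory has no readable state entries.
        [ -r "$state/latency" ] || continue
        name=$(cat "$state/name")
        latency=$(cat "$state/latency")
        if [ "$latency" -gt "$limit" ]; then
            verdict=blocked
        else
            verdict=allowed
        fi
        printf '%s %s %s\n' "$name" "$latency" "$verdict"
    done
}
```

For example, classify_cstates 0 corresponds to force_latency=0: only states that report a latency of 0 are listed as allowed.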
For further details, see https://bugzilla.redhat.com/show_bug.cgi?id=1556724#c15
The acpi_idle driver does not allow locking to C0, so the following two boot command lines have the same effect:
processor.max_cstate=0 intel_idle.max_cstate=0
processor.max_cstate=1 intel_idle.max_cstate=0
That is:
intel_idle.max_cstate=0 disables the intel_idle driver, so the acpi_idle driver is used.
processor.max_cstate=0 is silently changed to max_cstate=1 when the acpi_idle driver is used, so C1 is always used.
Adding 'idle=poll' adds a polling loop that does not allow the CPU to sleep, so the CPU appears to be locked in C0. However, this significantly increases power consumption, and the CPU may become very hot. The kernel documentation describes this setting as "not recommended" (see idle=poll in e.g. [1]).
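To see which of these cases applies to a given host, it can help to print the idle driver in use together with any related boot parameters. The following is a minimal sketch, assuming that the kernel exposes the driver name in /sys/devices/system/cpu/cpuidle/current_driver. idle_config is a hypothetical helper name.

```shell
# Hypothetical helper: print the cpuidle driver in use and any C-state related
# parameters found on the kernel command line.
# Usage: idle_config [DRIVER_FILE] [CMDLINE_FILE]
idle_config() {
    driver_file=${1:-/sys/devices/system/cpu/cpuidle/current_driver}
    cmdline_file=${2:-/proc/cmdline}
    if [ -r "$driver_file" ]; then
        printf 'driver: %s\n' "$(cat "$driver_file")"
    else
        printf 'driver: unknown\n'
    fi
    # One parameter per line, keeping only the ones discussed in this article.
    tr ' ' '\n' < "$cmdline_file" |
        grep -E '^(processor\.max_cstate|intel_idle\.max_cstate|idle)=' || true
}
```

A host that reports driver: acpi_idle together with processor.max_cstate=0 would, per the explanation above, still use C1.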
Tuned uses a different mechanism for locking C-states: PM QoS. It appears to allow locking the CPU to C0, but if the acpi_idle driver does not allow it, there is probably a good reason for not doing it. It is also possible that this behavior is a bug in the acpi_idle driver and that the driver should not allow PM QoS to force C0.
In Tuned we switched to C1 due to bug 1013085. IIRC there was some feedback from Intel regarding this, but I can't remember more at the moment. Jeremy can probably say more about this change.
If you really need to lock CPUs to C0, you can customize the Tuned profile. This is easy and documented; there is an article about it [2].
Diagnostic Steps
Compare the cpu-partitioning profile to cpu-partitioning-disable-cstates:
[root@overcloud-compute-0 ~]# tuned-adm profile cpu-partitioning
[root@overcloud-compute-0 ~]# turbostat sleep 5
5.001639 sec
Core CPU Avg_MHz Busy% Bzy_MHz TSC_MHz IRQ SMI CPU%c1 CPU%c3 CPU%c6 CPU%c7 CoreTmp PkgTmp PkgWatt RAMWatt PKG_% RAM_%
- - 242 10.04 2400 2209 0 0 89.96 0.00 0.00 0.00 61 65 64.87 9.67 0.00 0.00
0 0 2430 100.00 2400 2228 0 0 0.00 0.00 0.00 0.00 59 65 35.04 3.92 0.00 0.00
0 20 2422 100.00 2400 2220 0 0 0.00
1 2 26 1.06 2400 2220 0 0 98.94 0.00 0.00 0.00 57
1 22 0 0.00 2401 2220 0 0 100.00
2 4 0 0.00 2400 2220 0 0 100.00 0.00 0.00 0.00 57
2 24 0 0.00 2400 2220 0 0 100.00
3 6 0 0.00 2401 2220 0 0 100.00 0.00 0.00 0.00 58
3 26 0 0.00 2402 2220 0 0 100.00
4 8 0 0.00 2402 2220 0 0 100.00 0.00 0.00 0.00 57
4 28 0 0.00 2400 2220 0 0 100.00
8 10 0 0.00 2401 2220 0 0 100.00 0.00 0.00 0.00 57
8 30 0 0.00 2403 2220 0 0 100.00
9 12 0 0.00 2402 2220 0 0 100.00 0.00 0.00 0.00 57
9 32 0 0.00 2403 2220 0 0 100.00
10 14 0 0.00 2401 2220 0 0 100.00 0.00 0.00 0.00 56
10 34 0 0.00 2401 2220 0 0 100.00
11 16 0 0.00 2401 2220 0 0 100.00 0.00 0.00 0.00 56
11 36 0 0.00 2402 2220 0 0 100.00
12 18 0 0.00 2402 2220 0 0 100.00 0.00 0.00 0.00 56
12 38 0 0.00 2402 2220 0 0 100.00
0 1 2403 100.00 2400 2203 0 0 0.00 0.00 0.00 0.00 61 64 29.83 5.75 0.00 0.00
0 21 2397 100.00 2400 2198 0 0 0.00
1 3 0 0.00 2402 2198 0 0 100.00 0.00 0.00 0.00 59
1 23 0 0.00 2401 2198 0 0 100.00
2 5 0 0.00 2401 2198 0 0 100.00 0.00 0.00 0.00 59
2 25 0 0.00 2401 2198 0 0 100.00
3 7 0 0.00 2401 2198 0 0 100.00 0.00 0.00 0.00 58
3 27 0 0.00 2385 2198 0 0 100.00
4 9 0 0.00 2402 2197 0 0 100.00 0.00 0.00 0.00 58
4 29 0 0.00 2403 2197 0 0 100.00
8 11 0 0.00 2402 2197 0 0 100.00 0.00 0.00 0.00 58
8 31 0 0.00 2401 2197 0 0 100.00
9 13 0 0.00 2400 2197 0 0 100.00 0.00 0.00 0.00 57
9 33 0 0.00 2401 2197 0 0 100.00
10 15 0 0.00 2402 2197 0 0 100.00 0.00 0.00 0.00 59
10 35 0 0.00 2401 2197 0 0 100.00
11 17 0 0.00 2401 2197 0 0 100.00 0.00 0.00 0.00 57
11 37 0 0.00 2401 2197 0 0 100.00
12 19 0 0.00 2400 2197 0 0 100.00 0.00 0.00 0.00 58
12 39 0 0.02 2400 2197 0 0 99.98
[root@overcloud-compute-0 ~]# tuned-adm profile cpu-partitioning-disable-cstates
[root@overcloud-compute-0 ~]# turbostat sleep 5
5.001900 sec
Core CPU Avg_MHz Busy% Bzy_MHz TSC_MHz IRQ SMI CPU%c1 CPU%c3 CPU%c6 CPU%c7 CoreTmp PkgTmp PkgWatt RAMWatt PKG_% RAM_%
- - 2411 100.00 2400 2210 0 0 0.00 0.00 0.00 0.00 63 67 86.52 9.61 0.00 0.00
0 0 2426 100.00 2400 2224 0 0 0.00 0.00 0.00 0.00 61 67 45.71 3.90 0.00 0.00
0 20 2425 100.00 2400 2223 0 0 0.00
1 2 2425 100.00 2400 2223 0 0 0.00 0.00 0.00 0.00 61
1 22 2425 100.00 2400 2223 0 0 0.00
2 4 2425 100.00 2400 2223 0 0 0.00 0.00 0.00 0.00 62
2 24 2425 100.00 2400 2223 0 0 0.00
3 6 2425 100.00 2400 2223 0 0 0.00 0.00 0.00 0.00 61
3 26 2425 100.00 2400 2223 0 0 0.00
4 8 2425 100.00 2400 2223 0 0 0.00 0.00 0.00 0.00 61
4 28 2425 100.00 2400 2223 0 0 0.00
8 10 2425 100.00 2400 2223 0 0 0.00 0.00 0.00 0.00 60
8 30 2425 100.00 2400 2223 0 0 0.00
9 12 2425 100.00 2400 2223 0 0 0.00 0.00 0.00 0.00 61
9 32 2425 100.00 2400 2223 0 0 0.00
10 14 2425 100.00 2400 2223 0 0 0.00 0.00 0.00 0.00 61
10 34 2425 100.00 2400 2223 0 0 0.00
11 16 2425 100.00 2400 2223 0 0 0.00 0.00 0.00 0.00 61
11 36 2425 100.00 2400 2223 0 0 0.00
12 18 2425 100.00 2400 2223 0 0 0.00 0.00 0.00 0.00 59
12 38 2425 100.00 2400 2223 0 0 0.00
0 1 2403 100.00 2400 2202 0 0 0.00 0.00 0.00 0.00 63 65 40.81 5.71 0.00 0.00
0 21 2397 100.00 2400 2198 0 0 0.00
1 3 2397 100.00 2400 2198 0 0 0.00 0.00 0.00 0.00 63
1 23 2397 100.00 2400 2198 0 0 0.00
2 5 2397 100.00 2400 2198 0 0 0.00 0.00 0.00 0.00 63
2 25 2397 100.00 2400 2198 0 0 0.00
3 7 2397 100.00 2400 2198 0 0 0.00 0.00 0.00 0.00 61
3 27 2397 100.00 2400 2198 0 0 0.00
4 9 2397 100.00 2400 2198 0 0 0.00 0.00 0.00 0.00 62
4 29 2397 100.00 2400 2198 0 0 0.00
8 11 2397 100.00 2400 2197 0 0 0.00 0.00 0.00 0.00 61
8 31 2397 100.00 2400 2197 0 0 0.00
9 13 2397 100.00 2400 2197 0 0 0.00 0.00 0.00 0.00 61
9 33 2397 100.00 2400 2197 0 0 0.00
10 15 2397 100.00 2400 2197 0 0 0.00 0.00 0.00 0.00 62
10 35 2397 100.00 2400 2197 0 0 0.00
11 17 2397 100.00 2400 2197 0 0 0.00 0.00 0.00 0.00 62
11 37 2397 100.00 2400 2197 0 0 0.00
12 19 2397 100.00 2400 2197 0 0 0.00 0.00 0.00 0.00 62
12 39 2397 100.00 2400 2197 0 0 0.00
[root@overcloud-compute-0 ~]# grep -i affi /etc/systemd/system.conf
#CPUAffinity=1 2
CPUAffinity=0 1 20 21
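Long turbostat listings such as the ones above are easier to compare when reduced to the CPUs that still spend time in C1. The following is a minimal sketch, assuming that the output contains a header line starting with Core and a CPU%c1 column, as in the listings above. The column is located by name because the layout can differ between turbostat versions. c1_idle_cpus is a hypothetical helper name.

```shell
# Hypothetical helper: read turbostat output on stdin and print "CPU C1%" for
# every logical CPU with non-zero C1 residency.
c1_idle_cpus() {
    awk '
        # Header line: remember the positions of the CPU and CPU%c1 columns.
        $1 == "Core" {
            for (i = 1; i <= NF; i++) {
                if ($i == "CPU") cpu = i
                if ($i == "CPU%c1") c1 = i
            }
            next
        }
        # Data rows: skip the "-" summary row and rows without C1 residency.
        cpu && c1 && $cpu != "-" && $c1 + 0 > 0 { print $cpu, $c1 }
    '
}
```

It could be used as turbostat sleep 5 2>&1 | c1_idle_cpus; the redirection is there because some turbostat versions may write their table to standard error. With the cpu-partitioning-disable-cstates profile active, the listing above would produce no output.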
This solution is part of Red Hat’s fast-track publication program, providing a huge library of solutions that Red Hat engineers have created while supporting our customers. To give you the knowledge you need the instant it becomes available, these articles may be presented in a raw and unedited form.