Why does the `sar(1)` tool report `%vmeff` values beyond 100 % in RHEL 8 and RHEL 9?
Environment
- Red Hat Enterprise Linux 9
- Red Hat Enterprise Linux 8
- Sysstat
Issue
- The `sar -B` command shipped with RHEL 8 and RHEL 9 reports `%vmeff` values beyond 100 %.
Resolution
- The issue was tracked in private Jira tickets for RHEL 8 (12008) and RHEL 9 (12009). Both tickets are closed because the problem has been documented as a known issue.
- Moreover, the `%vmeff` column is marked as deprecated functionality because it has been removed upstream.
- To work around this problem, calculate the `%vmeff` value from the `/proc/vmstat` file manually. For every time interval, do the following:
  - To get PGSCAN_SUM, add all values with the `pgscan` prefix from the `/proc/vmstat` file.
  - To get PGSTEAL_SUM, add all values with the `pgsteal` prefix from the `/proc/vmstat` file.
  - To get PGSTEAL_DIFF, calculate the difference between the current PGSTEAL_SUM and the PGSTEAL_SUM from the previous time interval.
  - To get PGSCAN_DIFF, calculate the difference between the current PGSCAN_SUM and the PGSCAN_SUM from the previous time interval.
  - If PGSCAN_DIFF equals zero, the corresponding `%vmeff` value is 0 %.
  - Otherwise, divide PGSTEAL_DIFF by PGSCAN_DIFF and multiply the result by 100: (PGSTEAL_DIFF / PGSCAN_DIFF) * 100
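The workaround steps can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not a supported tool: the function names `sum_by_prefix`, `vmeff`, and `read_vmstat` are hypothetical, and the two snapshots are assumed to be the full text of `/proc/vmstat` captured at the start and at the end of one interval.

```python
def sum_by_prefix(vmstat_text, prefix):
    """Add up every counter in /proc/vmstat text whose name starts with prefix."""
    total = 0
    for line in vmstat_text.splitlines():
        fields = line.split()
        # Each /proc/vmstat line is expected to be "<name> <value>".
        if len(fields) == 2 and fields[0].startswith(prefix):
            total += int(fields[1])
    return total


def vmeff(previous_text, current_text):
    """Return %vmeff for the interval between two /proc/vmstat snapshots."""
    pgscan_diff = (sum_by_prefix(current_text, "pgscan")
                   - sum_by_prefix(previous_text, "pgscan"))
    pgsteal_diff = (sum_by_prefix(current_text, "pgsteal")
                    - sum_by_prefix(previous_text, "pgsteal"))
    if pgscan_diff == 0:
        return 0.0  # nothing was scanned in this interval
    return pgsteal_diff / pgscan_diff * 100


def read_vmstat(path="/proc/vmstat"):
    """Hypothetical helper: return the current contents of /proc/vmstat."""
    with open(path) as f:
        return f.read()
```

To use the sketch on a live system, call `read_vmstat()` twice, one interval apart, and pass both strings to `vmeff()`.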
Root Cause
- The `sysstat` package provides the `%vmeff` metric to measure page reclaim efficiency. The values of the `%vmeff` column returned by the `sar -B` command are incorrect because `sysstat` does not parse all relevant `/proc/vmstat` values provided by later kernel versions.
- `sysstat` parses the `/proc/vmstat` file and sums all values with the `pgscan_kswapd` and `pgscan_direct` prefixes, which correspond to the `pgscank/s` and `pgscand/s` columns produced by `sar`, respectively. However, this file contains different `pgscan` fields on RHEL 7 than on RHEL 8 and RHEL 9:
RHEL 7:
# grep pgscan /proc/vmstat | cut -d' ' -f1
pgscan_kswapd_dma
pgscan_kswapd_dma32
pgscan_kswapd_normal
pgscan_kswapd_movable
pgscan_direct_dma
pgscan_direct_dma32
pgscan_direct_normal
pgscan_direct_movable
pgscan_direct_throttle
RHEL 8 and RHEL 9:
# grep pgscan /proc/vmstat | cut -d' ' -f1
pgscan_kswapd
pgscan_direct
pgscan_direct_throttle
pgscan_anon
pgscan_file
- The `pgscan_anon` and `pgscan_file` fields on later kernel versions are ignored by `sysstat`, which is why the number of stolen pages may be higher than the number of scanned pages. Consequently, the `%vmeff` column shows values beyond 100 %.
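To make the mismatch concrete, the following minimal sketch takes the RHEL 8 and RHEL 9 field names listed above and separates them by whether they match the two prefixes that, according to the description above, `sysstat` sums. The names `SYSSTAT_PREFIXES` and `split_fields` are hypothetical and exist only for this illustration; this is not `sysstat` code, and plain prefix matching is an assumption.

```python
# Field names printed by `grep pgscan /proc/vmstat` on RHEL 8 and RHEL 9.
RHEL8_9_FIELDS = [
    "pgscan_kswapd",
    "pgscan_direct",
    "pgscan_direct_throttle",
    "pgscan_anon",
    "pgscan_file",
]

# Prefixes summed by sysstat as described in this article
# (assumption: simple prefix matching on the field name).
SYSSTAT_PREFIXES = ("pgscan_kswapd", "pgscan_direct")


def split_fields(fields, prefixes):
    """Split field names into those matching any prefix and those that do not."""
    counted = [f for f in fields if f.startswith(prefixes)]
    ignored = [f for f in fields if not f.startswith(prefixes)]
    return counted, ignored


counted, ignored = split_fields(RHEL8_9_FIELDS, SYSSTAT_PREFIXES)
print("counted:", counted)
print("ignored:", ignored)
```

Under these assumptions, `pgscan_anon` and `pgscan_file` end up in the ignored list, which matches the behavior described above.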
Diagnostic Steps
- Observe the values reported by the `sar -B` command in the `%vmeff` column:
# sar -B 1
Linux 5.14.0-162.18.1.el9_1.x86_64 (9rhel) 08/07/2023 _x86_64_ (1 CPU)
03:15:43 PM pgpgin/s pgpgout/s fault/s majflt/s pgfree/s pgscank/s pgscand/s pgsteal/s %vmeff
[--snip--]
03:16:35 PM 170152.00 333524.00 96594.00 5925.00 106470.00 191054.00 326.00 199602.00 104.30 <<------
03:16:36 PM 296388.12 270954.46 80915.84 9732.67 84606.93 149256.44 0.00 157851.49 105.76 <<------
[--snip--]
03:18:43 PM 14520.00 0.00 113780.00 654.00 106366.00 0.00 8.00 16.00 200.00 <<------
03:18:44 PM 916.00 478216.00 146704.00 16.00 120566.00 308362.00 148410.00 219812.00 48.12
03:18:45 PM 32.00 379616.00 91711.00 3.00 100174.00 141552.00 45921.00 183160.00 97.70
03:18:46 PM 18028.00 257964.00 72783.00 619.00 83494.00 266040.00 93165.00 153318.00 42.68
03:18:47 PM 19960.00 189264.00 44060.00 772.00 52249.00 79897.00 47089.00 91628.00 72.16
03:18:48 PM 48648.00 298012.00 47504.00 1863.00 59870.00 195063.00 105508.00 105108.00 34.97
03:18:49 PM 276904.00 186408.00 75258.00 9228.00 77581.00 118321.00 0.00 143212.00 121.04 <<------
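The flagged rows can be cross-checked by hand. The `sar(1)` manual page describes `%vmeff` as pgsteal divided by pgscan; the sketch below assumes that pgscan means `pgscank/s` plus `pgscand/s`, and recomputes the value for three rows copied from the output above. The function name `vmeff_from_sar` is hypothetical.

```python
def vmeff_from_sar(pgscank, pgscand, pgsteal):
    """Recompute %vmeff from the per-second columns printed by sar -B."""
    pgscan = pgscank + pgscand
    if pgscan == 0:
        return 0.0
    return pgsteal / pgscan * 100


# (timestamp, pgscank/s, pgscand/s, pgsteal/s, %vmeff as printed by sar)
rows = [
    ("03:16:35 PM", 191054.00, 326.00, 199602.00, 104.30),
    ("03:18:43 PM", 0.00, 8.00, 16.00, 200.00),
    ("03:18:44 PM", 308362.00, 148410.00, 219812.00, 48.12),
]

for stamp, pgscank, pgscand, pgsteal, reported in rows:
    computed = vmeff_from_sar(pgscank, pgscand, pgsteal)
    print(f"{stamp}  reported={reported:.2f}  computed={computed:.2f}")
```

For these three rows the recomputed value agrees with the reported one to two decimal places, so the values above 100 % come from the counters that `sar` reads rather than from its arithmetic.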