Azure Disk performance by region

Solution Unverified - Updated

Environment

  • Red Hat OpenShift Container Platform (RHOCP)
    • 4.6, 4.7
  • Azure Platform

Issue

  • Azure Disk performance is known to vary by region.
  • Azure Disk performance issues manifest themselves as etcd performance issues such as leader election changes, high fsync latency durations, and API server failures.
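The symptoms above can be checked against the cluster's Prometheus metrics. As a sketch, this query counts etcd leader elections over the past hour; on a stable cluster the result should stay near zero:

```promql
# etcd leader changes seen in the last hour; a healthy cluster stays near 0
increase(etcd_server_leader_changes_seen_total[1h])
```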

Resolution

Choosing a region with better performance will result in better cluster stability. Refer to etcd backend performance requirements for OpenShift for additional information about the etcd requirements.

The Azure platform is currently working on reducing storage latency by rolling out updates that improve storage performance.

Microsoft region storage upgrade status:

Completed:   US West, US West Central, France Central, Canada Central, Brazil South, Asia Southeast (among others), and both Norway regions
In Progress: North Europe (~30% complete)

Root Cause

Azure Disk latency is high in some regions. etcd requires that the 99th percentile of its fsync latency be under 10 milliseconds.

Here is a breakdown of fsync times (in milliseconds) from a simple test with the fio command-line tool:

Region              Fsync 99   Fsync 99.99   Fsync Avg   Fsync Max
uaenorth            5.59       20.88         2.46        37.71
eastasia            6.42       18.58         2.60        44.49
southeastasia       6.75       16.75         2.48        24.32
canadacentral       6.78       26.75         3.09        36.58
norwayeast          7.00       22.18         2.91        29.20
koreasouth          7.07       21.99         3.24        31.38
southindia          7.25       20.07         3.10        31.27
japaneast           7.32       22.65         3.02        30.29
ukwest              7.41       21.45         3.03        29.58
switzerlandnorth    7.50       19.77         2.60        24.81
northcentralus      7.55       20.94         3.07        28.68
westindia           7.56       18.40         2.94        24.34
japanwest           7.63       20.86         3.27        40.93
centralindia        7.91       19.42         3.12        26.21
australiacentral    7.95       19.62         2.83        22.99
germanywestcentral  8.01       25.64         4.16        32.85
southcentralus      8.15       20.76         3.19        27.72
westcentralus       8.40       17.27         2.32        27.95
westus2             8.48       20.15         3.06        28.23
westeurope          8.78       22.65         3.21        28.30
southafricanorth    8.92       23.45         4.33        32.06
brazilsouth         9.21       31.23         3.26        42.38
koreacentral        9.32       24.96         4.25        54.67
canadaeast          9.34       23.70         3.29        49.34
uksouth             9.55       22.08         3.04        26.22
northeurope         9.80       25.31         3.46        38.60
australiaeast       9.83       28.70         4.07        56.37
eastus              10.49      25.46         3.48        37.32
francecentral       10.62      24.59         3.32        34.76
westus              11.45      27.22         4.19        34.92
centralus           15.18      59.21         5.14        112.97

Important: The fio test is a short test executed at a specific moment. It can show whether the disk is fast enough to meet the etcd requirements, but other load on the disk could still cause etcd to misbehave. Also review the etcd metrics to understand the actual etcd behavior, as shown in How to graph etcd metrics using Prometheus to gauge Etcd performance in OpenShift.
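As a sketch of that metrics-based check, the following Prometheus query reports the 99th percentile of etcd's WAL fsync duration in seconds; it should stay below 0.01 (10 ms) to meet the etcd requirement:

```promql
# 99th percentile WAL fsync latency over 5m windows, in seconds
histogram_quantile(0.99, rate(etcd_disk_wal_fsync_duration_seconds_bucket[5m]))
```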

Diagnostic Steps

Disclaimer: Links contained herein to external website(s) are provided for convenience only. Red Hat has not reviewed the links and is not responsible for the content or its availability. The inclusion of any link to an external website does not imply endorsement by Red Hat of the website or their entities, products or services. You agree that Red Hat is not responsible or liable for any loss or expenses that may result due to your use of (or reliance on) the external site or content.

  1. Spin up an OpenShift cluster with default configuration:
Standard_D8s_v3 (8 vcpus, 32 GiB memory)
Data disk: 1 TB Premium SSD (P30) 5000 IOPS / 200 MB/s
Caching set to `ReadOnly`.
  2. Perform the following command on the cluster:
oc debug node/$NODE --as-root=true --image ljishen/fio -- sh -c '/bin/rm -f /host/var/lib/etcd/etcd && /usr/local/bin/fio --rw=write --fdatasync=1 --size=22m --bs=2300 --name=etcd1 --ioengine=sync --directory=/host/var/lib/etcd/ --filename=etcd'
  3. Look at the results and find the section labeled fsync/fdatasync/sync_file_range. This section includes a text version of a histogram. The values represent the percentage of completed synchronizations or flushes of file data to the storage device (see man fsync for more information). The values in this section are in microseconds (usec) and can be converted to milliseconds by dividing by 1000.
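As a quick worked example of that conversion, the 99.00th percentile reported in the westus output further below is 10945 usec; dividing by 1000 gives the value in milliseconds:

```shell
# fio reports sync latencies in microseconds; divide by 1000 for milliseconds
awk 'BEGIN { printf "%.3f ms\n", 10945 / 1000 }'
# prints "10.945 ms"
```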

In the example below, that section shows a 99.00th percentile of requests completing in 10.945 ms. This latency is higher than what the etcd documentation considers acceptable. Refer to Using Fio to Tell Whether Your Storage is Fast Enough for Etcd and etcd backend performance requirements for OpenShift for additional information.

This is example output from the westus region in Azure:

fio-3.6
Starting 1 process
etcd1: Laying out IO file (1 file / 22MiB)

etcd1: (groupid=0, jobs=1): err= 0: pid=48993: Mon Mar 22 16:21:00 2021
  write: IOPS=243, BW=546KiB/s (559kB/s)(21.0MiB/41246msec)
    clat (usec): min=5, max=943, avg=14.31, stdev=11.92
     lat (usec): min=6, max=944, avg=15.51, stdev=12.05
    clat percentiles (usec):
     |  1.00th=[    8],  5.00th=[   10], 10.00th=[   11], 20.00th=[   11],
     | 30.00th=[   12], 40.00th=[   13], 50.00th=[   13], 60.00th=[   14],
     | 70.00th=[   16], 80.00th=[   17], 90.00th=[   19], 95.00th=[   23],
     | 99.00th=[   39], 99.50th=[   46], 99.90th=[   86], 99.95th=[  108],
     | 99.99th=[  445]
   bw (  KiB/s): min=  422, max=  615, per=99.90%, avg=545.45, stdev=39.37, samples=82
   iops        : min=  188, max=  274, avg=243.01, stdev=17.55, samples=82
  lat (usec)   : 10=9.84%, 20=83.23%, 50=6.59%, 100=0.27%, 250=0.05%
  lat (usec)   : 500=0.01%, 1000=0.01%
  fsync/fdatasync/sync_file_range:
    sync (usec): min=1240, max=23698, avg=4088.49, stdev=2321.18
    sync percentiles (usec):
     |  1.00th=[ 1385],  5.00th=[ 1500], 10.00th=[ 1565], 20.00th=[ 1680],
     | 30.00th=[ 1860], 40.00th=[ 2999], 50.00th=[ 4555], 60.00th=[ 4883],
     | 70.00th=[ 5145], 80.00th=[ 5669], 90.00th=[ 6783], 95.00th=[ 7963],
     | 99.00th=[10945], 99.50th=[12387], 99.90th=[15533], 99.95th=[19006],
     | 99.99th=[22414]
  cpu          : usr=0.51%, sys=2.07%, ctx=32364, majf=0, minf=12
  IO depths    : 1=200.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=0,10029,0,0 short=10029,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):
  WRITE: bw=546KiB/s (559kB/s), 546KiB/s-546KiB/s (559kB/s-559kB/s), io=21.0MiB (23.1MB), run=41246-41246msec

Disk stats (read/write):
    dm-0: ios=0/22754, merge=0/0, ticks=0/61604, in_queue=61604, util=55.19%, aggrios=0/22693, aggrmerge=0/118, aggrticks=0/60456, aggrin_queue=47888, aggrutil=55.13%
  sda: ios=0/22693, merge=0/118, ticks=0/60456, in_queue=47888, util=55.13%

This solution is part of Red Hat’s fast-track publication program, providing a huge library of solutions that Red Hat engineers have created while supporting our customers. To give you the knowledge you need the instant it becomes available, these articles may be presented in a raw and unedited form.