How to benchmark a gfs2 filesystem?
Introduction
This article explains how to benchmark a gfs2 filesystem on RHEL 6 and RHEL 7 with a utility called `create_unlink_perf`.
Environment
- Red Hat Enterprise Linux Server 6 (with the High Availability and Resilient Storage Add-Ons)
- Red Hat Enterprise Linux Server 7 (with the High Availability and Resilient Storage Add-Ons)
- A Global File System 2 (gfs2) filesystem
How to benchmark a gfs2 filesystem with create_unlink_perf
A benchmark test program has been created to help simulate some of the operations that will occur on a gfs2 filesystem. The benchmark test program is called `create_unlink_perf` and there are pre-built binaries:
- RHEL 6 create_unlink_perf
- RHEL 7 create_unlink_perf
NOTE: The benchmark test program create_unlink_perf is not officially supported by Red Hat. It is provided and documented as a convenience for diagnostic purposes, but it carries no guarantees about its behavior and is to be used at the administrator's own risk.
The test is designed to factor out multiple-node versus single-node effects. The program flushes the caches on all nodes in the cluster between each test. The test is run from a single cluster node, which uses ssh to run the same create_unlink_perf script on each of the cluster nodes specified. Existing gfs2 filesystems that are already mounted, including the one being benchmarked, do not have to be unmounted while the benchmark test program runs.
The benchmark test program create_unlink_perf will increase the load on the cluster nodes, which could cause a negative performance impact until the program finishes running.
What we're testing and why
- Serial tests: The serial tests are done to see if each node in the cluster performs the same as the others. If some nodes are significantly slower, it may indicate hardware problems, such as a faulty port on the Ethernet switch, a faulty port on a Fibre Channel switch, a faulty Ethernet cable, a faulty Fibre Channel cable, or a faulty host bus adapter (HBA). Even the motherboard or memory on the slower nodes are open to suspicion. If all nodes perform slowly, it could indicate a central point of contention: a slow or faulty Ethernet switch, a slow storage array, or a slow device configuration.
- Parallel tests: The parallel tests are done to see if tests run in parallel are comparable to the serial tests. In many cases, the tests may run slower, due to lock contention issues and the fact that the storage array is doing more work simultaneously. If the tests run faster, it could indicate network routing issues, since more packets are sent through the network, thus allowing more throughput. (Spending too much time waiting for network packets to fill up before sending them, etc.)
- Unlink tests: These are intended to test whether file create operations are interfering with unlink operations. A lot of these variations go back to testing contention on GFS2 resource group glocks. The creates will establish resource group locks on different nodes. Later, the unlinks will take advantage of those locks. In a way, this tests GFS2 lock caching and trading the locks through the Distributed Lock Manager (DLM).
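The create and unlink timing that the tool reports can be illustrated with a minimal sketch. This is not the actual `create_unlink_perf` tool, just a single-node illustration of timing file creates and unlinks in a scratch directory and reporting microseconds per file, similar in spirit to the "Create" and "Unlink" tests:

```shell
#!/bin/bash
# Minimal single-node sketch of timing creates and unlinks
# (illustration only, not the create_unlink_perf tool itself).
dir=$(mktemp -d)
count=100

start=$(date +%s%N)                     # nanoseconds since the epoch
for i in $(seq 1 $count); do
    : > "$dir/file_$i"                  # create an empty file
done
end=$(date +%s%N)
create_us=$(( (end - start) / 1000 ))   # total microseconds for creates

start=$(date +%s%N)
for i in $(seq 1 $count); do
    rm "$dir/file_$i"                   # unlink the file
done
end=$(date +%s%N)
unlink_us=$(( (end - start) / 1000 ))   # total microseconds for unlinks

echo "Create: $((create_us / count)) us/file"
echo "Unlink: $((unlink_us / count)) us/file"
rmdir "$dir"
```

The real tool runs this kind of loop with 50000 files per worker, on every node, in the serial, parallel, and mixed create/unlink patterns described above.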
How long does it take to run create_unlink_perf?
The benchmark test program `create_unlink_perf` usually takes around 15 to 30 minutes to complete, but this can vary based on hardware, software, and the load on the cluster nodes.
Requirements for create_unlink_perf
- The benchmark test program requires that port `62001/tcp` is open.
- The benchmark test program should be located in `~/bin` on all cluster nodes that will be running the script. The cluster nodes that will run the script are passed as options to the script.
- The benchmark test program requires that the gfs2 filesystem has at least 50GB of free space.
- The benchmark test program requires that the `root` user can log in with ssh keys to each cluster node. The `root` user is required because the cache is dropped between each test, which can only be done by the `root` user.
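The requirements above can be checked up front. A rough pre-flight sketch is shown below; the node names are examples, and `mnt` is a placeholder that should point at the gfs2 mount point (shown here as `/tmp` purely so the snippet runs anywhere):

```shell
#!/bin/bash
# Pre-flight checks before running create_unlink_perf (sketch; node
# names are examples, and mnt is a placeholder for your gfs2 mount).
nodes="rhel7-1.examplerh.com rhel7-2.examplerh.com rhel7-3.examplerh.com"
mnt=/tmp   # replace with your gfs2 mount point, e.g. /mnt/gfs2

# At least 50GB must be free on the gfs2 filesystem (df -BG prints GB).
free_gb=$(df -BG --output=avail "$mnt" | tail -1 | tr -dc '0-9')
[ "$free_gb" -ge 50 ] || echo "WARNING: less than 50GB free on $mnt"

for node in $nodes; do
    # Passwordless root ssh must work (BatchMode fails instead of prompting).
    ssh -o BatchMode=yes -o ConnectTimeout=5 root@"$node" true \
        || echo "WARNING: root ssh key login failed for $node"
    # The binary must be present and executable in ~/bin on every node.
    ssh -o BatchMode=yes -o ConnectTimeout=5 root@"$node" \
        'test -x ~/bin/create_unlink_perf' \
        || echo "WARNING: ~/bin/create_unlink_perf missing on $node"
done
```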
Recommendations when running create_unlink_perf
- If possible, create a new gfs2 filesystem on the same storage array as the other gfs2 filesystems. If you run the program on a gfs2 filesystem from a different storage array, the results could be skewed (better or worse). A fresh or new gfs2 filesystem should be used because an existing gfs2 filesystem will contain some (or a lot of) fragmentation, which will impact the results of the script.
- The benchmark test program should be run when the load is low on all nodes in the cluster.
- The cache does not need to be flushed before the benchmark test program is run, because the script does that before starting each test.
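For reference, the cache flush that the script performs between tests is the standard Linux mechanism of syncing dirty pages and then dropping the page cache, dentries, and inodes, which is why `root` access is required. A sketch of the equivalent manual commands:

```shell
#!/bin/bash
# What the benchmark does between tests (sketch; the script handles
# this automatically, so you normally never run this by hand).
sync                                      # write dirty pages to storage
if [ "$(id -u)" -eq 0 ]; then
    echo 3 > /proc/sys/vm/drop_caches     # drop pagecache, dentries, inodes
else
    echo "drop_caches requires root" >&2  # only root can write this file
fi
```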
Example command of running the script
This command is run from a single cluster node and performs the tests on the 3 cluster nodes that have the gfs2 filesystem mounted at `/mnt/vg_shared_2-lvol1`.
# ~/bin/create_unlink_perf -m /mnt/vg_shared_2-lvol1 rhel6-1.examplerh.com rhel6-2.examplerh.com rhel6-3.examplerh.com
Shell brace expansion can be used for the cluster node names.
# ~/bin/create_unlink_perf -m /mnt/vg_shared_2-lvol1 rhel6-{1..3}.examplerh.com
Before running the benchmark test program create_unlink_perf, make sure that any running create_unlink_perf processes are killed before starting a new test.
# for hostname in rhel7-1.examplerh.com rhel7-2.examplerh.com rhel7-3.examplerh.com; do ssh $hostname "killall create_unlink_perf"; done
# for hostname in rhel7-{1..3}.examplerh.com; do ssh $hostname "killall create_unlink_perf"; done
Example usage and output
Here is an example of running the tool and the output it provides:
# ~/bin/create_unlink_perf -i2 -m /mnt/gfs2 gfs-02-lp0{1..5}
GFS2 mount point: /mnt/gfs2
Testing with 5 nodes, 1 workers per node:
gfs-02-lp01 10.16.145.43
gfs-02-lp02 10.16.145.44
gfs-02-lp03 10.16.145.45
gfs-02-lp04 10.16.145.46
gfs-02-lp05 10.16.145.47
Test It Node name Work TestType Microsecs Files us/fil files/sec
---- -- -------------- ---- -------- ---------- ------ ------ ---------
se 1 gfs-p8-02-lp01 1 Create 11565202 50000 231 4329
se 1 gfs-p8-02-lp02 1 Create 11234168 50000 224 4464
se 1 gfs-p8-02-lp03 1 Create 12841272 50000 256 3906
se 1 gfs-p8-02-lp04 1 Create 12726421 50000 254 3937
se 1 gfs-p8-02-lp05 1 Create 13049616 50000 260 3846
se 1 gfs-p8-02-lp01 1 Write 27094287 2500 10837 92
se 1 gfs-p8-02-lp02 1 Write 28172852 2500 11269 88
se 1 gfs-p8-02-lp03 1 Write 27351127 2500 10940 91
se 1 gfs-p8-02-lp04 1 Write 27770857 2500 11108 90
se 1 gfs-p8-02-lp05 1 Write 27263500 2500 10905 91
se 1 gfs-p8-02-lp01 1 Read 29497625 2500 11799 84
se 1 gfs-p8-02-lp02 1 Read 28024397 2500 11209 89
se 1 gfs-p8-02-lp03 1 Read 21175343 2500 8470 118
se 1 gfs-p8-02-lp04 1 Read 21256996 2500 8502 117
se 1 gfs-p8-02-lp05 1 Read 21146432 2500 8458 118
se 1 gfs-p8-02-lp01 1 Unlink 21151534 50000 423 2364
se 1 gfs-p8-02-lp02 1 Unlink 29758143 50000 595 1680
se 1 gfs-p8-02-lp03 1 Unlink 23810108 50000 476 2100
se 1 gfs-p8-02-lp04 1 Unlink 23022839 50000 460 2173
se 1 gfs-p8-02-lp05 1 Unlink 23175225 50000 463 2159
se 2 gfs-p8-02-lp01 1 Create 11403309 50000 228 4385
se 2 gfs-p8-02-lp02 1 Create 11777785 50000 235 4255
se 2 gfs-p8-02-lp03 1 Create 12876572 50000 257 3891
se 2 gfs-p8-02-lp04 1 Create 11355958 50000 227 4405
se 2 gfs-p8-02-lp05 1 Create 12924520 50000 258 3875
se 2 gfs-p8-02-lp01 1 Write 25268280 2500 10107 98
se 2 gfs-p8-02-lp02 1 Write 25227021 2500 10090 99
se 2 gfs-p8-02-lp03 1 Write 25633473 2500 10253 97
se 2 gfs-p8-02-lp04 1 Write 25169425 2500 10067 99
se 2 gfs-p8-02-lp05 1 Write 25548858 2500 10219 97
se 2 gfs-p8-02-lp01 1 Read 36096831 2500 14438 69
se 2 gfs-p8-02-lp02 1 Read 36865673 2500 14746 67
se 2 gfs-p8-02-lp03 1 Read 25688770 2500 10275 97
se 2 gfs-p8-02-lp04 1 Read 26492917 2500 10597 94
se 2 gfs-p8-02-lp05 1 Read 28204852 2500 11281 88
se 2 gfs-p8-02-lp01 1 Unlink 23923278 50000 478 2092
se 2 gfs-p8-02-lp02 1 Unlink 24234492 50000 484 2066
se 2 gfs-p8-02-lp03 1 Unlink 27992758 50000 559 1788
se 2 gfs-p8-02-lp04 1 Unlink 24126760 50000 482 2074
se 2 gfs-p8-02-lp05 1 Unlink 25289334 50000 505 1980
pa 1 gfs-p8-02-lp05 1 Create 8293381 50000 165 6060
pa 1 gfs-p8-02-lp01 1 Create 9284993 50000 185 5405
pa 1 gfs-p8-02-lp04 1 Create 9343951 50000 186 5376
pa 1 gfs-p8-02-lp02 1 Create 9489407 50000 189 5291
pa 1 gfs-p8-02-lp03 1 Create 10442297 50000 208 4807
pa 1 gfs-p8-02-lp03 1 Write 42931324 2500 17172 58
pa 1 gfs-p8-02-lp01 1 Write 44259051 2500 17703 56
pa 1 gfs-p8-02-lp02 1 Write 46674285 2500 18669 53
pa 1 gfs-p8-02-lp04 1 Write 51370666 2500 20548 48
pa 1 gfs-p8-02-lp05 1 Write 55239160 2500 22095 45
pa 1 gfs-p8-02-lp01 1 Read 44368335 2500 17747 56
pa 1 gfs-p8-02-lp03 1 Read 55518490 2500 22207 45
pa 1 gfs-p8-02-lp04 1 Read 99670401 2500 39868 25
pa 1 gfs-p8-02-lp05 1 Read 100495228 2500 40198 24
pa 1 gfs-p8-02-lp02 1 Read 102270637 2500 40908 24
pa 1 gfs-p8-02-lp03 1 Unlink 25632895 50000 512 1953
pa 1 gfs-p8-02-lp01 1 Unlink 26784749 50000 535 1869
pa 1 gfs-p8-02-lp05 1 Unlink 27361223 50000 547 1828
pa 1 gfs-p8-02-lp04 1 Unlink 28307227 50000 566 1766
pa 1 gfs-p8-02-lp02 1 Unlink 28455788 50000 569 1757
pa 2 gfs-p8-02-lp04 1 Create 8373193 50000 167 5988
pa 2 gfs-p8-02-lp05 1 Create 9188935 50000 183 5464
pa 2 gfs-p8-02-lp02 1 Create 9291248 50000 185 5405
pa 2 gfs-p8-02-lp03 1 Create 9519569 50000 190 5263
pa 2 gfs-p8-02-lp01 1 Create 10273337 50000 205 4878
pa 2 gfs-p8-02-lp03 1 Write 42797228 2500 17118 58
pa 2 gfs-p8-02-lp01 1 Write 43268962 2500 17307 57
pa 2 gfs-p8-02-lp05 1 Write 43749028 2500 17499 57
pa 2 gfs-p8-02-lp04 1 Write 46942419 2500 18776 53
pa 2 gfs-p8-02-lp02 1 Write 47275792 2500 18910 52
pa 2 gfs-p8-02-lp01 1 Read 39041065 2500 15616 64
pa 2 gfs-p8-02-lp03 1 Read 44389425 2500 17755 56
pa 2 gfs-p8-02-lp04 1 Read 67968040 2500 27187 36
pa 2 gfs-p8-02-lp05 1 Read 68962362 2500 27584 36
pa 2 gfs-p8-02-lp02 1 Read 73508875 2500 29403 34
pa 2 gfs-p8-02-lp04 1 Unlink 24611932 50000 492 2032
pa 2 gfs-p8-02-lp03 1 Unlink 26818575 50000 536 1865
pa 2 gfs-p8-02-lp05 1 Unlink 27159899 50000 543 1841
pa 2 gfs-p8-02-lp01 1 Unlink 27888112 50000 557 1795
pa 2 gfs-p8-02-lp02 1 Unlink 30131531 50000 602 1661
ua 1 gfs-p8-02-lp01 1 Create 7169713 50000 143 6993
ua 1 gfs-p8-02-lp01 1 Unlink 7967265 50000 159 6289
ua 1 gfs-p8-02-lp02 1 Create 8587264 50000 171 5847
ua 1 gfs-p8-02-lp02 1 Unlink 6588287 50000 131 7633
ua 1 gfs-p8-02-lp03 1 Create 9240521 50000 184 5434
ua 1 gfs-p8-02-lp03 1 Unlink 7163597 50000 143 6993
ua 1 gfs-p8-02-lp04 1 Create 10027705 50000 200 5000
ua 1 gfs-p8-02-lp05 1 Create 9880836 50000 197 5076
ua 1 gfs-p8-02-lp04 1 Unlink 10486450 50000 209 4784
ua 1 gfs-p8-02-lp05 1 Unlink 4892826 50000 97 10309
ua 2 gfs-p8-02-lp01 1 Create 8630539 50000 172 5813
ua 2 gfs-p8-02-lp01 1 Unlink 8831908 50000 176 5681
ua 2 gfs-p8-02-lp02 1 Create 10256548 50000 205 4878
ua 2 gfs-p8-02-lp02 1 Unlink 10888563 50000 217 4608
ua 2 gfs-p8-02-lp03 1 Create 11937661 50000 238 4201
ua 2 gfs-p8-02-lp03 1 Unlink 6911705 50000 138 7246
ua 2 gfs-p8-02-lp04 1 Create 10022115 50000 200 5000
ua 2 gfs-p8-02-lp04 1 Unlink 7631607 50000 152 6578
ua 2 gfs-p8-02-lp05 1 Create 10863615 50000 217 4608
ua 2 gfs-p8-02-lp05 1 Unlink 4361014 50000 87 11494
ub 1 gfs-p8-02-lp01 1 Create 8520686 50000 170 5882
ub 1 gfs-p8-02-lp02 1 Create 8639694 50000 172 5813
ub 1 gfs-p8-02-lp01 1 Unlink 8305460 50000 166 6024
ub 1 gfs-p8-02-lp03 1 Create 10962108 50000 219 4566
ub 1 gfs-p8-02-lp02 1 Unlink 4140089 50000 82 12195
ub 1 gfs-p8-02-lp03 1 Unlink 7177547 50000 143 6993
ub 1 gfs-p8-02-lp04 1 Create 9129388 50000 182 5494
ub 1 gfs-p8-02-lp05 1 Create 10012873 50000 200 5000
ub 1 gfs-p8-02-lp04 1 Unlink 11063866 50000 221 4524
ub 1 gfs-p8-02-lp05 1 Unlink 5131491 50000 102 9803
ub 2 gfs-p8-02-lp01 1 Create 8058275 50000 161 6211
ub 2 gfs-p8-02-lp02 1 Create 8372573 50000 167 5988
ub 2 gfs-p8-02-lp01 1 Unlink 10874232 50000 217 4608
ub 2 gfs-p8-02-lp03 1 Create 11012512 50000 220 4545
ub 2 gfs-p8-02-lp03 1 Unlink 6883298 50000 137 7299
ub 2 gfs-p8-02-lp04 1 Create 9706401 50000 194 5154
ub 2 gfs-p8-02-lp02 1 Unlink 11311964 50000 226 4424
ub 2 gfs-p8-02-lp05 1 Create 10024465 50000 200 5000
ub 2 gfs-p8-02-lp04 1 Unlink 10997509 50000 219 4566
ub 2 gfs-p8-02-lp05 1 Unlink 5953213 50000 119 8403
uc 1 gfs-p8-02-lp01 1 Create 7511806 50000 150 6666
uc 1 gfs-p8-02-lp01 1 Unlink 10492779 50000 209 4784
uc 1 gfs-p8-02-lp02 1 Create 8351852 50000 167 5988
uc 1 gfs-p8-02-lp03 1 Create 9135910 50000 182 5494
uc 1 gfs-p8-02-lp03 1 Unlink 6935930 50000 138 7246
uc 1 gfs-p8-02-lp02 1 Unlink 8630732 50000 172 5813
uc 1 gfs-p8-02-lp04 1 Create 9593579 50000 191 5235
uc 1 gfs-p8-02-lp04 1 Unlink 9115835 50000 182 5494
uc 1 gfs-p8-02-lp05 1 Create 9863805 50000 197 5076
uc 1 gfs-p8-02-lp05 1 Unlink 6088671 50000 121 8264
uc 2 gfs-p8-02-lp01 1 Create 7743901 50000 154 6493
uc 2 gfs-p8-02-lp01 1 Unlink 6723795 50000 134 7462
uc 2 gfs-p8-02-lp02 1 Create 8437675 50000 168 5952
uc 2 gfs-p8-02-lp03 1 Create 8975957 50000 179 5586
uc 2 gfs-p8-02-lp03 1 Unlink 6954982 50000 139 7194
uc 2 gfs-p8-02-lp02 1 Unlink 9213670 50000 184 5434
uc 2 gfs-p8-02-lp04 1 Create 9770116 50000 195 5128
uc 2 gfs-p8-02-lp04 1 Unlink 7973829 50000 159 6289
uc 2 gfs-p8-02-lp05 1 Create 9488781 50000 189 5291
uc 2 gfs-p8-02-lp05 1 Unlink 5811170 50000 116 8620
Done.
What does each column mean in the output?
- Column 1 (Test): This designates how the test was run by the nodes.
- se: This test was done serially; each node ran it one at a time.
- pa: This test was done in parallel on all five nodes simultaneously.
- ua: Unlink test a where one node creates files while another node unlinks files.
- ub: Unlink test b where each node creates files, then all nodes unlink files simultaneously.
- uc: Unlink test c where each node creates a file, unlinks a file, creates a file, unlinks a file, serially.
- Column 2 (It): The iteration number of the tests. Often the tests are run twice (or more) to reduce bias caused by first-run caching by the storage array.
- Column 3 (node name): This indicates which node in the cluster produced which results.
- Column 4 (Work): This indicates which worker thread reported the results. The test may run several worker threads at the same time in order to test concurrency of results.
- Column 5 (Test Type): This is the type of test that was run:
- Create: File creates were tested.
- Write: File writes were tested.
- Read: File reads were tested.
- Unlink: File unlinks were tested.
- Column 6 (Microsecs): The amount of time it took to run the test, in microseconds.
- Column 7 (Files): The number of files created, written, read, or unlinked for the test.
- Column 8 (us/fil): The average number of microseconds it took to perform the create, write, etc. per file.
- Column 9 (files/sec): The average number of files created, written, read, or unlinked per second during the test (calculated throughput).
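The last two columns can be derived from the first two timing columns. Judging from the sample output above, `us/fil` appears to be integer division of Microsecs by Files, and `files/sec` appears to be 1,000,000 divided by the rounded `us/fil` value. A quick check against the first "se" Create row (11565202 us for 50000 files):

```shell
#!/bin/bash
# Reproduce the derived columns for the first serial Create row above.
microsecs=11565202
files=50000

us_per_file=$(( microsecs / files ))        # column "us/fil"
files_per_sec=$(( 1000000 / us_per_file ))  # column "files/sec"

echo "us/fil:    $us_per_file"              # 231, matching the output
echo "files/sec: $files_per_sec"            # 4329, matching the output
```

The same arithmetic reproduces the other rows in the sample output (e.g. 11234168 us / 50000 files gives 224 us/fil and 4464 files/sec).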
Collecting supplemental data
It is safe to collect supplemental data with the following tools: [`glocktop`](/articles/666533) and [`ha-resourcemon.sh`](/solutions/368393). In some situations more data is needed in order to figure out what was occurring while the benchmark test program `create_unlink_perf` was running.
If you need Red Hat to review the results, copy and paste the output (including command) to the ticket.
Known Issues
Is create_unlink_perf running?
If the command fails to run, make sure there are no running `create_unlink_perf` processes on any cluster node. In some instances the following message is printed:
Bind: Address already in use
If there are running processes then kill them before starting a new test.
# for hostname in rhel7-{1..3}.examplerh.com; do ssh $hostname "echo $hostname; ps aux | grep create_unlink_perf | grep -v grep"; echo; done
# for hostname in rhel7-{1..3}.examplerh.com; do ssh $hostname "killall create_unlink_perf"; done
Is the network port used by create_unlink_perf open on all cluster nodes?
If the following error message occurs, then open port `62001/tcp` on the cluster node:
Unable to connect to node rhel7-2.examplerh.com,worker 1
connect: No route to host
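On RHEL 7 with firewalld, the port can be opened with the commands below; this is a sketch, and iptables-based setups (common on RHEL 6) will need the equivalent iptables rule instead:

```shell
# Open the port used by create_unlink_perf (firewalld, run as root).
firewall-cmd --permanent --add-port=62001/tcp
firewall-cmd --reload
```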
Related Articles
- Is my gfs2 slowdown a file system problem or a storage problem?
- My gfs2 filesystem is slow. How can I diagnose and make it faster?
- What are some examples of gfs & gfs2 workloads that should be avoided?