OpenStack Director Node Performance Tuning for Large Deployments


Environment

This guide has been written with Red Hat OpenStack Platform 10-13 in mind. While specific tuning steps may work with other versions of Red Hat OpenStack Platform, make sure to test these in a development environment first and/or contact Red Hat Technical Support.

In deployments of RH OSP 15 and later, all services on the undercloud node are containerized. Most of this tuning can still be applied; however, the commands may need to be run inside the service's container (podman exec -ti $container $command), and it is strongly recommended to configure the settings via the undercloud install so that Puppet applies them properly.

Introduction

Red Hat OpenStack Platform Director's performance can decrease significantly when deploying overclouds with a relatively high number of nodes. This can hamper any operations such as deployment, scaling, or upgrading the overcloud.

Performance tuning is not a straightforward task. In specific setups, tuning may actually decrease performance, for example when specific bugs in firmware or kernel modules are involuntarily triggered by the tuning steps. This article highlights what can be done to help when hitting performance issues. However, it is important to test and benchmark the environment before and after applying tuning steps.

Do note that Red Hat OpenStack is an evolving platform and some configuration defaults may change over time. Some of the items in this guide are more exploratory, showing what can or should be experimented with.

The following list for undercloud tuning is by no means exhaustive.

Limits

As per this article, Red Hat has tested deployments of up to 300 nodes. Do note this number was for RH OSP 10; newer scaling data for RH OSP 16 has shown deployments of over 1000 nodes. The tweaks and tunables in this article can certainly help deploy more than 300 nodes; however, depending on your environment's configuration requirements, you should open a case with support.

The minimum settings for the undercloud as stated in https://access.redhat.com/documentation/en-us/red_hat_openstack_platform/10/html-single/director_installation_and_usage/#sect-Undercloud_Requirements are unfortunately too low for most production deployments.

For large scale deployments of more than 50 overcloud nodes, aim for at least the following hardware configuration:

  • 128 GB of RAM
  • 24 vCPU cores
  • SSD storage as a database backend
  • Use a 10G NIC for the provisioning interface
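As a quick sanity check, the current host's sizing can be compared against these numbers with a short shell snippet (a sketch; the RAM and CPU thresholds come from the list above):

```shell
# Sketch: report this host's RAM and CPU count next to the recommendations above.
mem_gb=$(( $(awk '/MemTotal/ {print $2}' /proc/meminfo) / 1024 / 1024 ))
cpus=$(nproc)
echo "RAM: ${mem_gb} GB (recommended: >= 128)"
echo "CPUs: ${cpus} (recommended: >= 24)"
```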

Director node hosted as a KVM Virtual Machine

  • Provide at least 24 logical cores to the VM and 128 GB of RAM.
  • Use isolcpus/CPU pinning to isolate this guest's CPUs, with consideration for NUMA nodes.
  • Use PCI passthrough of network interfaces (or something like SR-IOV) to increase networking performance by a fair margin.
  • Enable hugepages for the guest and disable THP.
  • Remove unused devices, like sound and USB. (Not a huge impact, but it is from our performance tuning documentation.)
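For the hugepages bullet, the guest definition can enable hugepage backing with a libvirt domain XML fragment along these lines (a sketch; hugepages must also be reserved on the hypervisor first):

```xml
<!-- libvirt domain XML sketch: back the Director VM's memory with hugepages -->
<memoryBacking>
  <hugepages/>
</memoryBacking>
```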

Monitoring

You may want to install sar (from the sysstat package) and configure the cron job to run at a 1-minute interval: https://access.redhat.com/solutions/276533

This will allow you to verify and compare your tuning settings.
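Per the linked solution, the sysstat collector is driven by /etc/cron.d/sysstat; a 1-minute interval looks roughly like this (a sketch, assuming the sysstat package is installed and the RHEL collector path /usr/lib64/sa/sa1):

```
# /etc/cron.d/sysstat -- collect system activity data every minute
* * * * * root /usr/lib64/sa/sa1 1 1
```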

Making settings persistent

  1. Modify undercloud.conf and follow https://access.redhat.com/solutions/3135361 to make settings persistent across undercloud upgrades.

  2. Create a hiera override file with the hiera settings that should be persistent:

    touch /home/stack/custom_hiera.yaml
    
  3. Configure the hieradata_override parameter in undercloud.conf.

    sudo yum install crudini -y
    crudini --set /home/stack/undercloud.conf DEFAULT hieradata_override /home/stack/custom_hiera.yaml
    
  4. Execute:

    openstack undercloud install
    

Note: openstack undercloud install will revert any manual changes that overlap with its configuration.

General advice

  • Streamline the templates by removing any unused parameters.
  • Remove any unused nodes and roles.
  • When using SSL on the undercloud, make sure enough entropy is available. VMs run out of entropy quite easily, and Director underclouds are often deployed as virtual machines. Check /proc/sys/kernel/random/entropy_avail and install haveged or rng-tools if needed. Be aware that haveged and similar tools provide only "good" entropy, not "real" or "perfect" entropy.
  • Verify whether different tuned profiles have an impact. If Director is a virtual machine, latency-performance and throughput-performance may yield better results than the default virtual-guest.
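For the entropy point, a quick check along these lines can flag a starved undercloud (a minimal sketch; the 1000 threshold is an arbitrary assumption, not a Red Hat recommendation):

```shell
# Sketch: warn when the available entropy pool looks low (threshold is an assumption).
entropy=$(cat /proc/sys/kernel/random/entropy_avail)
if [ "$entropy" -lt 1000 ]; then
  echo "Low entropy ($entropy): consider installing haveged or rng-tools"
else
  echo "Entropy looks OK: $entropy"
fi
```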

Database tuning and cleanup

One-time cleanup

  • Clean up the MySQL database as much as possible by purging tokens and deleted entries. To do so, verify table sizes with the following commands:

    sudo mysql -e 'SELECT table_schema "DB Name",
    ROUND(SUM(data_length + index_length) / 1024 / 1024, 1) "DB Size in MB"
    FROM information_schema.tables
    GROUP BY table_schema; '
    
    sudo mysql -e 'SELECT
    table_schema as `Database`,
    table_name AS `Table`,
    round(((data_length + index_length) / 1024 / 1024), 2) `Size in MB`
    FROM information_schema.TABLES
    ORDER BY (data_length + index_length) DESC LIMIT 0,20;'
    
  • If needed, drop and recreate the ceilometer database:

    mysql -e 'DROP DATABASE ceilometer;'
    mysql -e 'CREATE DATABASE ceilometer;'
    mysql -e "GRANT ALL PRIVILEGES ON ceilometer.* TO 'ceilometer'@'localhost';"
    ceilometer-dbsync
    
  • If needed, truncate the keystone.token table. This is not required when using keystone fernet configuration:

    mysql keystone -e 'TRUNCATE token'
    

Tweaking cronjobs for database cleanups

  • Keystone

    • Note: This is not required when using keystone fernet configuration.

    • Every hour a cron job will run which calls keystone-manage token_flush and logs to /var/log/keystone/keystone.log (grep "tokens removed" /var/log/keystone/keystone.log):

      [root@undercloud-7 ~]# cat /var/spool/cron/keystone | tail -1
      1 * * * * keystone-manage token_flush >>/dev/null 2>&1
      
    • Make sure that this cron job is executed, and possibly increase the frequency at which it runs

    • Normally, flushing the tokens every hour is safe and recommended, but this can depend on workload. One can monitor the current size of the token table with the following query:

      [stack@undercloud-0 ~]$ sudo mysql -D keystone -e "select count(*) from token;"
      +----------+
      | count(*) |
      +----------+
      |       14 |
      +----------+
      
    • Making persistent changes to the keystone cronjob. E.g., to run the keystone cronjob every 5 minutes:

      echo "keystone::cron::token_flush::minute: '*/5'" >> /home/stack/custom_hiera.yaml
      
    • For more details, refer to file /etc/puppet/modules/keystone/manifests/cron/token_flush.pp

  • Heat

    • Every day a cron job will run heat-manage purge_deleted:

      [root@undercloud-7 ~]# cat /var/spool/cron/heat | tail -1
      1 0 * * * heat-manage purge_deleted -g days 1 >>/dev/null 2>&1
      
    • For heat, purging the deleted rows once per day is usually sufficient. If the environment is a lab that is constantly deleting the overcloud stack and redeploying, one might consider doing this more often. One can validate with this MySQL query:

      [stack@undercloud-0 ~]$ for t in stack resource;do echo $t;sudo mysql -D heat -e "select status,count(*) from $t group by status;"; done 
      stack
      +----------+----------+
      | status   | count(*) |
      +----------+----------+
      | COMPLETE |      340 |
      +----------+----------+
      resource
      +----------+----------+
      | status   | count(*) |
      +----------+----------+
      | COMPLETE |      577 |
      +----------+----------+
      
    • Make sure that heat templates are purged efficiently

Database storage backend tweaks

Move MySQL and MongoDB to SSD volumes if possible. If MariaDB needs temp tables, ensure it writes them to a memory-based filesystem; otherwise, in specific cases, even if the database is moved to fast SSD drives, a temp table may still be created on HDD. Hence, move MariaDB's tmpdir to a fast disk or tmpfs directory and tweak the SELinux configuration if necessary.
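A sketch of what that can look like (the mount point, size, and galera.cnf location are assumptions; adjust the SELinux context with semanage fcontext/restorecon so mysqld can write there):

```
## /etc/fstab -- memory-backed mount for MariaDB temp tables
tmpfs /var/lib/mysqltmp tmpfs defaults,size=2G 0 0

## /etc/my.cnf.d/galera.cnf
[mysqld]
tmpdir = /var/lib/mysqltmp
```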

MariaDB settings tuning

Update the following parameters in the mysqld stanza in /etc/my.cnf.d/galera.cnf.

These settings can be made persistent across undercloud upgrades as shown in the MariaDB settings persistency section below.

  • Increase max_connections to 8192 (from 4096)

    [mysqld]
    max_connections = 8192
    
    • Note: Minimal memory impact, but watch how close MariaDB gets to its open-files limit; that limit may need to be increased too.
    • Note: If max_connections is set to -1, it will in fact be set to 100000. It might be a good idea not to go that high. The default is 4096, but it is pretty safe to set it to a value around 8192 or 10240, for example.
  • Increase innodb_buffer_pool_size to 2G (from 128M)

    [mysqld]
    innodb_buffer_pool_size = 2G
    
  • Increase innodb_buffer_pool_instances to 4 (from 1)

    [mysqld]
    innodb_buffer_pool_instances = 4
    
  • Increase tmp_table_size and max_heap_table_size to 128M (from 16M)

    [mysqld]
    tmp_table_size = 128M
    max_heap_table_size = 128M
    
  • Increase max_allowed_packet to 64M (from 16M)

    [mysqld]
    max_allowed_packet = 64M
    
  • Increase connect_timeout to 60 (from 10)

    [mysqld]
    connect_timeout = 60
    
  • Restart MariaDB after making these changes:

    systemctl restart mariadb
    
  • Verification:

    mysql -e 'show variables where variable_name rlike "(^max_[ac]|r_pool_[si]|^tmp_table|^connect)";'
    
    

MariaDB settings persistency

  • Make settings persist. Get the contents of map tripleo::profile::base::database::mysql::mysql_server_options from /etc/puppet/hieradata/puppet-stack-config.yaml:

    (undercloud) [stack@undercloud-1 ~]$ sudo grep -A3 tripleo::profile::base::database::mysql::mysql_server_options /etc/puppet/hieradata/puppet-stack-config.yaml
    tripleo::profile::base::database::mysql::mysql_server_options:
      'mysqld':
        bind-address: "%{hiera('controller_host')}"
        innodb_file_per_table: 'ON'
    
  • These contents need to be merged with the following custom settings:

    echo 'mysql_max_connections: 8192' >>/home/stack/custom_hiera.yaml
    cat <<'EOF'>>/home/stack/custom_hiera.yaml
    tripleo::profile::base::database::mysql::mysql_server_options:
      'mysqld':
        innodb_buffer_pool_instances: 4
        innodb_buffer_pool_size: '2G'
        tmp_table_size: '128M'
        bind-address: "%{hiera('controller_host')}"                   # from puppet-stack-config.yaml
        innodb_file_per_table: 'ON'                                   # from puppet-stack-config.yaml
        max_allowed_packet: '64M'
        connect_timeout: '60'
    EOF
    

OpenStack services tuning

Tweaking deployment timeouts

  • If you hit timeouts during the deployment and other tuning measures do not help, bump the rpc_response_timeout from 600 to 1200 seconds in the heat.conf:

    cp /etc/heat/heat.conf{,.back}
    crudini --set /etc/heat/heat.conf DEFAULT rpc_response_timeout 1200
    systemctl restart openstack-heat*
    
  • Make sure that the following is set for nova and ironic:

    cp /etc/nova/nova.conf{,.back}
    cp /etc/ironic/ironic.conf{,.back}
    crudini --set /etc/nova/nova.conf DEFAULT rpc_response_timeout 600
    crudini --set /etc/nova/nova.conf DEFAULT max_concurrent_builds 4
    crudini --set /etc/ironic/ironic.conf DEFAULT rpc_response_timeout 600
    crudini --set /etc/ironic/ironic.conf DEFAULT rpc_thread_pool_size 8 
    systemctl restart openstack-nova* openstack-ironic*
    
  • Making settings persist:

    echo 'nova::rpc_response_timeout: 600' >>/home/stack/custom_hiera.yaml
    echo 'nova::compute::ironic::max_concurrent_builds: 4' >> /home/stack/custom_hiera.yaml
    echo 'ironic::rpc_response_timeout: 600' >>/home/stack/custom_hiera.yaml
    echo 'heat::rpc_response_timeout: 1200' >>/home/stack/custom_hiera.yaml
    
  • In order to make ironic's rpc_thread_pool_size persistent (or in newer versions executor_thread_pool_size):

    • A bug tracked in Bugzilla is blocking this feature for OpenStack Platform 10

    • For OpenStack Platform 13 and above, we can have this in custom_hiera.yaml:

      ironic::config::ironic_config:
          DEFAULT/executor_thread_pool_size:
              value: 20
      
  • If you increase the overcloud deploy command timeout beyond 240 minutes, you also need to update the keystone token expiration timeout in keystone.conf on the undercloud to an equivalent value in seconds (the default is expiration = 14400, i.e. 240 minutes).
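For example, if the deploy timeout is raised to 360 minutes, the matching keystone setting would look like this (a sketch; 21600 seconds = 360 minutes):

```
## /etc/keystone/keystone.conf
[token]
expiration = 21600
```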

wsgi process/thread tuning

Tuning heat

  • Set the number of heat engine workers to half the number of SMT CPUs:

    crudini --set /etc/heat/heat.conf DEFAULT num_engine_workers <half the number of SMT CPUs>
    
  • Making settings persist:

    echo 'heat::engine::num_engine_workers: 24' >>/home/stack/custom_hiera.yaml
    
    • Note: In more recent versions of OSP, there is a complex set of settings depending on the actual use case and service, and no change should be needed. The most current formula can be found in /etc/puppet/modules/openstacklib/lib/facter/os_workers.rb. The formulas are updated to match the community's and/or Red Hat's experience with settings for best performance.
  • E.g., as of this writing in July 2019, the setting for OSP 10 is:

    cat /etc/puppet/modules/openstacklib/lib/facter/os_workers.rb
    #
    # We've found that using $::processorcount for workers/threads can lead to
    # unexpected memory or process counts for people deploying on baremetal or
    # if they have large number of cpus. This fact allows us to tweak the formula
    # used to determine number of workers in a single place but use it across all
    # modules.
    #
    # The value for os_workers is max between '(<# processors> / 4)' and '2' with
    # a cap of 8.
    #
    # This fact can be overloaded by an external fact from /etc/factor/facts.d if
    # a user would like to provide their own default value.
    #
    Facter.add(:os_workers) do
      has_weight 100
      setcode do
        processors = Facter.value('processorcount')
        [ [ (processors.to_i / 2), 2 ].max, 12 ].min
      end
    end
    
  • And the setting for OSP 13 is a bit more complex, with specific tunings for heat:

    cat /etc/puppet/modules/openstacklib/lib/facter/os_workers.rb
    #
    # We've found that using $::processorcount for workers/threads can lead to
    # unexpected memory or process counts for people deploying on baremetal or
    # if they have large number of cpus. This fact allows us to tweak the formula
    # used to determine number of workers in a single place but use it across all
    # modules.
    #
    # The value for os_workers is max between '(<# processors> / 4)' and '2' with
    # a cap of 8.
    #
    # This fact can be overloaded by an external fact from /etc/factor/facts.d if
    # a user would like to provide their own default value.
    #
    Facter.add(:os_workers_small) do
      has_weight 100
      setcode do
        processors = Facter.value('processorcount')
        [ [ (processors.to_i / 4), 2 ].max, 8 ].min
      end
    end
    
    #
    # The value above for os_workers performs 3x worse in many cases compared to
    # the prevuous default of $::processorcount.
    #
    # Based on performance data [1], the following calculation is within 1-2%.
    #
    # The value for os_workers is max between '(<# processors> / 2)' and '2' with
    # a cap of 12.
    #
    # [1] http://elk.browbeatproject.org:80/goto/a23307fd511e314b975dedca6f65425d
    #
    Facter.add(:os_workers) do
      has_weight 100
      setcode do
        processors = Facter.value('processorcount')
        [ [ (processors.to_i / 2), 2 ].max, 12 ].min
      end
    end
    
    #
    # For cases where services are not co-located together (ie monolithic).
    #
    Facter.add(:os_workers_large) do
      has_weight 100
      setcode do
        processors = Facter.value('processorcount')
        [ (processors.to_i / 2), 1 ].max
      end
    end
    
    #
    # Heat Engine service can be more stressed than other services, so
    # a minimum of 4 and maximum of 24 workers should be fine, still
    # calculating with the number of processors.
    #
    Facter.add(:os_workers_heat_engine) do
      has_weight 100
      setcode do
        processors = Facter.value('processorcount')
        [ [ (processors.to_i / 2), 4 ].max, 24 ].min
      end
    end
    
  • Update yaql settings in /etc/heat/heat.conf with:

    crudini --set /etc/heat/heat.conf yaql memory_quota 100000
    crudini --set /etc/heat/heat.conf yaql limit_iterators 10000
    
  • Making settings persist:

    echo 'heat::yaql_memory_quota: 100000' >>/home/stack/custom_hiera.yaml
    echo 'heat::yaql_limit_iterators: 10000' >>/home/stack/custom_hiera.yaml
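The os_workers.rb formulas quoted above can be reproduced in shell to predict what Puppet will compute for a given host (a sketch; 48 logical CPUs is an assumed example value):

```shell
# Sketch: apply the OSP 13 os_workers.rb formulas for a host with 48 logical CPUs.
procs=48
# clamp VALUE MIN MAX -- bound VALUE to the [MIN, MAX] range
clamp() { v=$1; [ "$v" -lt "$2" ] && v=$2; [ "$v" -gt "$3" ] && v=$3; echo "$v"; }
small=$(clamp $(( procs / 4 )) 2 8)      # os_workers_small: procs/4, floor 2, cap 8
workers=$(clamp $(( procs / 2 )) 2 12)   # os_workers: procs/2, floor 2, cap 12
heat=$(clamp $(( procs / 2 )) 4 24)      # os_workers_heat_engine: procs/2, floor 4, cap 24
echo "os_workers_small=$small os_workers=$workers os_workers_heat_engine=$heat"
```

With 48 CPUs this yields os_workers_small=8, os_workers=12, and os_workers_heat_engine=24, i.e. the small and heat-engine facts hit their caps.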
    

memcached tuning

WARNING: There is a known bug (tracked on bugs.launchpad.net) with heat and caching when creating or recreating multiple resources with the same name. This should not impact the overcloud deployment, but it is good to know it is there. It is more of a concern with overcloud stacks than with the undercloud: undercloud stacks should have unique resource names and are not constantly deleted, as we sometimes see on the overcloud. It can be a concern if you are in a lab environment with constant deploys and deletions.

  • Enable memcached in /etc/heat/heat.conf with:

    crudini --set /etc/heat/heat.conf cache backend dogpile.cache.memcached
    crudini --set /etc/heat/heat.conf cache enabled true
    crudini --set /etc/heat/heat.conf cache memcache_servers 127.0.0.1:11211
    
    • Note: you need the memcached process running and enabled, and memcache_servers in the [cache] section set to the correct IP

    • memcached should already be running:

      [root@undercloud-7 ~]# ps aux | grep [m]emc
      memcach+  5019  0.0  0.0 656224  5556 ?        Ssl  Feb08   2:20 /usr/bin/memcached -p 11211 -u memcached -m 15087 -c 8192 -l 127.0.0.1 -U 0 -t 8 >> /var/log/memcached.log 2>&1
      
  • Enabling memcached can improve the oslo_messaging cache and part of the heat performance (Bug 1394920 - Timeout error when heat stack-update is used for big cluster).

  • Making these settings persistent:

    • Note: There currently is no easy way to make these settings persistent. It may be possible to use the heat_config resource in puppet to achieve this.

    • This article describes how to enable memcached caching on all the Director services, most importantly the token caching. Enabling token caching can improve deployment time and help if you have a lot of small postconfig software deployments. Every time a node is signaling back a deployment, there's a token validation happening and Red Hat has seen issues on large clouds (200+ nodes) when token caching was not enabled.

Disable Telemetry

If you are not using CloudForms (CFME), there is no need to run Telemetry on your undercloud.

Disabling telemetry temporarily

  • Run these commands on the director:

    systemctl stop openstack-ceilometer*
    systemctl stop openstack-aodh*
    systemctl stop openstack-gnocchi*
    systemctl list-unit-files | egrep 'openstack-ceilometer|openstack-aodh|openstack-gnocchi' | grep enabled | awk '{print $1}' | xargs  systemctl disable
    

Disabling telemetry permanently

  • Set this in undercloud.conf:

    ## undercloud.conf
    enable_telemetry = false
    
  • This can be conveniently done with crudini:

    crudini --set /home/stack/undercloud.conf DEFAULT enable_telemetry false
    