OpenStack Director Node Performance Tuning for Large Deployments


Environment

This guide has been written with Red Hat OpenStack Platform 10-13 in mind. While specific tuning steps may work with other versions of Red Hat OpenStack Platform, make sure to test these in a development environment first and/or contact Red Hat Technical Support.

In deployments of RH OSP 15 and later, all services on the undercloud node are containerized. Most of this tuning can still be applied; however, the commands may need to be run inside the service's container (podman exec -ti $container $command), and it is strongly recommended to configure the settings via the undercloud install so that Puppet applies them properly.

Introduction

Red Hat OpenStack Platform Director's performance can decrease significantly when deploying overclouds with a relatively high number of nodes. This can hamper any operations such as deployment, scaling, or upgrading the overcloud.

Performance tuning is not a straightforward task. In specific setups, tuning may actually decrease performance, for example when specific bugs in firmware or kernel modules are involuntarily triggered by the tuning steps. This article highlights what can be done to help when hitting performance issues. However, it is important to test and benchmark the environment before and after applying tuning steps.

Do note that Red Hat OpenStack is an evolving platform and some configuration defaults may change over time. Some of the items in this guide are more exploratory, showing what can or should be experimented with.

The following list for undercloud tuning is by no means exhaustive.

Limits

As per this article, Red Hat has tested deployments of up to 300 nodes. Do note this number was for RH OSP 10; newer scaling data for RH OSP 16 has shown deployments of over 1000 nodes. The tweaks and tunables in this article can certainly help deploy more than 300 nodes; however, depending on your environment's configuration requirements, you should open a case with support.

The minimum settings for the undercloud as stated in https://access.redhat.com/documentation/en-us/red_hat_openstack_platform/10/html-single/director_installation_and_usage/#sect-Undercloud_Requirements are unfortunately too low for most production deployments.

For large scale deployments of more than 50 overcloud nodes, aim for at least the following hardware configuration:

  • 128 GB of RAM
  • 24 vCPU cores
  • SSD storage as a database backend
  • Use a 10G NIC for the provisioning interface
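As a quick sanity check, the current host's sizing can be compared against these numbers with a short shell snippet (a sketch; the RAM and CPU thresholds come from the list above):

```shell
# Sketch: report this host's RAM and CPU count next to the recommendations above.
mem_gb=$(( $(awk '/MemTotal/ {print $2}' /proc/meminfo) / 1024 / 1024 ))
cpus=$(nproc)
echo "RAM: ${mem_gb} GB (recommended: >= 128)"
echo "CPUs: ${cpus} (recommended: >= 24)"
```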

Director node hosted as a KVM Virtual Machine

  • Provide at least 24 logical cores to the VM and 128 GB of RAM.
  • Use isolcpus/CPU pinning to isolate this guest's CPUs, with consideration for NUMA nodes.
  • Use PCI passthrough of network interfaces (or something like SR-IOV) to increase networking performance by a fair margin.
  • Enable hugepages for the guest and disable THP.
  • Remove unused devices, like sound and USB. (Not a huge impact, but it is from our performance tuning documentation.)
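For the hugepages bullet, the guest definition can enable hugepage backing with a libvirt domain XML fragment along these lines (a sketch; hugepages must also be reserved on the hypervisor first):

```xml
<!-- libvirt domain XML sketch: back the Director VM's memory with hugepages -->
<memoryBacking>
  <hugepages/>
</memoryBacking>
```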

Monitoring

You may want to install sar (from the sysstat package) and configure the cron job to run at a 1-minute interval: https://access.redhat.com/solutions/276533

This will allow you to verify and compare your tuning settings.
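Per the linked solution, the sysstat collector is driven by /etc/cron.d/sysstat; a 1-minute interval looks roughly like this (a sketch, assuming the sysstat package is installed and the RHEL collector path /usr/lib64/sa/sa1):

```
# /etc/cron.d/sysstat -- collect system activity data every minute
* * * * * root /usr/lib64/sa/sa1 1 1
```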

Making settings persistent

  1. Modify undercloud.conf and follow https://access.redhat.com/solutions/3135361 to make settings persistent across undercloud upgrades.

  2. Create a hiera override file with the hiera settings that should be persistent:

    touch /home/stack/custom_hiera.yaml
    
  3. Configure the hieradata_override parameter in undercloud.conf.

    sudo yum install crudini -y
    crudini --set /home/stack/undercloud.conf DEFAULT hieradata_override /home/stack/custom_hiera.yaml
    
  4. Execute:

    openstack undercloud install
    

Note: openstack undercloud install will revert any manual changes that overlap with its configuration.

General advice

  • Streamline the templates by removing any unused parameters.
  • Remove any unused nodes and roles.
  • When using SSL on the undercloud, make sure enough entropy is available. VMs run out of entropy quite easily, and Director underclouds are often deployed as virtual machines. Check /proc/sys/kernel/random/entropy_avail and install haveged or rng-tools if needed. Be aware that haveged and similar tools provide only "good" entropy, not "real" or "perfect" entropy.
  • Verify whether different tuned profiles have an impact. If Director is a virtual machine, latency-performance and throughput-performance may yield better results than the default virtual-guest.
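For the entropy point, a quick check along these lines can flag a starved undercloud (a minimal sketch; the 1000 threshold is an arbitrary assumption, not a Red Hat recommendation):

```shell
# Sketch: warn when the available entropy pool looks low (threshold is an assumption).
entropy=$(cat /proc/sys/kernel/random/entropy_avail)
if [ "$entropy" -lt 1000 ]; then
  echo "Low entropy ($entropy): consider installing haveged or rng-tools"
else
  echo "Entropy looks OK: $entropy"
fi
```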

Database tuning and cleanup

One-time cleanup

  • Clean up the MySQL database as much as possible by purging tokens and deleted entries. To do so, verify table sizes with the following commands:

    sudo mysql -e 'SELECT table_schema "DB Name",
    ROUND(SUM(data_length + index_length) / 1024 / 1024, 1) "DB Size in MB"
    FROM information_schema.tables
    GROUP BY table_schema; '
    
    sudo mysql -e 'SELECT
    table_schema as `Database`,
    table_name AS `Table`,
    round(((data_length + index_length) / 1024 / 1024), 2) `Size in MB`
    FROM information_schema.TABLES
    ORDER BY (data_length + index_length) DESC LIMIT 0,20;'
    
  • If needed, drop and recreate the ceilometer database:

    mysql -e 'DROP DATABASE ceilometer;'
    mysql -e 'CREATE DATABASE ceilometer;'
    mysql -e "GRANT ALL PRIVILEGES ON ceilometer.* TO 'ceilometer'@'localhost';"
    ceilometer-dbsync
    
  • If needed, truncate the keystone.token table. This is not required when using keystone fernet configuration:

    mysql keystone -e 'TRUNCATE token'
    

Tweaking cronjobs for database cleanups

  • Keystone

    • Note: This is not required when using keystone fernet configuration.

    • Every hour a cron job will run which calls keystone-manage token_flush and logs to /var/log/keystone/keystone.log (grep "tokens removed" /var/log/keystone/keystone.log):

      [root@undercloud-7 ~]# cat /var/spool/cron/keystone | tail -1
      1 * * * * keystone-manage token_flush >>/dev/null 2>&1
      
    • Make sure that this cron job is executed, and possibly increase the frequency at which it runs

    • Normally, flushing the tokens every hour is safe and recommended, but this can depend on workload. One can monitor the current size of the token table with the following query:

      [stack@undercloud-0 ~]$ sudo mysql -D keystone -e "select count(*) from token;"
      +----------+
      | count(*) |
      +----------+
      |       14 |
      +----------+
      
    • Making persistent changes to the keystone cronjob. E.g., to run the keystone cronjob every 5 minutes:

      echo "keystone::cron::token_flush::minute: '*/5'" >> /home/stack/custom_hiera.yaml
      
    • For more details, refer to file /etc/puppet/modules/keystone/manifests/cron/token_flush.pp

  • Heat

    • Every day a cron job will run heat-manage purge_deleted:

      [root@undercloud-7 ~]# cat /var/spool/cron/heat | tail -1
      1 0 * * * heat-manage purge_deleted -g days 1 >>/dev/null 2>&1
      
    • For heat, purging the deleted rows once per day is usually sufficient. If the environment is a lab that is constantly deleting the overcloud stack and redeploying, one might consider doing this more often. One can validate with this MySQL query:

      [stack@undercloud-0 ~]$ for t in stack resource;do echo $t;sudo mysql -D heat -e "select status,count(*) from $t group by status;"; done 
      stack
      +----------+----------+
      | status   | count(*) |
      +----------+----------+
      | COMPLETE |      340 |
      +----------+----------+
      resource
      +----------+----------+
      | status   | count(*) |
      +----------+----------+
      | COMPLETE |      577 |
      +----------+----------+
      
    • Make sure that heat templates are purged efficiently

Database storage backend tweaks

Move MySQL and MongoDB to SSD volumes if possible. If MariaDB needs temp tables, ensure it writes them to a memory-based filesystem; otherwise, in specific cases, even if the database is moved to fast SSD drives, a temp table may still be created on HDD. Hence, move MariaDB's tmpdir to a fast disk or tmpfs directory and tweak the SELinux configuration if necessary.
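A sketch of what that can look like (the mount point, size, and galera.cnf location are assumptions; adjust the SELinux context with semanage fcontext/restorecon so mysqld can write there):

```
## /etc/fstab -- memory-backed mount for MariaDB temp tables
tmpfs /var/lib/mysqltmp tmpfs defaults,size=2G 0 0

## /etc/my.cnf.d/galera.cnf
[mysqld]
tmpdir = /var/lib/mysqltmp
```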

MariaDB settings tuning

Update the following parameters in the mysqld stanza in /etc/my.cnf.d/galera.cnf.

These settings can be made persistent across undercloud upgrades as shown in the MariaDB settings persistency section below.

  • Increase max_connections to 8192 (from 4096)

    [mysqld]
    max_connections = 8192
    
    • Note: Minimal memory impact, but watch how close MariaDB gets to its open-files limit; that limit may need to be increased too.
    • Note: If max_connections is set to -1, it will in fact be set to 100000. It might be a good idea not to go that high. The default is 4096, but it is pretty safe to set it to a value around 8192 or 10240, for example.
  • Increase innodb_buffer_pool_size to 2G (from 128M)

    [mysqld]
    innodb_buffer_pool_size = 2G
    
  • Increase innodb_buffer_pool_instances to 4 (from 1)

    [mysqld]
    innodb_buffer_pool_instances = 4
    
  • Increase tmp_table_size and max_heap_table_size to 128M (from 16M)

    [mysqld]
    tmp_table_size = 128M
    max_heap_table_size = 128M
    
  • Increase max_allowed_packet to 64M (from 16M)

    [mysqld]
    max_allowed_packet = 64M
    
  • Increase connect_timeout to 60 (from 10)

    [mysqld]
    connect_timeout = 60
    
  • Restart MariaDB after making these changes:

    systemctl restart mariadb
    
  • Verification:

    mysql -e 'show variables where variable_name rlike "(^max_[ac]|r_pool_[si]|^tmp_table|^connect)";'
    
    

MariaDB settings persistency

  • Make settings persist. Get the contents of map tripleo::profile::base::database::mysql::mysql_server_options from /etc/puppet/hieradata/puppet-stack-config.yaml:

    (undercloud) [stack@undercloud-1 ~]$ sudo grep -A3 tripleo::profile::base::database::mysql::mysql_server_options /etc/puppet/hieradata/puppet-stack-config.yaml
    tripleo::profile::base::database::mysql::mysql_server_options:
      'mysqld':
        bind-address: "%{hiera('controller_host')}"
        innodb_file_per_table: 'ON'
    
  • These contents need to be merged with the following custom settings:

    echo 'mysql_max_connections: 8192' >>/home/stack/custom_hiera.yaml
    cat <<'EOF'>>/home/stack/custom_hiera.yaml
    tripleo::profile::base::database::mysql::mysql_server_options:
      'mysqld':
        innodb_buffer_pool_instances: 4
        innodb_buffer_pool_size: '2G'
        tmp_table_size: '128M'
        bind-address: "%{hiera('controller_host')}"                   # from puppet-stack-config.yaml
        innodb_file_per_table: 'ON'                                   # from puppet-stack-config.yaml
        max_allowed_packet: '64M'
        connect_timeout: '60'
    EOF
    

OpenStack services tuning

Tweaking deployment timeouts

  • If you hit timeouts during the deployment and other tuning measures do not help, bump the rpc_response_timeout from 600 to 1200 seconds in the heat.conf:

    cp /etc/heat/heat.conf{,.back}
    crudini --set /etc/heat/heat.conf DEFAULT rpc_response_timeout 1200
    systemctl restart openstack-heat*
    
  • Make sure that the following is set for nova and ironic:

    cp /etc/nova/nova.conf{,.back}
    cp /etc/ironic/ironic.conf{,.back}
    crudini --set /etc/nova/nova.conf DEFAULT rpc_response_timeout 600
    crudini --set /etc/nova/nova.conf DEFAULT max_concurrent_builds 4
    crudini --set /etc/ironic/ironic.conf DEFAULT rpc_response_timeout 600
    crudini --set /etc/ironic/ironic.conf DEFAULT rpc_thread_pool_size 8 
    systemctl restart openstack-nova* openstack-ironic*
    
  • Making settings persist:

    echo 'nova::rpc_response_timeout: 600' >>/home/stack/custom_hiera.yaml
    echo 'nova::compute::ironic::max_concurrent_builds: 4' >> /home/stack/custom_hiera.yaml
    echo 'ironic::rpc_response_timeout: 600' >>/home/stack/custom_hiera.yaml
    echo 'heat::rpc_response_timeout: 1200' >>/home/stack/custom_hiera.yaml
    
  • In order to make ironic's rpc_thread_pool_size persistent (or in newer versions executor_thread_pool_size):

    • A bug tracked in Bugzilla is blocking this feature for OpenStack Platform 10

    • For OpenStack Platform 13 and above, we can have this in custom_hiera.yaml:

      ironic::config::ironic_config:
          DEFAULT/executor_thread_pool_size:
              value: 20
      
  • If you increase the overcloud deploy command timeout beyond 240 minutes, you also need to update the keystone token expiration timeout in keystone.conf on the undercloud to an equivalent value in seconds (the default is expiration = 14400, i.e. 240 minutes).
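For example, if the deploy timeout is raised to 360 minutes, the matching keystone setting would look like this (a sketch; 21600 seconds = 360 minutes):

```
## /etc/keystone/keystone.conf
[token]
expiration = 21600
```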

wsgi process/thread tuning

Tuning heat

  • Set the number of heat engine workers to half the number of SMT CPUs:

    crudini --set /etc/heat/heat.conf DEFAULT num_engine_workers <half the number of SMT CPUs>
    
  • Making settings persist:

    echo 'heat::engine::num_engine_workers: 24' >>/home/stack/custom_hiera.yaml
    
    • Note: In more recent versions of OSP, there is a complex set of settings depending on the actual use case and service, and no change should be needed. The most current formula can be found in /etc/puppet/modules/openstacklib/lib/facter/os_workers.rb. The formulas are updated to match the community's and/or Red Hat's experience with settings for best performance.
  • E.g., as of this writing in July 2019, the setting for OSP 10 is:

    cat /etc/puppet/modules/openstacklib/lib/facter/os_workers.rb
    #
    # We've found that using $::processorcount for workers/threads can lead to
    # unexpected memory or process counts for people deploying on baremetal or
    # if they have large number of cpus. This fact allows us to tweak the formula
    # used to determine number of workers in a single place but use it across all
    # modules.
    #
    # The value for os_workers is max between '(<# processors> / 4)' and '2' with
    # a cap of 8.
    #
    # This fact can be overloaded by an external fact from /etc/factor/facts.d if
    # a user would like to provide their own default value.
    #
    Facter.add(:os_workers) do
      has_weight 100
      setcode do
        processors = Facter.value('processorcount')
        [ [ (processors.to_i / 2), 2 ].max, 12 ].min
      end
    end
    
  • And the setting for OSP 13 is a bit more complex, with specific tunings for heat:

    cat /etc/puppet/modules/openstacklib/lib/facter/os_workers.rb
    #
    # We've found that using $::processorcount for workers/threads can lead to
    # unexpected memory or process counts for people deploying on baremetal or
    # if they have large number of cpus. This fact allows us to tweak the formula
    # used to determine number of workers in a single place but use it across all
    # modules.
    #
    # The value for os_workers is max between '(<# processors> / 4)' and '2' with
    # a cap of 8.
    #
    # This fact can be overloaded by an external fact from /etc/factor/facts.d if
    # a user would like to provide their own default value.
    #
    Facter.add(:os_workers_small) do
      has_weight 100
      setcode do
        processors = Facter.value('processorcount')
        [ [ (processors.to_i / 4), 2 ].max, 8 ].min
      end
    end
    
    #
    # The value above for os_workers performs 3x worse in many cases compared to
    # the prevuous default of $::processorcount.
    #
    # Based on performance data [1], the following calculation is within 1-2%.
    #
    # The value for os_workers is max between '(<# processors> / 2)' and '2' with
    # a cap of 12.
    #
    # [1] http://elk.browbeatproject.org:80/goto/a23307fd511e314b975dedca6f65425d
    #
    Facter.add(:os_workers) do
      has_weight 100
      setcode do
        processors = Facter.value('processorcount')
        [ [ (processors.to_i / 2), 2 ].max, 12 ].min
      end
    end
    
    #
    # For cases where services are not co-located together (ie monolithic).
    #
    Facter.add(:os_workers_large) do
      has_weight 100
      setcode do
        processors = Facter.value('processorcount')
        [ (processors.to_i / 2), 1 ].max
      end
    end
    
    #
    # Heat Engine service can be more stressed than other services, so
    # a minimum of 4 and maximum of 24 workers should be fine, still
    # calculating with the number of processors.
    #
    Facter.add(:os_workers_heat_engine) do
      has_weight 100
      setcode do
        processors = Facter.value('processorcount')
        [ [ (processors.to_i / 2), 4 ].max, 24 ].min
      end
    end
    
  • Update yaql settings in /etc/heat/heat.conf with:

    crudini --set /etc/heat/heat.conf yaql memory_quota 100000
    crudini --set /etc/heat/heat.conf yaql limit_iterators 10000
    
  • Making settings persist:

    echo 'heat::yaql_memory_quota: 100000' >>/home/stack/custom_hiera.yaml
    echo 'heat::yaql_limit_iterators: 10000' >>/home/stack/custom_hiera.yaml
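The os_workers.rb formulas quoted above can be reproduced in shell to predict what Puppet will compute for a given host (a sketch; 48 logical CPUs is an assumed example value):

```shell
# Sketch: apply the OSP 13 os_workers.rb formulas for a host with 48 logical CPUs.
procs=48
# clamp VALUE MIN MAX -- bound VALUE to the [MIN, MAX] range
clamp() { v=$1; [ "$v" -lt "$2" ] && v=$2; [ "$v" -gt "$3" ] && v=$3; echo "$v"; }
small=$(clamp $(( procs / 4 )) 2 8)      # os_workers_small: procs/4, floor 2, cap 8
workers=$(clamp $(( procs / 2 )) 2 12)   # os_workers: procs/2, floor 2, cap 12
heat=$(clamp $(( procs / 2 )) 4 24)      # os_workers_heat_engine: procs/2, floor 4, cap 24
echo "os_workers_small=$small os_workers=$workers os_workers_heat_engine=$heat"
```

With 48 CPUs this yields os_workers_small=8, os_workers=12, and os_workers_heat_engine=24, i.e. the small and heat-engine facts hit their caps.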
    

memcached tuning

WARNING: There is a known bug (tracked on bugs.launchpad.net) with heat and caching when creating or recreating multiple resources with the same name. This should not impact the overcloud deployment, but it is good to know it is there. It is more of a concern with overcloud stacks than with the undercloud: undercloud stacks should have unique resource names and are not constantly deleted, as we sometimes see on the overcloud. It can be a concern if you are in a lab environment with constant deploys and deletions.

  • Enable memcached in /etc/heat/heat.conf with:

    crudini --set /etc/heat/heat.conf cache backend dogpile.cache.memcached
    crudini --set /etc/heat/heat.conf cache enabled true
    crudini --set /etc/heat/heat.conf cache memcache_servers 127.0.0.1:11211
    
    • Note: you need the memcached process running and enabled, and memcache_servers in the [cache] section set to the correct IP

    • memcached should already be running:

      [root@undercloud-7 ~]# ps aux | grep [m]emc
      memcach+  5019  0.0  0.0 656224  5556 ?        Ssl  Feb08   2:20 /usr/bin/memcached -p 11211 -u memcached -m 15087 -c 8192 -l 127.0.0.1 -U 0 -t 8 >> /var/log/memcached.log 2>&1
      
  • Enabling memcached can improve the oslo_messaging cache and part of the heat performance (Bug 1394920 - Timeout error when heat stack-update is used for big cluster).

  • Making these settings persistent:

    • Note: There currently is no easy way to make these settings persistent. It may be possible to use the heat_config resource in puppet to achieve this.

    • This article describes how to enable memcached caching on all the Director services, most importantly the token caching. Enabling token caching can improve deployment time and help if you have a lot of small postconfig software deployments. Every time a node is signaling back a deployment, there's a token validation happening and Red Hat has seen issues on large clouds (200+ nodes) when token caching was not enabled.

Disable Telemetry

If you are not using CloudForms (CFME), there is no need to run Telemetry on your undercloud.

Disabling telemetry temporarily

  • Run these commands on the director:

    systemctl stop openstack-ceilometer*
    systemctl stop openstack-aodh*
    systemctl stop openstack-gnocchi*
    systemctl list-unit-files | egrep 'openstack-ceilometer|openstack-aodh|openstack-gnocchi' | grep enabled | awk '{print $1}' | xargs  systemctl disable
    

Disabling telemetry permanently

  • Set this in undercloud.conf:

    ## undercloud.conf
    enable_telemetry = false
    
  • This can be conveniently done with crudini:

    crudini --set /home/stack/undercloud.conf DEFAULT enable_telemetry false
    