Satellite 6 Remote Agent (katello-agent) Scaling / Performance Tuning

Solution Verified - Updated

Environment

  • Red Hat Satellite 6
    • All options apply to Satellite; those related to qdrouterd also apply to Capsule
  • Red Hat Enterprise Linux (RHEL) 6 or 7

Issue

If more than 225 remote systems connect with katello-agent to the Satellite or Capsule servers, the user may see errors when trying to:

  • initiate a package install from the Satellite 6 User Interface
  • initiate a package install from the Satellite 6 hammer CLI
  • remove or update existing packages from the Satellite 6 User Interface or hammer CLI

The errors may look similar to:

2015-02-27T17:49:45.446256-04:00 example goferd: [ERROR][MainThread] gofer.agent.main:118 - Traceback (most recent
call last): [..trimmed..] "/usr/lib/python2.6/site-packages/qpid/messaging/endpoints.py", line 212, in check_error 
raise e ConnectionError: connection-forced: User connection denied by configured limit(320) 
2015-02-27T17:49:45.446359-04:00 example goferd: [INFO][MainThread] gofer.agent.main:136 - agent started. 
2015-02-27T17:49:55.435878-04:00 example goferd: [WARNING][Thread-2] qpid.messaging:453 - recoverable error[attempt 1]: connection aborted
2015-02-27T17:49:55.436379-04:00 example goferd: [WARNING][Thread-2] qpid.messaging:455 - sleeping 1 seconds

By default, Satellite 6 internal and external Capsules are configured to limit the number of connected agents to 225. To scale beyond this predefined limit, you need to adjust settings on your Satellite and Capsule servers.

Resolution

For deployments with more than 300 content hosts registered to Satellite or Capsule:

Applicable to Red Hat Satellite 6.1 and higher

  • Calculate the new maximum number of open files for the qdrouterd user as (N x 3) + 100, where N is the number of Content Hosts. Each Content Host might consume up to 3 file descriptors (FDs) in a rare case, and 100 FDs are required to run the router itself.

  • Increase ulimit on open files / file descriptors for qdrouterd user/service:

    • On RHEL 6, add the following line to /etc/security/limits.conf, before the # End of file line:
qdrouterd	-	nofile	new_maximum_number_of_files

and restart the service:

service qdrouterd restart

    • On RHEL 7, create the /etc/systemd/system/qdrouterd.service.d/limits.conf file and insert the following text:
[Service]
LimitNOFILE=new_maximum_number_of_files

and restart the service:

systemctl daemon-reload
systemctl restart qdrouterd
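
The formula above can be worked through in a few lines of shell. This is a minimal sketch and not part of the original solution: the function name qdrouterd_nofile is hypothetical, and the host count is an example input.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with Satellite): apply the
# (N x 3) + 100 formula from this solution to a Content Host count.
qdrouterd_nofile() {
    hosts=$1
    echo $(( hosts * 3 + 100 ))
}

# Example: 1000 Content Hosts registered to one Satellite or Capsule.
qdrouterd_nofile 1000    # prints 3100
```

For 300 Content Hosts the formula gives 1000, just under the default limit of 1024 mentioned under Root Cause, which is consistent with the 300-host threshold. After the restart, the effective limit of the running router can be read from the "Max open files" row of /proc/<PID>/limits, where <PID> is the process ID of qdrouterd.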

For deployments with more than 500 content hosts:

  • On the Satellite only, increase the ulimit on open files / file descriptors for the qpidd user/service:
    • On RHEL 6, add the following line to /etc/security/limits.conf:
qpidd	-	nofile	10000
    • On RHEL 7, create the /etc/systemd/system/qpidd.service.d/limits.conf file and insert the following text:
[Service]
LimitNOFILE=10000
  • 10000 is an example value. It must be higher than (N x 4) + 500, where N is the number of content hosts. Each Content Host consumes up to 4 file descriptors (FDs), and 500 FDs are required to run the broker itself.

  • To apply the change, restart the qpidd service.

    • On RHEL 6:

service qpidd restart

    • On RHEL 7:

systemctl daemon-reload
systemctl restart qpidd
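
To check whether a chosen qpidd limit satisfies the "higher than (N x 4) + 500" rule, a small shell test can help. The sketch below is illustrative only: the function names qpidd_nofile_floor and qpidd_limit_ok are hypothetical, and the host counts are example inputs.

```shell
#!/bin/sh
# Hypothetical helpers (not part of Satellite).

# Value that the qpidd nofile limit must exceed: (N x 4) + 500.
qpidd_nofile_floor() {
    echo $(( $1 * 4 + 500 ))
}

# Succeeds (exit status 0) only if LIMIT is strictly higher than the floor.
qpidd_limit_ok() {
    limit=$1
    hosts=$2
    [ "$limit" -gt "$(qpidd_nofile_floor "$hosts")" ]
}

# 2000 hosts need more than 8500 FDs, so 10000 is sufficient.
if qpidd_limit_ok 10000 2000; then
    echo "10000 is enough for 2000 content hosts"
fi

# 2375 hosts need more than 10000 FDs, so 10000 is not sufficient.
if ! qpidd_limit_ok 10000 2375; then
    echo "10000 is too low for 2375 content hosts"
fi
```

Under this rule the example value 10000 covers at most 2374 content hosts; larger deployments need a proportionally higher limit.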

For deployments with more than 1900 content hosts:

  • In addition to the above, follow the resolution of this solution (it addresses the fs.aio-max-nr kernel limit described under Root Cause)

For deployments with more than 30000 content hosts:

  • In addition to the above, follow the resolution of this solution (it addresses the Berkeley DB lock threshold described under Root Cause)

For deployments of Satellite on RHEL7 with more than 32900 content hosts:

  • In addition to the above, follow the resolution of this solution (it addresses the vm.max_map_count kernel limit described under Root Cause)

For deployments of Satellite with more than 65535 content hosts:

This is technically possible and supported since Satellite 6.3, where this bug has been fixed.

Learn More


See the [Red Hat Satellite 6.1 Installation Guide](https://access.redhat.com/documentation/en-US/Red_Hat_Satellite/6.1/html/Installation_Guide/sect-Red_Hat_Satellite-Installation_Guide-Prerequisites.html#sect-Red_Hat_Satellite-Installation_Guide-Prerequisites-Large_deployments) for more on considerations for large deployments.

Root Cause

There are several limits imposed by the qpidd broker that need to be increased to scale Satellite 6 with respect to the number of content hosts. All known limits are described in this bugzilla and summarized here:

  • for >225 content hosts in Satellite 6.0, the qpidd broker hits the default value of max-connections (the number of client connections to it)
  • for >300 content hosts registered to one Satellite or Capsule, qdrouterd can hit the ulimit on file descriptors (default value 1024)
  • for approximately >500 content hosts, the qpidd broker would run out of file descriptors (1024 by default)
  • for >1900 content hosts, the qpidd broker will hit the kernel limit fs.aio-max-nr (maximum parallel asynchronous IO operations)
  • for approximately >30000 content hosts, Berkeley DB as the backend of the qpidd broker might hit the maximum number of concurrent locks threshold
  • for approximately >32900 content hosts, the qpidd broker on RHEL 7 will hit another kernel limit, vm.max_map_count (the maximum number of memory map areas a process may have)
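
The thresholds above can be turned into a quick checklist for a given deployment size. The following is a rough sketch: the function name limits_for_hosts is hypothetical, and the cut-offs are copied from the list above, several of which are only approximate.

```shell
#!/bin/sh
# Hypothetical checklist helper (not part of Satellite): print every limit
# from the Root Cause list that a deployment of the given size may hit.
limits_for_hosts() {
    hosts=$1
    [ "$hosts" -gt 225 ]   && echo "qpidd max-connections (Satellite 6.0)"
    [ "$hosts" -gt 300 ]   && echo "qdrouterd nofile"
    [ "$hosts" -gt 500 ]   && echo "qpidd nofile"
    [ "$hosts" -gt 1900 ]  && echo "fs.aio-max-nr"
    [ "$hosts" -gt 30000 ] && echo "Berkeley DB locks"
    [ "$hosts" -gt 32900 ] && echo "vm.max_map_count (RHEL 7)"
    return 0
}

# Example: a deployment with 2000 content hosts.
limits_for_hosts 2000
```

For 2000 content hosts this lists the first four limits. The current values of the two kernel limits can be inspected with sysctl fs.aio-max-nr vm.max_map_count.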
