About High Availability with OpenStack Platform
"Creating a Highly Available Red Hat OpenStack Platform Configuration" provides an excellent overview of high availability architecture and setup procedures for OpenStack. This article goes into more detail about the specific requirements and recommendations for individual components of an OpenStack deployment when planning for high availability.
Technology Considerations
OpenStack was originally designed to scale elastic applications, and high availability of the underlying services was not one of its major design tenets; however, customers have demanded that the control plane be made highly available. OpenStack can meet these requirements for its own infrastructure services, meaning that an uptime of 99.99% is feasible for the OpenStack infrastructure, but this does not guarantee availability for individual guest instances.
Currently our official recommendation for achieving high availability with OpenStack is to use Pacemaker to control all services, with some configured as active-active clones and others as active-passive. Our official tool, the RHEL OSP Installer, deploys this configuration. This solution provides a completely controlled clustering environment that is tightly integrated with all OpenStack services, along with automated recovery up to the point of fencing (rebooting) a node after a service has been restarted a number of times.
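As an illustration only (resource names, systemd unit names, and thresholds vary by OSP version and are assumptions here), the active-active/active-passive split described above might be expressed with pcs commands along these lines:

```shell
# Clone a stateless API service across all controllers (active-active).
# "openstack-keystone" is an example resource/unit name.
pcs resource create openstack-keystone systemd:openstack-keystone --clone interleave=true

# Run a service on only one node at a time (active-passive).
pcs resource create neutron-l3-agent systemd:neutron-l3-agent

# Escalate to recovery elsewhere (and ultimately fencing) after
# repeated failures on one node.
pcs resource meta openstack-keystone migration-threshold=5
```

These commands are a sketch of the pattern, not the exact resource set the RHEL OSP Installer deploys.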
Resources Requiring Availability
Load Balancing
A very important aspect of a highly available cluster is the ability to direct requests to a pool of backend resources. By default we recommend HAProxy, with Pacemaker managing the virtual IPs for each resource, but there are other supported alternatives such as keepalived. Customers may also use hardware load balancers such as F5.
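A minimal sketch of this pattern for one API endpoint follows; the addresses, port, and backend names are hypothetical examples, and the virtual IP on the bind line is assumed to be managed by Pacemaker as described above:

```
# haproxy.cfg fragment (illustrative values only)
listen keystone-api
    bind 192.0.2.10:5000          # virtual IP moved between nodes by Pacemaker
    balance roundrobin
    option httpchk GET /
    server controller1 192.0.2.11:5000 check inter 2000 rise 2 fall 5
    server controller2 192.0.2.12:5000 check inter 2000 rise 2 fall 5
    server controller3 192.0.2.13:5000 check inter 2000 rise 2 fall 5
```

The health checks are what let the load balancer stop sending traffic to a failed backend without any cluster-level intervention.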
Stateless API Services
Most API services can scale out horizontally and be made active/active across multiple servers. The Ceilometer central agent must run active/passive.
Neutron
The Neutron L3 agent has no native way to make routers highly available and must be run active/passive. Juno introduces VRRP-based HA routers and distributed virtual routing (DVR) to address this.
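For reference, the VRRP-based router HA mentioned above is enabled through Neutron configuration options along these lines; this is a sketch, and the option names and defaults should be verified against your release's documentation:

```
# neutron.conf fragment (Juno or later; values are examples)
[DEFAULT]
l3_ha = True
max_l3_agents_per_router = 3
min_l3_agents_per_router = 2
```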
AMQP Messaging
RabbitMQ should run active-active with mirrored queues.
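Queue mirroring in RabbitMQ is enabled through a policy; a minimal sketch is shown below, where the policy name "ha-all" and the match-everything pattern are example choices, not required values:

```shell
# Mirror every queue across all nodes in the cluster and keep
# mirrors synchronised automatically.
rabbitmqctl set_policy ha-all ".*" '{"ha-mode":"all","ha-sync-mode":"automatic"}'
```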
Databases
MariaDB Galera server is configured with synchronous replication, but only one node ever receives writes at a time while the others act as hot standbys. This is similar to MongoDB, which is used for Ceilometer.
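The single-writer arrangement described above is typically enforced at the load balancer rather than in Galera itself; a hypothetical HAProxy fragment (addresses and names are examples) might look like:

```
# haproxy.cfg fragment: only controller1 receives traffic; the
# "backup" servers are used only if it fails its health checks.
listen galera
    bind 192.0.2.10:3306
    balance source
    option tcpka
    server controller1 192.0.2.11:3306 check
    server controller2 192.0.2.12:3306 check backup
    server controller3 192.0.2.13:3306 check backup
```

Because replication is synchronous, a backup node promoted this way already holds the full data set.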
Clustering Technologies
Pacemaker provides a full suite of features to monitor and recover resources. However, for various reasons some customers choose other options, such as keepalived and systemd combined with their own monitoring, to achieve availability.
Pacemaker and System Tools
Unfortunately, this solution comes with a number of caveats. First, Pacemaker is an entirely new technology to introduce into an environment and requires education to operate. Pacemaker also does not integrate with standard Linux system tools such as systemd or, in the case of OpenStack, the openstack-service commands. This means all services must be manipulated through the pcs command only. If systemctl is used to start or stop a service, the resource may fail and Pacemaker will attempt to recover it, up to the point of fencing the node. Consequently, any infrastructure tools an operator uses require integration or development work to function alongside Pacemaker, or may not be usable at all.
Fencing
Consider the purpose of fencing. In traditional clustering environments, fencing is used to ensure the data integrity of the cluster: any service failure is dealt with by a system reboot, ensuring, for example, that no two systems write to shared storage at once. This makes sense in active-passive environments to prevent multiple writers to a database. However, in OpenStack there are a number of layers to consider. For the most part, the API services are designed to scale out horizontally. Suppose an API service fails repeatedly, to the point that Pacemaker would normally fence the node. Would that API service failure cause data corruption? One might argue that it is more disruptive to reboot a server hosting other infrastructure services, such as a database, over a simple API service failure when other servers provide a cloned active-active copy of it; the load balancer would detect the failure and stop sending packets to that backend. What about a load balancer failure? The messaging service? It is even arguable that mariadb-galera-server, which is configured with only a single writer, can handle its own failures rather than requiring Pacemaker to perform automatic recovery. The point is to determine whether automated recovery is right for your environment, or whether monitoring, notification, and manual investigation might be a better route.
Considerations
Clustering Technology Pros/Cons
| | Pacemaker | Keepalived/Systemd |
|---|---|---|
| Pros | Automated recovery, resource groups, resource constraints, single point of control for resources, fencing for resource failures | Standard Linux system tools, less complexity to layer on top of OpenStack |
| Cons | Additional complexity, new technology to layer on top of OpenStack, education required, no integration with Linux system tools, no integration with openstack-service | No automated recovery or fencing; operators must supply their own monitoring |
Controller Sizing
The number of controllers in a highly available OpenStack deployment is typically three. This is driven by the mariadb and mongodb services, which require at least three nodes. It is possible to expand to more controllers as needed, provided the total number of controllers remains odd.
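The odd-number recommendation follows from majority-quorum arithmetic, which can be sketched as follows (a small illustration, not part of any OpenStack tooling):

```python
# Sketch of the quorum arithmetic behind the odd-controller-count
# recommendation: a cluster keeps quorum only while a strict
# majority of its nodes are alive.

def quorum(n: int) -> int:
    """Minimum number of live nodes forming a majority of n."""
    return n // 2 + 1

def tolerated_failures(n: int) -> int:
    """How many nodes can fail while the cluster keeps quorum."""
    return n - quorum(n)

# Three controllers tolerate one failure; adding a fourth does not
# raise the failure tolerance, which is why odd counts are preferred.
for n in (3, 4, 5):
    print(n, "controllers tolerate", tolerated_failures(n), "failure(s)")
```

An even-sized cluster spends an extra node without gaining any fault tolerance, so growth is recommended in steps of two.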