Red Hat Ceph Storage monitor failure scenarios

Solution Unverified - Updated

Environment

  • Red Hat Ceph Storage 1.2.3

  • Red Hat Ceph Storage 1.3

Issue

  • Detail some of the scenarios in which a Ceph monitor may fail.

Resolution

  • What should the strategy be if one MON fails? Can the cluster operate with only two MONs for an extended period (e.g. a few days)?

The RHCS cluster will continue to operate, but running in this degraded state for long is not recommended. Bring the failed monitor node back up as soon as possible: in a 3 MON configuration, the cluster cannot tolerate a second monitor failure.
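
To see which monitor has dropped out, the output of `ceph quorum_status --format json` can be compared against the monitor map. The following is a minimal sketch, assuming the output contains a `quorum_names` list and a `monmap.mons` list whose entries have a `name` field. The sample JSON and the monitor names (`mon-a`, `mon-b`, `mon-c`) are hypothetical, and the field layout should be checked against the output of your release.

```python
import json

# Hypothetical, trimmed-down sample of `ceph quorum_status --format json`
# output. Real output contains more fields; the names here are invented.
SAMPLE = """
{
  "quorum_names": ["mon-a", "mon-b"],
  "monmap": {
    "mons": [
      {"name": "mon-a"},
      {"name": "mon-b"},
      {"name": "mon-c"}
    ]
  }
}
"""


def monitors_out_of_quorum(status_json):
    """Return the monitors listed in the monmap but absent from the quorum."""
    status = json.loads(status_json)
    in_quorum = set(status["quorum_names"])
    all_mons = [mon["name"] for mon in status["monmap"]["mons"]]
    return [name for name in all_mons if name not in in_quorum]


print(monitors_out_of_quorum(SAMPLE))
```

With the sample above, the function reports `mon-c` as the monitor that needs attention.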

  • What happens if two MONs fail simultaneously? Will the cluster keep operating, or will it go read-only?

The cluster will not operate; it is inaccessible until a monitor quorum is available again. A quorum requires a majority of the monitors to be up: 2 of 3 in a 3 MON setup, or 3 of 5 in a 5 MON setup.
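
The quorum arithmetic can be written out as a short sketch. It assumes the usual strict-majority rule (more than half of the configured monitors must be up); the function names are illustrative only and are not part of any Ceph tool.

```python
def quorum_size(total_mons):
    """Smallest number of monitors that forms a strict majority."""
    return total_mons // 2 + 1


def tolerated_failures(total_mons):
    """How many monitors can be down while a quorum is still possible."""
    return total_mons - quorum_size(total_mons)


def has_quorum(total_mons, mons_up):
    """True if the monitors that are up can still form a quorum."""
    return mons_up >= quorum_size(total_mons)


for n in (3, 5):
    print(n, "MONs: quorum", quorum_size(n), "tolerates", tolerated_failures(n))
# 3 MONs: quorum 2 tolerates 1
# 5 MONs: quorum 3 tolerates 2
```

For example, `has_quorum(3, 1)` is `False`, which corresponds to the two-failure case described above.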

  • Does it make sense to keep a "standby" MON service and add it to the cluster in case of a failure?

Not really. In most failure scenarios the failed monitor should be rebuilt, and in an emergency it is probably just as easy to deploy a new one. If this is a major concern, you could simply run 5 active monitors.

This solution is part of Red Hat’s fast-track publication program, providing a huge library of solutions that Red Hat engineers have created while supporting our customers. To give you the knowledge you need the instant it becomes available, these articles may be presented in a raw and unedited form.