How can I debug a service using rg_test in a RHEL 4, 5, or 6 High Availability cluster with rgmanager?
Environment
- Red Hat Cluster Suite 4+
- Red Hat Enterprise Linux (RHEL) 4, 5, or 6 with the High Availability Add On
- rgmanager
Issue
- How can I debug a clustered service for RHEL 4?
- How can I debug a clustered service for RHEL 5?
- How can I debug a clustered service for RHEL 6?
Resolution
The rg_test command tests resource groups that are managed by rgmanager in your cluster. It can perform a "start", "stop", or "status" operation on a clustered service or resource that is defined in a cluster configuration file.
NOTE: rg_test should be used only for testing, or by experienced users who are fully aware of the implications of managing resources outside the control of rgmanager. The tool is most useful for detecting specific service or resource errors that prevent proper operation. It should generally not be used for routine starting and stopping of resources except under emergency conditions.
The rg_test examples below reference the following resource/service definitions from an example /etc/cluster/cluster.conf:
<rm log_facility="local4" log_level="7">
    <failoverdomains/>
    <resources>
        <ip address="192.168.1.201" monitor_link="1"/>
        <fs device="/dev/vdc1" force_fsck="0" force_unmount="0" fsid="5345" fstype="ext3" mountpoint="/mnt/ext3vol1" name="ext3fs" options="" self_fence="0"/>
    </resources>
    <service autostart="0" exclusive="0" name="demo1">
        <ip ref="192.168.1.201">
            <fs ref="ext3fs"/>
        </ip>
    </service>
</rm>
The test command will display all the resources in the configuration file specified in the rg_test command:
# rg_test test /etc/cluster/cluster.conf
Below are examples of using rg_test for starting, stopping, and running status on a clustered service or resource.
Note: Before using rg_test, always ensure that the service in question is not running anywhere in the cluster. Otherwise there is a high risk of corrupting shared storage or causing other conflicts. Use clusvcadm to disable the service first:
# clusvcadm -d demo1
Check the status of the service demo1. Because rg_test manages resources outside the control of rgmanager, clustat should always show demo1 as disabled while you run these commands, even if rg_test has started the service successfully and it is currently running:
# clustat -s demo1
 Service Name          Owner (Last)          State
 ------- ----          ----- ------          -----
 service:demo1         (none)                disabled
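The check above can be scripted so that rg_test is not run by accident against a service that rgmanager still considers active. The following is only a minimal sketch and not part of rgmanager: the function name service_is_disabled is hypothetical, and it assumes the clustat output layout shown above, where the service line starts with service:<name> and the state is the last column.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with rgmanager).
# Reads clustat output on stdin and succeeds only if the named service
# is listed and its last column (the state) is "disabled".
service_is_disabled() {
    awk -v want="service:$1" '
        $1 == want { found = 1; state = $NF }
        END {
            if (found && state == "disabled") exit 0
            exit 1
        }
    '
}

# Example usage on a cluster node (left as a comment, not executed here):
#   clustat -s demo1 | service_is_disabled demo1 \
#       || echo "demo1 is not disabled; do not run rg_test" >&2
```

The function returns a failure both when the service is in another state and when it is not listed at all, so a typo in the service name is not mistaken for "safe to proceed".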
Service level tests
Start the service demo1. This will produce a lot of verbose logging to the console.
# rg_test test /etc/cluster/cluster.conf start service demo1
Running in test mode.
Starting demo1...
<debug> Link for eth1: Detected
<info> Adding IPv4 address 192.168.1.201/24 to eth1
<debug> Pinging addr 192.168.1.201 from dev eth1
<debug> Sending gratuitous ARP: 192.168.1.201 52:54:00:b8:f2:48 brd ff:ff:ff:ff:ff:ff
<info> mounting /dev/vdc1 on /mnt/ext3vol1
<debug> mount -t ext3 /dev/vdc1 /mnt/ext3vol1
<info> quotaopts =
Start of demo1 complete
Check the status of the service demo1. This will produce a lot of verbose logging to the console.
# rg_test test /etc/cluster/cluster.conf status service demo1
Running in test mode.
Checking status of demo1...
<debug> Checking 192.168.1.201, Level 10
<debug> 192.168.1.201 present on eth1
<debug> Link for eth1: Detected
<debug> Link detected on eth1
<debug> Local ping to 192.168.1.201 succeeded
Status of demo1 is good
Stop the service demo1. This will produce a lot of verbose logging to the console.
# rg_test test /etc/cluster/cluster.conf stop service demo1
Running in test mode.
Stopping demo1...
<info> unmounting /mnt/ext3vol1
<info> Removing IPv4 address 192.168.1.201/24 from eth1
Stop of demo1 complete
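Because each step prints a lot of output, it can help to run the three service level commands above in sequence and keep the console output of each step in its own file. The following is a rough sketch only: the function name rg_test_cycle, the log file names, and the RG_TEST and CONF variables are hypothetical, and the sketch assumes that rg_test exits with a non-zero status when an operation fails.

```shell
#!/bin/sh
# Hypothetical wrapper (not shipped with rgmanager).
# RG_TEST and CONF may be overridden; the defaults match the examples above.
RG_TEST="${RG_TEST:-rg_test}"
CONF="${CONF:-/etc/cluster/cluster.conf}"

# Runs start, status, and stop for one service, in that order, and writes
# the console output of each step to <logdir>/rg_test-<service>-<op>.log.
# Assumption: rg_test returns non-zero when the operation fails.
rg_test_cycle() {
    svc="$1"
    logdir="${2:-.}"
    for op in start status stop; do
        log="$logdir/rg_test-$svc-$op.log"
        if ! "$RG_TEST" test "$CONF" "$op" service "$svc" >"$log" 2>&1; then
            echo "$op of $svc failed, see $log" >&2
            if [ "$op" != stop ]; then
                # Best-effort stop so that no resources are left running.
                "$RG_TEST" test "$CONF" stop service "$svc" >>"$log" 2>&1 || true
            fi
            return 1
        fi
    done
    echo "start, status, and stop of $svc succeeded"
}

# Example usage on a cluster node (left as a comment, not executed here):
#   rg_test_cycle demo1 /tmp
```

The service must still be disabled with clusvcadm before the wrapper is run; the sketch does not check this itself.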
Resource level tests
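The same actions (start, stop, and status) can be performed on an individual resource. Specify the resource type, such as ip or fs, in place of service, followed by the resource name.
To start the ip resource 192.168.1.201 (as the output shows, its child fs resource ext3fs is mounted as well):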
# rg_test test /etc/cluster/cluster.conf start ip 192.168.1.201
Running in test mode.
Starting 192.168.1.201...
<debug> Link for eth1: Detected
<info> Adding IPv4 address 192.168.1.201/24 to eth1
<debug> Pinging addr 192.168.1.201 from dev eth1
<debug> Sending gratuitous ARP: 192.168.1.201 52:54:00:b8:f2:48 brd ff:ff:ff:ff:ff:ff
<info> mounting /dev/vdc1 on /mnt/ext3vol1
<debug> mount -t ext3 /dev/vdc1 /mnt/ext3vol1
<info> quotaopts =
Start of 192.168.1.201 complete
Check the status of the ip resource 192.168.1.201:
# rg_test test /etc/cluster/cluster.conf status ip 192.168.1.201
Running in test mode.
Checking status of 192.168.1.201...
<debug> Checking 192.168.1.201, Level 10
<debug> 192.168.1.201 present on eth1
<debug> Link for eth1: Detected
<debug> Link detected on eth1
<debug> Local ping to 192.168.1.201 succeeded
Status of 192.168.1.201 is good
Stop the ip resource 192.168.1.201:
# rg_test test /etc/cluster/cluster.conf stop ip 192.168.1.201
Running in test mode.
Stopping 192.168.1.201...
<info> unmounting /mnt/ext3vol1
<info> Removing IPv4 address 192.168.1.201/24 from eth1
Stop of 192.168.1.201 complete
Check the status of the fs resource ext3fs when the status check returns success:
# rg_test test /etc/cluster/cluster.conf status fs ext3fs
Running in test mode.
Checking status of ext3fs...
Status of ext3fs is good
Check the status of the fs resource ext3fs when the status check returns failure:
# rg_test test /etc/cluster/cluster.conf status fs ext3fs
Running in test mode.
Checking status of ext3fs...
<err> fs:ext3fs: /dev/vdc1 is not mounted on /mnt/ext3vol1
Status check of ext3fs failed
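When the output is long, the failure is easier to spot if only the error lines are shown. The following is a small sketch, assuming that rg_test output has been captured or piped and that error lines begin with the <err> tag as in the example above. The function name rg_test_errors is hypothetical, and matching <warning> as well is an assumption, since no warning line appears in the examples here.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with rgmanager).
# Reads rg_test output on stdin and prints only lines that start with
# <err> or <warning>. Returns non-zero if no such line is found.
# Assumption: the <warning> tag is used in the same way as <err>.
rg_test_errors() {
    grep -E '^<(err|warning)>'
}

# Example usage on a cluster node (left as a comment, not executed here):
#   rg_test test /etc/cluster/cluster.conf status fs ext3fs 2>&1 | rg_test_errors
```

For the failing status check above, this would print only the line that reports /dev/vdc1 as not mounted on /mnt/ext3vol1.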
rg_test can also be useful when doing maintenance on a clustered service. For an example, see the related article "How to stop single resources of a cluster-service for maintenance?"
Note: When testing with rg_test is complete, always disable the service one more time to ensure that all of its resources are freed. Otherwise, a subsequent start of the service on another node may result in corruption of shared storage or other conflicts. Use clusvcadm to disable the service:
# clusvcadm -d demo1