Remote Execution job on Red Hat Satellite with large batch of hosts fails with Foreman::Exception: The capsule task failed.

Solution Verified - Updated

Environment

  • Red Hat Satellite 6.x

Issue

  • Remote Execution jobs from Red Hat Satellite 6 fail on certain clients with the following error when the batch size is large (greater than 250):

    Foreman::Exception
    [Foreman::Exception]: The capsule task failed. 
    

Resolution

  • Increase the Capsule request timeout setting on the Satellite:

    Satellite WebUI: Administer ▶ Settings ▶ General ▶ Capsule request timeout (default value: 60 seconds)
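
For scripted environments, the same change might be made through the Satellite API instead of the WebUI. The following is a minimal sketch, not a verified procedure. It assumes that the setting's internal name is `proxy_request_timeout` and that the settings endpoint accepts that name in place of a numeric ID. The hostname, credentials, and the value of 180 are hypothetical. The function only builds the request; sending it is left as a commented-out step.

```python
import base64
import json
import urllib.request

# Assumption: internal name of the "Capsule request timeout" setting.
SETTING_NAME = "proxy_request_timeout"


def build_timeout_request(base_url, user, password, seconds):
    """Build (but do not send) a PUT request that updates the timeout setting."""
    if seconds <= 0:
        raise ValueError("timeout must be a positive number of seconds")
    # Assumption: the settings API accepts the setting name as the resource id.
    url = base_url.rstrip("/") + "/api/settings/" + SETTING_NAME
    body = json.dumps({"setting": {"value": seconds}}).encode("utf-8")
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        url,
        data=body,
        method="PUT",
        headers={
            "Content-Type": "application/json",
            "Authorization": "Basic " + token,
        },
    )


if __name__ == "__main__":
    # Hypothetical host and credentials, for illustration only.
    req = build_timeout_request(
        "https://satellite.example.com", "admin", "changeme", 180
    )
    print(req.get_method(), req.full_url)
    # To apply the change for real: urllib.request.urlopen(req)
```

Whichever method is used, confirm the new value in the WebUI afterwards and rerun the job against the same batch of hosts.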

For more KB articles and solutions related to Red Hat Satellite 6.x Remote Execution issues, refer to the Red Hat Satellite Consolidated Troubleshooting Article for Red Hat Satellite 6.x Remote Execution Issues.

Root Cause

  • When a remote execution job is initiated, Satellite waits a limited time for a response from the client. That limit is set by the Capsule request timeout setting. If a host in the batch does not respond within the limit, the task times out.

Diagnostic Steps

  • Task Export reports the task with the following error:

    Failed to initialize: Foreman::Exception - ERF42-2156 [Foreman::Exception]: ERF42-8892 [Satellite::Exception]: The capsule task 5cf62518-d953-4574-a0c2-3251fb06e08a failed.
    Foreman::Exception: ERF42-8892 [Foreman::Exception]: The capsule task 5cf62518-d953-4574-a0c2-3251fb06e08a failed.
    
  • Timeout errors are reported in the Satellite smart proxy log:

    [2019-09-22 23:00:59.550 #1821] ERROR --  action: Script execution failed
    10.166.32.151 - - [22/Sep/2019:23:04:23 CEST] "POST /tasks/ HTTP/1.1" 200 50
    [2019-09-22 23:04:24.621 #1821] ERROR -- dynflow: error while initalizing command Net::SSH::ConnectionTimeout
    Net::SSH::ConnectionTimeout:
    [2019-09-22 23:04:27.611 #1821] ERROR --  action: Script execution failed
    [2019-09-22 23:15:29.235 #1821] ERROR --  action: Timed out reading data from server (RestClient::Exceptions::ReadTimeout)
    [2019-09-22 23:15:53.331 #1821] ERROR --  action: Timed out connecting to server (RestClient::Exceptions::OpenTimeout)
    [2019-09-22 23:16:36.957 #1821] ERROR --  action: Timed out connecting to server (RestClient::Exceptions::OpenTimeout)
    [2019-09-22 23:16:37.196 #1821] ERROR --  action: Timed out connecting to server (RestClient::Exceptions::OpenTimeout)
    [2019-09-22 23:16:57.221 #1821] ERROR --  action: Timed out connecting to server (RestClient::Exceptions::OpenTimeout)
    [2019-09-22 23:16:57.362 #1821] ERROR --  action: Timed out connecting to server (RestClient::Exceptions::OpenTimeout)
    
  • Timeout errors are reported in the Satellite production log as well:

    10.166.32.150 - - [22/Sep/2019:23:01:18 CEST] "POST /tasks/ HTTP/1.1" 200 50
    [2019-09-22 23:02:56.180 #129150] ERROR --  action: Script execution failed
    [2019-09-22 23:02:56.180 #129150] ERROR --  action: Script execution failed
    [2019-09-22 23:02:56.180 #129150] ERROR --  action: Script execution failed
    10.166.32.150 - - [22/Sep/2019:23:07:50 CEST] "POST /tasks/ HTTP/1.1" 200 50
    [2019-09-22 23:11:50.786 #129150] ERROR --  action: Script execution failed
    [2019-09-22 23:15:25.620 #129150] ERROR --  action: Timed out reading data from server (RestClient::Exceptions::ReadTimeout)
    [2019-09-22 23:15:28.614 #129150] ERROR --  action: Timed out reading data from server (RestClient::Exceptions::ReadTimeout)
    [2019-09-22 23:15:29.369 #129150] ERROR --  action: Timed out reading data from server (RestClient::Exceptions::ReadTimeout)
    [2019-09-22 23:15:31.435 #129150] ERROR --  action: Timed out reading data from server (RestClient::Exceptions::ReadTimeout)
    [2019-09-22 23:15:46.419 #129150] ERROR --  action: Timed out reading data from server (RestClient::Exceptions::ReadTimeout)
    [2019-09-22 23:16:44.075 #129150] ERROR --  action: Timed out connecting to server (RestClient::Exceptions::OpenTimeout)
    [2019-09-22 23:16:57.310 #129150] ERROR --  action: Timed out connecting to server (RestClient::Exceptions::OpenTimeout)
    10.166.32.150 - - [22/Sep/2019:23:24:14 CEST] "POST /tasks/status HTTP/1.1" 200 931
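
To judge how widespread the timeouts are, the ERROR lines in either log can be tallied by exception class. The following is a minimal sketch, assuming the log lines follow the format shown in the excerpts above; the function name and the embedded sample lines (copied from those excerpts) are illustrative only.

```python
import re
from collections import Counter

# Matches namespaced Ruby exception classes ending in "Timeout", for example
# RestClient::Exceptions::ReadTimeout or Net::SSH::ConnectionTimeout.
TIMEOUT_CLASS = re.compile(r"(?:\w+::)+\w*Timeout\b")

# Sample lines copied from the log excerpts in this article.
SAMPLE = [
    "[2019-09-22 23:04:24.621 #1821] ERROR -- dynflow: error while initalizing command Net::SSH::ConnectionTimeout",
    "Net::SSH::ConnectionTimeout:",
    "[2019-09-22 23:04:27.611 #1821] ERROR --  action: Script execution failed",
    "[2019-09-22 23:15:29.235 #1821] ERROR --  action: Timed out reading data from server (RestClient::Exceptions::ReadTimeout)",
    "[2019-09-22 23:15:53.331 #1821] ERROR --  action: Timed out connecting to server (RestClient::Exceptions::OpenTimeout)",
    "[2019-09-22 23:16:36.957 #1821] ERROR --  action: Timed out connecting to server (RestClient::Exceptions::OpenTimeout)",
]


def count_timeouts(lines):
    """Count timeout exception classes that appear on ERROR log lines."""
    counts = Counter()
    for line in lines:
        # Skip access-log lines and continuation lines without a log level.
        if "ERROR" not in line:
            continue
        match = TIMEOUT_CLASS.search(line)
        if match:
            counts[match.group(0)] += 1
    return counts


if __name__ == "__main__":
    for name, total in count_timeouts(SAMPLE).most_common():
        print(f"{total:4d}  {name}")
```

To run it against a real log, pass an open file object to `count_timeouts` in place of `SAMPLE`. A high count of `ReadTimeout` and `OpenTimeout` entries clustered around the time of the failed job is consistent with the timeout described in the Root Cause section.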
    

This solution is part of Red Hat’s fast-track publication program, providing a huge library of solutions that Red Hat engineers have created while supporting our customers. To give you the knowledge you need the instant it becomes available, these articles may be presented in a raw and unedited form.