Remote execution jobs are failing with error: SocketError getaddrinfo: Name or service not known on Satellite server.

Solution Verified - Updated

Environment

  • Red Hat Satellite 6

Issue

  • Remote execution fails with error: SocketError getaddrinfo: Name or service not known.

  • In the Dynflow console of the Satellite web UI, the following error is observed:

    metadata:
      timeout: '2017-05-19 16:06:16 +0400'
      proxy_task_id: 8648573526-c892-42da-a228-d5e20938577a
    proxy_output:
      result:
      - output_type: debug
        output: |-
          Error initializing command #
          SocketError getaddrinfo: Name or service not known
        timestamp: 1502712316.9726257
      exit_status: EXCEPTION
    
  • Remote job fails with error message:

     1: Error initializing command #
     2: SocketError getaddrinfo: Name or service not known
     3: Exit status: EXCEPTION
    
  • REX jobs failing with error:

       PLAY [all] *********************************************************************
    
       TASK [Gathering Facts] *********************************************************
       fatal: [client.redhat.com]: UNREACHABLE! => {"changed": false, "msg": "Failed to connect to the host via ssh: ssh: Could not resolve hostname client.redhat.com: Name or service not known", "unreachable": true}
           to retry, use: --limit @/tmp/foreman-playbook-79716c3a-378a-4a96-b4b0-0fa9cb6624c8.retry
    
       PLAY RECAP *********************************************************************
       client.redhat.com    : ok=0    changed=0    unreachable=1    failed=0   
    
       Exit status: 4
    

Resolution

Solution 1:

  • Ensure that root login over SSH is enabled on the target host:

       # grep PermitRootLogin /etc/ssh/sshd_config 
       PermitRootLogin yes
    
  • On the Red Hat Satellite command line, remove the contents of .ssh/authorized_keys and then run the following command:

     # ssh-copy-id -i ~foreman-proxy/.ssh/id_rsa_foreman_proxy.pub root@target.example.com
    
  • On the Content Host, check if SELinux is in Enforcing mode:

     # sestatus
    
  • If it is, restore the default SELinux contexts on the ~/.ssh directory:

     # restorecon -Rv ~/.ssh
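
A plain grep also matches commented-out lines, so the root login check above can be misleading. The following is a minimal sketch, not part of the original solution, of a stricter check. The function name permit_root_login is hypothetical. It reads a single sshd_config-style file and ignores Include directives and Match blocks.

```shell
# Hypothetical helper: print the first uncommented PermitRootLogin value
# found in the given file, or "unset" if the directive is not set there.
# Assumption: one flat config file; Include files and Match blocks are ignored.
permit_root_login() {
    awk '
        tolower($1) == "permitrootlogin" { print $2; found = 1; exit }
        END { if (!found) print "unset" }
    ' "${1:-/etc/ssh/sshd_config}"
}

# Example invocation on the target host (run as root):
#   permit_root_login /etc/ssh/sshd_config
```

When the configuration is split across several files, running sshd -T as root on the target host prints the effective configuration and is likely the more reliable check.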
    

Solution 2:

If the issue persists, it could be that the Red Hat Satellite cannot communicate with the Content Host.

  • The DNS Server should be present in /etc/resolv.conf on the Satellite:

    # Generated by NetworkManager
    nameserver XX.XX.XX.XX
    
  • Ensure that the Red Hat Satellite Server can resolve the hostname of the Content Host.

  • Verify that the remote host on which the REX jobs run has a subnet (and the correct one) associated with it.

    Red Hat Satellite WebUI > Hosts > All Hosts > client.redhat.com > Edit > Interfaces > Select the desired Interface > Edit > Verify/Add the correct IPv4 subnet > Ok > Submit.
    
  • Verify the following setting in the Satellite web UI:

      Red Hat Satellite WebUI > Administer > Settings > RemoteExecution > Fallback to Any Capsule > Yes.
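
The DNS checks in this solution can be scripted on the Satellite. The following is a minimal sketch, assuming a glibc-based system where getent is available. The function names list_nameservers and resolves are hypothetical.

```shell
# Hypothetical helper: print the nameserver addresses listed in a
# resolv.conf-style file (defaults to /etc/resolv.conf).
list_nameservers() {
    awk '$1 == "nameserver" { print $2 }' "${1:-/etc/resolv.conf}"
}

# Hypothetical helper: return 0 if the name resolves through the system
# resolver (sources and order are taken from /etc/nsswitch.conf).
resolves() {
    getent hosts "$1" > /dev/null
}

# Example invocation on the Satellite:
#   list_nameservers
#   resolves client.redhat.com && echo "resolves" || echo "does not resolve"
```

If list_nameservers prints nothing, no DNS server is configured. If resolves fails for the Content Host, remote execution from that server is expected to fail with the getaddrinfo error shown in the Issue section.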
    

Root Cause

  • Root login is disabled on the target host.
  • SELinux is in Enforcing mode and the ~/.ssh directory on the Content Host has incorrect contexts.
  • The Red Hat Satellite cannot resolve the hostname of the target host.
  • No subnet, or an incorrect subnet, is associated with the remote host.
  • The REX job is executed through an arbitrary Capsule server that cannot resolve the client's hostname, either through its /etc/hosts file or through DNS.
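
The last root cause can be checked on the Capsule server itself. The following is a minimal sketch, with a hypothetical function name in_hosts_file, that tests whether a hostname has an uncommented entry in an /etc/hosts-style file. It does not query DNS.

```shell
# Hypothetical helper: return 0 if HOST appears as a name or alias on an
# uncommented line of the given hosts file, 1 otherwise.
# Usage: in_hosts_file HOST [FILE]
in_hosts_file() {
    awk -v h="$1" '
        { sub(/#.*/, "") }    # drop comments; awk re-splits the fields
        {
            for (i = 2; i <= NF; i++)
                if (tolower($i) == tolower(h)) { found = 1; exit }
        }
        END { exit(found ? 0 : 1) }
    ' "${2:-/etc/hosts}"
}

# Example invocation on the Capsule server:
#   in_hosts_file client.redhat.com || echo "no /etc/hosts entry"
```

A missing entry is only a problem if the Capsule's DNS cannot resolve the name either.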

Diagnostic Steps

  • Check if SELinux is in Enforcing mode:

     # sestatus
    
  • Confirm whether the Red Hat Satellite can resolve the hostname of the target host and reach it by IP address:

     # ping -c3 <FQDN of the Host>
     # ping -c3 <IP of the Host>
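
To separate name resolution failures from other SSH problems, saved job output can be scanned for the error strings quoted in the Issue section. The following is a minimal sketch. The function name is_name_resolution_failure and the file name job-output.txt are hypothetical.

```shell
# Hypothetical helper: read job output on stdin and return 0 if it contains
# one of the name-resolution error strings shown in the Issue section.
is_name_resolution_failure() {
    grep -Eq 'Name or service not known|Could not resolve hostname'
}

# Example invocation, assuming the job output was saved to job-output.txt:
#   is_name_resolution_failure < job-output.txt && echo "check DNS and /etc/hosts"
```

If ping by IP address succeeds while this check matches, the host is reachable and the problem is most likely limited to name resolution.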
    
