[Satellite6] all pulp tasks are failing, wsgi:pulp process segfaults / segfault at 10 ip ... error 4 in libpython2.7.so.1.0

Environment

  • Red Hat Satellite or Proxy 6

Issue

  • all pulp tasks are failing with 500 Internal Server Error
  • system logs show that the wsgi:pulp process is segfaulting
  • restarting the httpd service does not help; the wsgi script segfaults again right after the restart

Resolution

Workaround

This workaround, deleting and recreating the queue, is not recommended unless you understand its consequences (the statuses of all deleted pulp tasks are lost):

qpid-config del queue pulp.task --ssl-certificate=/etc/pki/pulp/qpid/client.crt -b amqps://localhost:5671 --force
qpid-config add queue pulp.task --ssl-certificate=/etc/pki/pulp/qpid/client.crt -b amqps://localhost:5671 --durable --argument exclusive=False
foreman-maintain service restart

Final resolution

When on Sat6.1, wait until the underlying bugzilla is fixed, or upgrade to Satellite 6.2 once available.

When on Sat6.3 or newer, a similar bugzilla has been closed as WONTFIX.

For more KB articles/solutions related to Red Hat Satellite 6.x Pulp 2.0 issues, please refer to the Consolidated Troubleshooting Article for Red Hat Satellite 6.x Pulp 2.0-related Issues.

Root Cause

A client machine uses a locale with non-ASCII characters, so a yum or goferd error message can contain them. This can happen, for example, when installing a package on a system with a French locale and a full disk. goferd then sends a message whose error string contains a non-ASCII character into the pulp.task qpid queue.

When the wsgi:pulp script fetches such a message, it tries to convert it to an (ASCII) string using the Python str method, which raises an exception and leads to the segfault.

Since the message isn't deleted from the queue, it remains there and the wsgi script reads it again just after its start-up.
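
The failing conversion can be illustrated outside Satellite. The following is a minimal sketch, assuming Python 3, where the implicit ASCII encoding that Python 2's str performs on a unicode object is emulated with an explicit encode call. The helper name to_ascii is hypothetical, and the text is the beginning of the error string seen in the backtrace under Diagnostic Steps. It shows the UnicodeEncodeError only; it does not reproduce the segfault.

```python
# Sketch only: Python 3 emulation of the implicit ASCII conversion that
# Python 2's str() applies to a unicode object.
# The text is the start of the error string from the gdb backtrace;
# \xa0 is a non-breaking space, which French typography puts before ':'.
message = "Erreurs de la transaction de test\xa0:"


def to_ascii(text):
    """Hypothetical helper: return (encoded_bytes, error), one of them None."""
    try:
        return text.encode("ascii"), None
    except UnicodeEncodeError as exc:
        return None, exc


encoded, error = to_ascii(message)
if error is not None:
    # error.start and error.end locate the offending character in the text
    bad = error.object[error.start:error.end]
    print("cannot encode %r at %d-%d: %s"
          % (bad, error.start, error.end, error.reason))

# A tolerant conversion that never raises, at the cost of losing the character
safe = message.encode("ascii", errors="replace").decode("ascii")
print(safe)
```

The start and end offsets reported by the exception correspond to the two integers shown inside the UnicodeEncodeError in the backtrace.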

Diagnostic Steps

  • /var/log/messages might log segfaults:
Jul  7 14:30:39 mysatellite kernel: httpd[19377]: segfault at 10 ip 00007f24db9d70ef sp 00007f24cc9ed440 error 4 in libpython2.7.so.1.0[7f24db976000+179000]
  • no (wsgi:pulp) -DFOREGROUND process is running, or its PID changes frequently

  • /var/log/httpd/error_log full of:

[Mon Apr 02 09:14:40.627315 2018] [core:notice] [pid 9651] AH00052: child pid 9680 exit signal Segmentation fault (11)
  • the pulp.task qpid queue holds an unprocessed message (the first numerical column is the queue depth):
# qpid-stat -q --ssl-certificate=/etc/pki/katello/qpid_client_striped.crt -b amqps://localhost:5671 | grep pulp.task
  pulp.task                                                                            Y                      1     1      0    2.02k  2.02k       0         0     1
#
  • gdb backtrace of the crashed wsgi:pulp process, with UnicodeEncodeError_str as the top frame:
(gdb) bt
#0  0x00007f24db9d70ef in UnicodeEncodeError_str (
    self=exceptions.UnicodeEncodeError('ascii', u'Erreurs de la transaction de test\xa0:   installing package kernel-2.6.32-573.22.1.el6.x86_64 needs 5MB on the /boot filesystem\n', 33, 34, 'ordinal not in range(128)'))
    at /usr/src/debug/Python-2.7.5/Objects/exceptions.c:1660
#1  0x00007f24db9fb61a in _PyObject_Str (
    v=exceptions.UnicodeEncodeError('ascii', u'Erreurs de la transaction de test\xa0:   installing package kernel-2.6.32-573.22.1.el6.x86_64 needs 5MB on the /boot filesystem\n', 33, 34, 'ordinal not in range(128)')) at /usr/src/debug/Python-2.7.5/Objects/object.c:430
#2  0x00007f24db9fb6ea in PyObject_Str (v=<optimized out>) at /usr/src/debug/Python-2.7.5/Objects/object.c:451
#3  0x00007f24dba0e720 in string_new (type=0x7f24dbd0aae0 <PyString_Type>, args=<optimized out>, kwds=<optimized out>)
    at /usr/src/debug/Python-2.7.5/Objects/stringobject.c:3707
#4  0x00007f24dba15e53 in type_call (type=0x7f24dbd0aae0 <PyString_Type>, 
    args=(exceptions.UnicodeEncodeError('ascii', u'Erreurs de la transaction de test\xa0:   installing package kernel-2.6.32-573.22.1.el6.x86_64 needs 5MB on the /boot filesystem\n', 33, 34, 'ordinal not in range(128)'),), kwds=0x0)
    at /usr/src/debug/Python-2.7.5/Objects/typeobject.c:729
#5  0x00007f24db9c00c3 in PyObject_Call (func=func@entry=<type at remote 0x7f24dbd0aae0>, 
    arg=arg@entry=(exceptions.UnicodeEncodeError('ascii', u'Erreurs de la transaction de test\xa0:   installing package kernel-2.6.32-573.22.1.el6.x86_64 needs 5MB on the /boot filesystem\n', 33, 34, 'ordinal not in range(128)'),), kw=kw@entry=0x0)
    at /usr/src/debug/Python-2.7.5/Objects/abstract.c:2529
#6  0x00007f24dba5438c in do_call (nk=<optimized out>, na=1, pp_stack=0x7f24cc9ed690, func=<type at remote 0x7f24dbd0aae0>)
    at /usr/src/debug/Python-2.7.5/Python/ceval.c:4316
#7  call_function (oparg=<optimized out>, pp_stack=0x7f24cc9ed690) at /usr/src/debug/Python-2.7.5/Python/ceval.c:4121
#8  PyEval_EvalFrameEx (
    f=f@entry=Frame 0x7f24c000cb00, for file /usr/lib/python2.7/site-packages/gofer/rmi/async.py, line 240, in __str__ (self=<Failed(origin=None, exval=exceptions.UnicodeEncodeError('ascii', u'Erreurs de la transaction de test\xa0:   installing package
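
The log and queue checks above could be scripted roughly as follows. This is a sketch, not a supported tool: the function names find_segfaults and queue_depth are hypothetical, the regular expression assumes the kernel message layout of the sample /var/log/messages line, and the queue depth is taken as the first numerical column, as described above.

```python
import re

# Matches the kernel segfault line layout from the /var/log/messages sample
SEGFAULT_RE = re.compile(
    r"kernel: (?P<proc>[^\[\s]+)\[(?P<pid>\d+)\]: segfault at [0-9a-f]+ "
    r"ip [0-9a-f]+ sp [0-9a-f]+ error \d+ in (?P<lib>[^\[\s]+)"
)


def find_segfaults(lines, library="libpython2.7"):
    """Return (process, pid, library) for each segfault inside `library`."""
    hits = []
    for line in lines:
        match = SEGFAULT_RE.search(line)
        if match and library in match.group("lib"):
            hits.append((match.group("proc"),
                         int(match.group("pid")),
                         match.group("lib")))
    return hits


def queue_depth(qpid_stat_row):
    """Return the first purely numerical column after the queue name."""
    for field in qpid_stat_row.split()[1:]:
        if field.isdigit():
            return int(field)
    return None


# Sample inputs copied from the diagnostic output above
log_line = ("Jul  7 14:30:39 mysatellite kernel: httpd[19377]: segfault at 10 "
            "ip 00007f24db9d70ef sp 00007f24cc9ed440 error 4 in "
            "libpython2.7.so.1.0[7f24db976000+179000]")
stat_row = "  pulp.task   Y   1   1   0   2.02k   2.02k   0   0   1"

segfaults = find_segfaults([log_line])
depth = queue_depth(stat_row)
print(segfaults, depth)
```

In practice the lines would come from reading /var/log/messages and from the output of the qpid-stat command shown above; a nonzero depth that persists across httpd restarts, together with repeated segfaults, matches the symptoms of this issue.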