Pods return a 502 or 400 error when accessed via the application route after upgrading the RHOCP cluster to version 4.14
Environment
- Red Hat OpenShift Container Platform (RHOCP)
- 4.12
- 4.13
- 4.14
- HAProxy 2.6
Issue
- After upgrading the RHOCP cluster to version 4.14, external access through the route is lost for pods that serve duplicated Transfer-Encoding headers.
- On OpenShift 4.12 or 4.13, the alert `DuplicateTransferEncodingHeadersDetected` (severity: Warning) is firing.
- What is the `haproxy_backend_duplicate_te_header_total` metric in Prometheus?
Resolution
Duplicated Transfer-Encoding headers
This problem is not an issue with Red Hat OpenShift Container Platform or its networking components; it is caused by application pods serving duplicated headers (see also the Root Cause section for additional information):
- In RHOCP 4.14, routers use HAProxy version 2.6. In RHOCP 4.13 and earlier versions, the routers use HAProxy 2.2.

- From HAProxy version 2.5 onwards, a check is performed to verify that applications do not send duplicated headers.

- Look for the duplicated header in the application configuration file and remove it to avoid a `502 Bad Gateway` or `400 Bad Request` error. For example, in an `httpd` application, go to `/etc/httpd/conf` and check whether `httpd.conf` contains the duplicated headers:

  ```
  ...
  <Directory />
      Header set Transfer-Encoding chunked
      Header add Transfer-Encoding chunked
  </Directory>
  ...
  ```

- See also this solution on duplicated transfer-encoding headers in JBoss EAP and how to mitigate them at the application layer.
- Bug OCPBUGS-40850 has been opened to provide detection and mitigation efforts that should identify duplicated headers. Ultimately, correcting the header duplication is the responsibility of the application developers, but Red Hat is working on options to make it easier to identify which applications need to be addressed. Expect updates in this KCS to follow.
Steps must be taken at the application layer to identify and mitigate this behavior; they depend largely on the architecture of the custom deployments.
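The rule that the newer HAProxy enforces can be sketched as a small validation function. This is an illustrative sketch of the RFC 7230 transfer-coding rules only, not HAProxy's actual code; the function name and return convention are made up for this example:

```python
def validate_transfer_encoding(values):
    """Check a list of Transfer-Encoding header occurrences against RFC 7230.

    `values` is the list of header values as sent, e.g. ["gzip", "chunked"].
    Comma-separated codings within one header line are split first.
    Returns (ok, reason).
    """
    # Flatten "gzip, chunked" style values into individual codings.
    codings = [c.strip().lower() for v in values for c in v.split(",") if c.strip()]
    if codings.count("chunked") > 1:
        return False, "chunked applied more than once"
    if "chunked" in codings and codings[-1] != "chunked":
        return False, "chunked must be the final transfer coding"
    return True, "ok"

print(validate_transfer_encoding(["chunked", "chunked"]))  # duplicated -> rejected
print(validate_transfer_encoding(["gzip", "chunked"]))     # valid order
print(validate_transfer_encoding(["chunked", "gzip"]))     # chunked not last -> rejected
```

An application sending the header twice, as in the `httpd.conf` example above, corresponds to the first (rejected) case.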
Detection and Mitigation
As of OpenShift 4.13.53 and 4.12.68, the OpenShift router-default pod image exposes a new metric, `haproxy_backend_duplicate_te_header_total`: a counter that increases from 0 every time a route experiencing this duplicated-header issue is called:
- After upgrading a cluster to 4.13.53+ or 4.12.68+, the following Prometheus query can be used to review the counter values by namespace. The expected value is `0` across all routes. Any route with an increasing counter indicates a configuration issue on the application side that must be addressed by the application owner:

  ```
  sum by (route) (haproxy_backend_duplicate_te_header_total{exported_namespace="%s"})
  ```

- After fixing issues with custom applications to ensure they are not sending duplicate `transfer-encoding: chunked` headers (see the Root Cause section for additional information), the counter should stop growing for those routes, which indicates successful mitigation. Allow about 5 minutes before checking again to ensure the data is current.
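As a sketch of how the query result could be post-processed, the following assumes the standard Prometheus HTTP query API response shape (`data.result` entries carrying a `metric` label set and a `[timestamp, value]` pair); the sample payload and route names are hypothetical:

```python
def routes_with_duplicate_te(prom_response):
    """Return {route: count} for routes whose duplicate-TE counter is above zero.

    Expects a decoded JSON response from the Prometheus query API for:
      sum by (route) (haproxy_backend_duplicate_te_header_total{...})
    """
    offenders = {}
    for entry in prom_response.get("data", {}).get("result", []):
        route = entry["metric"].get("route", "<unknown>")
        _, value = entry["value"]  # [unix_timestamp, "stringified number"]
        if float(value) > 0:
            offenders[route] = float(value)
    return offenders

# Hypothetical API payload: one healthy route, one misconfigured one.
sample = {
    "status": "success",
    "data": {"result": [
        {"metric": {"route": "httpd-ok"},  "value": [1723833910, "0"]},
        {"metric": {"route": "httpd-bad"}, "value": [1723833910, "42"]},
    ]},
}
print(routes_with_duplicate_te(sample))  # only the misconfigured route is reported
```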
DuplicateTransferEncodingHeadersDetected alert is observed
DuplicateTransferEncodingHeadersDetected (severity: Warning)
- This alert will fire in OpenShift 4.13.53+ clusters if any route records a duplicate `Transfer-Encoding` header issue, and it must be silenced after mitigating the originating application(s) (read below to learn more). This is a proactive alert to warn application teams that their apps will be unreachable by external clients using routes unless steps are taken to fix the duplicated `transfer-encoding: chunked` headers in their application responses.

- This alert, once fired, will be cleared only after upgrading to 4.14+.
- To clear the alert prior to upgrade, it must be silenced even after the application team has mitigated the issue, because it monitors the `haproxy_backend_duplicate_te_header_total` metric, which is of type `counter` (a counter is not reset or flushed; its value only stops increasing once the offending application is fixed). However, restarting the router-default pods may remove the alert when the new pods come online:

  ```
  $ oc -n openshift-ingress rollout restart deployment router-default
  ```

- To silence the alert after identifying and resolving the application issues, refer to the silencing alerts documentation.
- To understand more about this problem, refer to the Root Cause section.
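For teams that automate silencing, a silence can be created through Alertmanager's v2 API. The sketch below only builds the JSON body for `POST /api/v2/silences` following the documented payload shape (matchers, start/end timestamps, author, comment); the author name, duration, and comment are placeholders, and sending the request to your cluster's Alertmanager endpoint is left out:

```python
from datetime import datetime, timedelta, timezone

def build_silence(alertname, hours, author, comment):
    """Build a JSON body for Alertmanager's POST /api/v2/silences endpoint.

    The field names follow the Alertmanager v2 silences API; the values
    passed in by the caller are illustrative placeholders.
    """
    now = datetime.now(timezone.utc)
    return {
        "matchers": [
            {"name": "alertname", "value": alertname,
             "isRegex": False, "isEqual": True},
        ],
        "startsAt": now.isoformat(),
        "endsAt": (now + timedelta(hours=hours)).isoformat(),
        "createdBy": author,
        "comment": comment,
    }

body = build_silence(
    "DuplicateTransferEncodingHeadersDetected",
    hours=72,
    author="app-team",
    comment="Duplicated Transfer-Encoding headers fixed; counter no longer increasing",
)
print(body["matchers"][0]["value"])
```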
Backporting this detection feature to earlier versions of OpenShift
It is possible to **temporarily** backport this change into an earlier version of OpenShift for **TESTING ONLY**. Note that this procedure is **NOT SUPPORTED** and must be done with caution or the involvement of Red Hat Support. The steps are provided as a method for proactive testing/detection and troubleshooting only and are not designed for long-term implementation or production platforms:
- Pull the image tag for the latest version of the router-default pods for 4.13.53+:

  ```
  $ VERSION=$(oc adm release info 4.13.53 --image-for haproxy-router)
  $ echo $VERSION
  ```

- Scale the following operators to 0:

  ```
  $ oc scale --replicas 0 -n openshift-cluster-version deployments/cluster-version-operator
  $ oc scale --replicas 0 -n openshift-ingress-operator deployments ingress-operator
  ```

- Replace the image version for the router-default pods (or target shard pods):

  ```
  $ oc -n openshift-ingress set image deployment/router-default router=${VERSION}
  ```

- Observe that the router-default pods restart with the new image:

  ```
  $ watch oc get pod -n openshift-ingress
  ```

- Run some test curls against all the custom applications (or wait a while so that the metric counters increase for the backends with the new router image), then query Prometheus to see which values are increasing:

  ```
  sum by (route) (haproxy_backend_duplicate_te_header_total{exported_namespace="%s"})
  ```
[OLDER DETECTION METHOD - MORE LIKELY TO PRODUCE FALSE POSITIVES / DEPRECATED, but still supported]:
- It is possible to use the Access Logging feature of the `ingresscontroller` to capture specific headers. Make the following change to the default `ingresscontroller`:

  ```
  $ oc edit ingresscontroller/default -n openshift-ingress-operator

  spec:
    ...
    logging:
      access:
        destination:
          type: Container
        httpCaptureHeaders:
          request:
          - maxLength: 20
            name: transfer-encoding
          response:
          - maxLength: 20
            name: transfer-encoding
  ```
Note that the `name` field is not case-sensitive; this will collect header reports for `Transfer-Encoding` or `transfer-encoding`.
- This captures both `request` headers (client-originated, whether from pods inside the cluster or from external users) and `response` headers (replies from pods). As a result, every route that uses transfer-encoding will appear in the logs, allowing the behavior of those routes/pods to be investigated with a verbose curl: a double-header reply indicates that the pods need to be updated.

- This change will redeploy the router-default pods and attach a sidecar container called `logs` that captures access logs for traffic moving through the `router-default` pods in the `openshift-ingress` namespace. You can read more about access logging configuration here.

- After the router pods come up, wait some time for the routes to be queried and then review the logs. Grep for the values that `transfer-encoding` can have assigned as a header (`chunked|gzip`):

  ```
  ### test curl with duplicate encoding headers injected:
  $ curl -H "transfer-encoding: chunked" -H "transfer-encoding: chunked" --header 'Content-Type: application/json' --data '{"code": 0}' -kv http://httpd-example.example.com

  ### access logs reporting on this curl:
  $ oc logs router-default-b8d658d89-tcbvf -c logs --follow | grep -Ei "chunked|gzip"
  2024-08-16T19:45:10.790117+00:00 worker-1 worker-1.example.com haproxy[30]: 10.xx.xx.59:60354 [16/Aug/2024:19:45:10.786] public be_http:sunbro:httpd-ex/pod:httpd-ex-66b76bfc9b-dr75s:httpd-ex:8080-tcp:10.yy.yy.100:8080 0/0/1/1/3 200 37723 - - --NI 1/1/0/0/0 0/0 {chunked} "POST / HTTP/1.1"
  ```

- Note that only ONE instance of `chunked` is seen in this log output despite it showing up twice in the curl:

  ```
  * Connected to httpd-example.example.com (10.xx.xx.8) port 80
  > POST / HTTP/1.1
  > Host: httpd-example.example.com
  > User-Agent: curl/8.6.0
  > Accept: */*
  > transfer-encoding: chunked
  > transfer-encoding: chunked
  > Content-Type: application/json
  >
  < HTTP/1.1 200 OK
  ```

- Review the log output for any instances of `chunked` or `gzip` to identify the application pods, routes, and client IPs that may be shipping `transfer-encoding` headers to custom applications; these then need to be validated by the app teams.

- This change will not tell which of the custom applications may be shipping a duplicated header, just that these applications may encounter an issue post-upgrade if the header is injected more than once.
- Further analysis of this traffic with tcpdump at the destination pod layer is needed to confirm whether two headers are arriving, but this may be sufficient to point application developer teams at which pods need to be reviewed.
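When reviewing many such access-log lines in bulk, the brace-delimited capture fields can be pulled out with a short script. This is a sketch only: it assumes the default HAProxy HTTP log format with header captures (request captures first, then response captures, when both are configured) and the `|` delimiter HAProxy uses between multiple captured headers; the log line is a simplified version of the sample above:

```python
import re

CAPTURE = re.compile(r"\{([^}]*)\}")

def captured_headers(log_line):
    """Extract brace-delimited captured-header fields from an HAProxy
    HTTP access log line; each field may hold several '|'-separated values."""
    return [group.split("|") for group in CAPTURE.findall(log_line)]

line = ('2024-08-16T19:45:10.790117+00:00 worker-1 haproxy[30]: '
        '10.0.0.59:60354 [16/Aug/2024:19:45:10.786] public '
        'be_http:app/pod:app-1:8080 0/0/1/1/3 200 37723 - - --NI '
        '1/1/0/0/0 0/0 {chunked} "POST / HTTP/1.1"')
print(captured_headers(line))  # [['chunked']]
```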
Root Cause
Disclaimer: Links contained herein to external website(s) are provided for convenience only. Red Hat has not reviewed the links and is not responsible for the content or its availability. The inclusion of any link to an external website does not imply endorsement by Red Hat of the website or their entities, products or services. You agree that Red Hat is not responsible or liable for any loss or expenses that may result due to your use of (or reliance on) the external site or content.
When upgrading the OpenShift 4 cluster to version 4.14, the new HAProxy version (bumped from 2.2 to 2.6) does not allow chunked encoding to be applied more than once to a message body, as per RFC 7230, so requests will fail with a 502 Bad Gateway error (see the HAProxy 2.6 documentation). Note also that the chunked encoding must be the LAST transfer coding applied, as per the RFC.
A recipient MUST be able to parse the chunked transfer coding
(Section 4.1) because it plays a crucial role in framing messages
when the payload body size is not known in advance. A sender MUST
NOT apply chunked more than once to a message body (i.e., chunking an
already chunked message is not allowed). If any transfer coding
other than chunked is applied to a request payload body, the sender
MUST apply chunked as the final transfer coding to ensure that the
message is properly framed. If any transfer coding other than
chunked is applied to a response payload body, the sender MUST either
apply chunked as the final transfer coding or terminate the message
by closing the connection.
https://datatracker.ietf.org/doc/html/rfc7230#section-3.4
A message body that uses the chunked transfer coding is incomplete if
the zero-sized chunk that terminates the encoding has not been
received. A message that uses a valid Content-Length is incomplete
if the size of the message body received (in octets) is less than the
value given by Content-Length. A response that has neither chunked
transfer coding nor Content-Length is terminated by closure of the
connection and, thus, is considered complete regardless of the number
of message body octets received, provided that the header section was
received intact.
https://datatracker.ietf.org/doc/html/rfc7230#section-4.1
The chunk-size field is a string of hex digits indicating the size of
the chunk-data in octets. The chunked transfer coding is complete
when a chunk with a chunk-size of zero is received, possibly followed
by a trailer, and finally terminated by an empty line.
A recipient MUST be able to parse and decode the chunked transfer
coding.
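The chunk framing described above can be illustrated with a minimal decoder. This is a sketch for illustration only: it handles hex chunk-size lines and the zero-size terminating chunk, but ignores chunk extensions beyond stripping them and does not process trailers:

```python
def decode_chunked(raw: bytes) -> bytes:
    """Decode a chunked-encoded body: each chunk is a hex size line followed
    by that many octets of data; a zero-size chunk terminates the encoding."""
    body, pos = b"", 0
    while True:
        eol = raw.index(b"\r\n", pos)
        size = int(raw[pos:eol].split(b";")[0], 16)  # strip chunk extensions
        pos = eol + 2
        if size == 0:
            return body          # zero-sized chunk terminates the encoding
        body += raw[pos:pos + size]
        pos += size + 2          # skip chunk data and its trailing CRLF

wire = b"4\r\nWiki\r\n5\r\npedia\r\n0\r\n\r\n"
print(decode_chunked(wire))      # b'Wikipedia'
```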
Example failure response:
$ curl -I httpd-app-route.apps.ocp.redhat.com
HTTP/1.1 502 Bad Gateway
content-length: 107
cache-control: no-cache
content-type: text/html
Diagnostic Steps
- In an application installed in RHOCP 4.14, verify that it is configured with duplicated headers. For example, in an httpd app, go to `/etc/httpd/conf` and check whether `httpd.conf` contains the duplicated headers (the value could be `gzip` or `chunked`):

  ```
  ...
  <Directory />
      Header set Transfer-Encoding chunked
      Header add Transfer-Encoding chunked
  </Directory>
  ...
  ```

- Make a request against the application route and check that it fails with `502 Bad Gateway`:

  ```
  $ curl -v httpd-headers.apps.ocp.redhat.com
  * Rebuilt URL to: httpd-headers.apps.ocp.redhat.com/
  *   Trying 10.8.50.117...
  * TCP_NODELAY set
  * Connected to httpd-headers.apps.ocp.redhat.com (10.8.50.117) port 80 (#0)
  > GET / HTTP/1.1
  > Host: httpd-headers.apps.ocp.redhat.com
  > User-Agent: curl/7.61.1
  > Accept: */*
  >
  < HTTP/1.1 502 Bad Gateway
  < content-length: 107
  < cache-control: no-cache
  < content-type: text/html
  <
  <html><body><h1>502 Bad Gateway</h1>
  The server returned an invalid or incomplete response.
  </body></html>
  * Connection #0 to host httpd-headers.apps.ocp.redhat.com left intact
  ```
- In RHOCP 4.12 and 4.13, the routers use HAProxy version 2.2, which allows duplicated headers, so requests are successful; starting with HAProxy 2.5 this is no longer allowed:

  ```
  $ curl -I httpd-headers.apps.ocp.redhat.com
  HTTP/1.1 200 OK
  date: Fri, 09 Feb 2024 10:03:59 GMT
  server: Apache/2.4.34 (Red Hat) OpenSSL/1.0.2k-fips
  last-modified: Fri, 09 Feb 2024 09:35:47 GMT
  etag: "924b-610efa71a22c0"
  accept-ranges: bytes
  transfer-encoding: gzip
  transfer-encoding: gzip
  content-type: text/html; charset=UTF-8
  set-cookie: d0539e60b69cf5da857de86928b08361=baafe9538e4c01cc8d127a1b28e8188f; path=/; HttpOnly
  cache-control: private
  ```

- Check further that `chunked` is the last-sent `transfer-encoding` value; if it is not sent last, curls may respond with a 502:

  ```
  Transfer-Encoding: gzip, chunked
  ```

- If HAProxy ingress router access logs are enabled, `PH` will be the first two characters of the session state when a `502` is logged:

  ```
  - On the first character, a code reporting the first event which caused the
    session to terminate:

        P : the session was prematurely aborted by the proxy, because of a
            connection limit enforcement, because a DENY filter was matched,
            because of a security check which detected and blocked a dangerous
            error in server response which might have caused information leak
            (e.g. cacheable cookie).

  - On the second character, the TCP or HTTP session state when it was closed:

        H : the proxy was waiting for complete, valid response HEADERS from
            the server (HTTP only).
  ```
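The two-character termination state can be decoded mechanically when scanning logs. The sketch below covers only the codes relevant here (`P`, `H`) plus a few common ones, not HAProxy's full table; consult the HAProxy configuration manual for the complete list:

```python
# Partial decode tables for HAProxy's session-state field (e.g. "PH--").
# Only a subset of the codes documented in the HAProxy manual is listed.
FIRST = {
    "C": "aborted by the client",
    "S": "aborted or refused by the server",
    "P": "aborted by the proxy",
    "I": "internal error in the proxy",
    "-": "normal termination",
}
SECOND = {
    "R": "waiting for a complete request from the client",
    "Q": "waiting in the backend queue",
    "C": "waiting for the connection to the server",
    "H": "waiting for complete response headers from the server",
    "D": "during the data transfer phase",
    "-": "normal termination",
}

def explain_session_state(flags):
    """Explain the first two characters of the session-state flags."""
    cause = FIRST.get(flags[0], "unknown cause")
    phase = SECOND.get(flags[1], "unknown phase")
    return f"{cause}, {phase}"

print(explain_session_state("PH--"))
```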
This solution is part of Red Hat’s fast-track publication program, providing a huge library of solutions that Red Hat engineers have created while supporting our customers. To give you the knowledge you need the instant it becomes available, these articles may be presented in a raw and unedited form.