Running a tcpdump container as a sidecar on ROSA/OSD/ARO clusters

Solution Verified - Updated

Environment

  • Red Hat OpenShift Service on AWS (ROSA)
    • 4
  • Red Hat OpenShift Dedicated (OSD)
    • 4
  • Azure Red Hat OpenShift (ARO)
    • 4

Issue

  • It is sometimes useful to gather packet captures (pcaps) from the pods of running applications or APIs, but the application images generally do not include networking tools such as tcpdump. This procedure captures network traffic for a running application pod without rebuilding its image.

Resolution

Note: Use this procedure if the approach described in "Running tcpdump inside an OpenShift pod to capture network traffic" is not possible due to user permissions.

This procedure is useful for clusters where the cluster owner is not authorized to run tcpdump at the node level.

Prepare an example application

This section is not required when applying the procedure to real applications in a cluster. These steps are only needed to test and validate the process in your cluster environment.

These steps prepare a simple application that the next section uses to implement a sidecar container.

  1. Prepare the environment:

    $ mkdir ex1
    $ cd ex1
    
  2. Create the example application. This example just logs a timestamp to standard output:

    $ oc new-project example-app && \
    cat <<'EOF' | oc apply -f -
    kind: "DeploymentConfig"
    apiVersion: apps.openshift.io/v1
    metadata:
      labels:
        app: basic-app
      name: basic-app
    spec:
      replicas: 1
      selector:
        deploymentconfig: basic-app
      strategy:
        activeDeadlineSeconds: 21600
        resources: {}
        rollingParams:
          intervalSeconds: 1
          maxSurge: 25%
          maxUnavailable: 25%
          timeoutSeconds: 600
          updatePeriodSeconds: 1
        type: Rolling
      template:
        metadata:
          labels:
            name: basic-app
            deploymentconfig: basic-app
        spec:
          containers:
            - name: mainappcontainer
              image: "python:3.6"
              args:
                - /bin/sh
                - -c
                - >
                  i=0;
                  while true;
                  do
                    echo "$i: $(date)"
                    i=$((i+1));
                    sleep 1;
                  done
      triggers:
        - type: "ConfigChange"
        - type: "ImageChange"
          imageChangeParams:
            automatic: true
            containerNames:
            - "mainappcontainer"
            from:
              kind: "ImageStreamTag"
              name: "python:3.6"
              namespace: openshift
      revisionHistoryLimit: 10
    EOF
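
Before adding the sidecar, it can help to confirm that the example application rolled out and is logging. The following is a minimal sketch, not part of the original procedure: the function name verify_example_app and the OC variable (set OC=echo to print the commands instead of running them) are illustrative assumptions, while the project, deploymentconfig, and container names come from the example above.

```shell
#!/bin/sh
# Sketch: confirm the example application rolled out and is logging.
# OC is a hypothetical override used only by this sketch; it defaults to
# the real "oc" client. Set OC=echo for a dry run that prints the commands.

verify_example_app() {
    ns="${1:-example-app}"   # project created in the step above

    # Wait until the latest deployment of the deploymentconfig completes.
    ${OC:-oc} rollout status dc/basic-app -n "$ns" || return 1

    # Show the last few timestamp lines written by the main container.
    ${OC:-oc} logs dc/basic-app -c mainappcontainer --tail=5 -n "$ns"
}
```

After sourcing the snippet, run verify_example_app example-app. The output should end with a few numbered timestamp lines if the application is healthy.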
    

Load the sidecar container running tcpdump

This section adds the sidecar with support tools (including tcpdump) to an application that already exists in the cluster.

  1. Grant anyuid to the default service account of the project:

    Note: This is a security risk because it increases the attack surface of the application, though only minimally, as it allows the application root access inside the container. This access is disabled by default and should be disabled again once it is no longer required (least privilege model). tcpdump requires this access to gain privileged access to the container's devices.

    $ oc adm policy add-scc-to-user anyuid -z default -n `oc project -q`
    
  2. Create the script to run on startup:

    Note: This is a basic tcpdump script that should be modified to suit the packet captures required for analysis. For example, add filters to reduce the noise in the pcaps.

    • Keeping -w /tcpdump/pcaps/trace -W 6 -G 1800 -C 100 in the dump script is a good idea. It ensures the pod filesystem does not fill up with pcaps:
      • -G 1800 -C 100: Rotate the files every 1800 seconds (30 minutes) or when the file size exceeds 100 MB (-C counts in units of 1,000,000 bytes), whichever triggers first.
      • -W 6: Rotate the pcap files over only 6 files. The tcpdump man page describes different -W behavior when -C and -G are combined, so confirm the rotation with the tcpdump version in the image.
      • -w /tcpdump/pcaps/trace: Name these files trace and store them in the /tcpdump/pcaps directory.
    • Keeping -Z root keeps tcpdump running as root instead of dropping privileges, so the command retains the additional privileges it requires.

    $ cat <<EOF > onstartup.sh 
    #!/bin/sh
    mkdir -p /tcpdump/pcaps
    tcpdump -i eth0 -w /tcpdump/pcaps/trace -W 6 -G 1800 -C 100 -K -n -Z root
    EOF
    
  3. Turn the script into a configmap. Once it is a configmap it can later be mounted in the sidecar container:

    $ oc create configmap cm-onstartup --from-file=onstartup.sh
    
  4. Edit the deploymentconfig of the example application to include the sidecar container. Instead of a manual edit, this could also be done as a patch, if preferred:

        $ oc edit dc basic-app
    
        ## append section B after section A
        ----- A
             spec:
               containers:
        ----- B
               - name: rhtools
                 image: registry.redhat.io/rhel7/support-tools
                 args:
                 - /tcpdump/bin/onstartup.sh
                 volumeMounts:
                 - mountPath: /tcpdump/bin
                   name: onstartup-volume
                 securityContext:
                   capabilities:
                     add:
                     - NET_ADMIN
                     - NET_RAW
        ----- end
    
        ## And append section B after section A
        ----- A
               securityContext: {}
               terminationGracePeriodSeconds: 30
        ----- B
               volumes:
               - configMap:
                   defaultMode: 484
                   name: cm-onstartup
                 name: onstartup-volume
        ----- end
    
  5. Open a remote shell in the new tcpdump container:

    $ oc rsh -c rhtools <podName>
    
  6. Run the following from inside the container:

        ## Check that tcpdump is running
        # ps -ef
    
        ## This is the script that is running
        # ls  -la /tcpdump/bin/
    
        ## This is where the pcaps are stored
        # ls -la /tcpdump/pcaps
    
  7. The pcaps can be extracted like this:

    $ oc rsync -c rhtools <podName>:/tcpdump/pcaps/ .
    $ ls -la
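
Step 4 mentions that the sidecar could be added with a patch instead of a manual edit. The following is a minimal sketch of that alternative using a JSON patch. The file name sidecar-patch.json, the function name apply_sidecar_patch, and the OC dry-run variable are illustrative assumptions; the container, volume, and configmap values mirror the manual edit. It also assumes the pod template has no existing volumes, because the second operation sets the whole volumes list.

```shell
#!/bin/sh
# Sketch: add the rhtools sidecar and its configmap volume with a JSON patch
# instead of "oc edit". Values mirror the manual edit in step 4.
# ASSUMPTION: the pod template has no existing volumes; the second operation
# sets the whole volumes list and would replace any that already exist.

# Write the patch to a file (hypothetical name) so it can be reviewed first.
cat > sidecar-patch.json <<'EOF'
[
  {
    "op": "add",
    "path": "/spec/template/spec/containers/-",
    "value": {
      "name": "rhtools",
      "image": "registry.redhat.io/rhel7/support-tools",
      "args": ["/tcpdump/bin/onstartup.sh"],
      "volumeMounts": [
        {"mountPath": "/tcpdump/bin", "name": "onstartup-volume"}
      ],
      "securityContext": {
        "capabilities": {"add": ["NET_ADMIN", "NET_RAW"]}
      }
    }
  },
  {
    "op": "add",
    "path": "/spec/template/spec/volumes",
    "value": [
      {
        "name": "onstartup-volume",
        "configMap": {"name": "cm-onstartup", "defaultMode": 484}
      }
    ]
  }
]
EOF

# OC is a hypothetical override for this sketch; set OC=echo for a dry run.
apply_sidecar_patch() {
    ns="${1:-example-app}"
    ${OC:-oc} patch dc/basic-app -n "$ns" --type=json -p "$(cat sidecar-patch.json)"
}
```

Run apply_sidecar_patch example-app after creating the configmap in step 3. The ConfigChange trigger on the example deploymentconfig should then roll out a new pod that includes the rhtools container.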
    
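
The note in step 1 recommends removing the anyuid access once it is no longer required. The following is a minimal sketch of that cleanup, assuming the grant was made to the default service account as shown in step 1. The function name remove_capture_access and the OC dry-run variable are illustrative assumptions.

```shell
#!/bin/sh
# Sketch: revoke the anyuid SCC granted in step 1 once captures are complete.
# ASSUMPTION: the rhtools sidecar has already been removed from the
# deploymentconfig, so no container still depends on this access.
# OC is a hypothetical override for this sketch; set OC=echo for a dry run.

remove_capture_access() {
    ns="${1:-example-app}"
    # Remove the anyuid SCC from the project's default service account.
    ${OC:-oc} adm policy remove-scc-from-user anyuid -z default -n "$ns"
}
```

Run remove_capture_access example-app, substituting the project that holds the application under test.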

Root Cause

This procedure is useful for clusters where the cluster owner is not authorized to run tcpdump at the node level.


This solution is part of Red Hat’s fast-track publication program, providing a huge library of solutions that Red Hat engineers have created while supporting our customers. To give you the knowledge you need the instant it becomes available, these articles may be presented in a raw and unedited form.