Troubleshooting applications (images) on OpenShift 3.x

Updated 6 Jan 2019

Troubleshooting applications or images on OpenShift can be broken into three areas: build errors, deployment errors, and application errors. All three areas rely on a common set of tools to investigate what is happening.

In all the examples in this article, we use the OpenShift CLI for easy data collection. However, the same information can be seen in the web console under the appropriate sections and headers.
- Note: Many of the commands shown here are explained in more detail in the troubleshooting and debugging CLI operations section of our docs.

The general data-collection commands for troubleshooting any application-related issue are the following:

# oc logs <pod>
# oc get all,events -o wide -n <project>
# oc status -n <project>
# oc get all -o json

Note: You might also want to review Knowledge Base Solution 3132761 to see the key project-level information that is often needed when debugging an application (commands shown above).
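When this data has to be handed to someone else, it can help to capture it into files in one pass. The following is a minimal sketch, not an official tool: `collect_project_data` and the output file names are hypothetical, and it assumes you are already logged in with `oc` and can read the project.

```shell
# Sketch: save the basic troubleshooting data for one project into a directory.
# collect_project_data and the file names below are hypothetical.
collect_project_data() {
    local project
    local outdir
    local pod
    project="$1"
    outdir="${2:-./oc-data-$project}"
    mkdir -p "$outdir" || return 1

    # Overview of the project's objects and its recent events.
    oc get all,events -o wide -n "$project" > "$outdir/overview.txt" 2>&1
    # Summary of the project as reported by the cluster.
    oc status -n "$project" > "$outdir/status.txt" 2>&1
    # Full object definitions for offline review.
    oc get all -o json -n "$project" > "$outdir/all.json" 2>&1

    # One log file per pod. Pods with several containers need -c <container>.
    for pod in $(oc get pods -o name -n "$project"); do
        oc logs "$pod" -n "$project" > "$outdir/${pod##*/}.log" 2>&1
    done
}
```

For example, `collect_project_data <project> /tmp/<project>-data` writes the files into the given directory, which can then be archived and attached to a support case.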

In the three troubleshooting areas below, we cover when it is appropriate to use each command, and what it helps you collect and see about your application or its state in the cluster.

Troubleshooting Areas

Build Errors

Every software project, in some form or another, starts with a build. The same is true with OpenShift or containers: your software must be built into an image before it can be deployed or run. Because this section focuses on investigating OpenShift builds, it's good to start by understanding OpenShift builds and how they work.

With most build issues, the error or cause is recorded in the build log (at the standard or debug log level), so collecting the build logs is the first step in diagnosing any build-related issue.

# oc logs bc/<build_config>

Please note that in some cases you may need to alter the build log level to get more/enough information out of the build logs for root cause analysis (RCA).
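As a rough sketch of how the log level can be raised, the helpers below wrap `oc start-build --build-loglevel` for a single run and the `BUILD_LOGLEVEL` environment variable on the build configuration for all later runs. The function names are hypothetical, and the default of 5 assumes you want the most verbose of the 0 to 5 build log levels.

```shell
# Sketch: rerun a build with more verbose logging.
# rebuild_verbose and persist_build_loglevel are hypothetical helper names.

# Start one new build at the given log level and stream its log.
rebuild_verbose() {
    local bc
    local project
    local level
    bc="$1"
    project="$2"
    level="${3:-5}"
    oc start-build "$bc" --build-loglevel="$level" --follow -n "$project"
}

# Set the log level on the build configuration so later builds keep it.
persist_build_loglevel() {
    local bc
    local project
    local level
    bc="$1"
    project="$2"
    level="${3:-5}"
    oc set env "bc/$bc" BUILD_LOGLEVEL="$level" -n "$project"
}
```

Calling `rebuild_verbose <build_config> <project>` is usually enough for a one-off investigation. Remember to lower the level again if you persist it, because verbose build logs grow quickly.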

Deployment Errors

Deployment failures are often caused by cluster or infrastructure issues (scheduling problems, problems with the container engine, liveness/readiness probe failures, etc.). This means that as an "end user" of a cluster, you are often told only what the cluster wants you to know about the failure. You may therefore need to take the information from system events and pass it to a "cluster administrator", who can review it or pull more information out of the cluster (either with debugging commands or by reviewing cluster logs).

To get the basic status and errors that the cluster reports to an end user, use the following commands to collect event and status information on deployments.

# oc status -n <project>
# oc get events -o wide -n <project>
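On a busy project the event list can be long, so it can help to narrow it down before passing it to a cluster administrator. The sketch below is one possible approach, with a hypothetical helper name: it keeps only Warning events, sorted by last occurrence, and lists pods whose STATUS column is neither Running nor Completed. It assumes the default table output of `oc get`, where STATUS is the third column of the pod listing.

```shell
# Sketch: show only the events and pods that suggest a deployment problem.
# show_deploy_problems is a hypothetical helper name.
show_deploy_problems() {
    local project
    project="$1"

    echo "== Warning events =="
    # Sort by last occurrence, then keep only lines of type Warning.
    oc get events -n "$project" --sort-by=.lastTimestamp | grep -w 'Warning'

    echo "== Pods not Running or Completed =="
    # Keep the header line plus any pod in another state.
    oc get pods -o wide -n "$project" |
        awk 'NR == 1 || ($3 != "Running" && $3 != "Completed")'
}
```

Any pod listed in the second part is a good candidate for `oc describe pod <pod> -n <project>`, which shows the events recorded for that pod.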

Application Errors

Once your software project is built and deployed, it begins running on the cluster, and as with all software, there may be bugs or unintended behaviors. Being able to see how OpenShift starts your application, monitors it, and records information about it while it's running is critical to identifying issues with your application.

In most cases, seeing how OpenShift defines the components of your application, and which events and statuses are associated with it, allows you to determine whether the issues you are having lie with OpenShift or are isolated to your application's own code.

# oc get all,events -o wide -n <project>
# oc status -n <project>
# oc get all -o json

If you can isolate the problem to your application, reviewing its logs is a common first step in debugging any issue you might be seeing.

# oc logs pod/<pod>

Depending on what is happening with your application, you might need more advanced log collection techniques, such as reviewing the logs from old pods or following the logs of the current pod.
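The common variations can be sketched with standard `oc logs` options. The wrapper name `pod_logs` and its mode keywords are hypothetical, and the one-hour window is only an example. Note that `--previous` returns the log of the previous instance of a container in a pod that still exists (for example after a restart). It cannot recover logs from pods that have already been deleted.

```shell
# Sketch: common ways to read pod logs.
# pod_logs and its mode names are hypothetical.
pod_logs() {
    local mode
    local pod
    local project
    mode="$1"
    pod="$2"
    project="$3"
    case "$mode" in
        # Log of the previous container instance, useful after a crash or restart.
        previous) oc logs --previous "pod/$pod" -n "$project" ;;
        # Stream new log lines as the application writes them.
        follow)   oc logs -f "pod/$pod" -n "$project" ;;
        # Only the last hour, with a timestamp on each line.
        recent)   oc logs --since=1h --timestamps "pod/$pod" -n "$project" ;;
        *)
            echo "usage: pod_logs previous|follow|recent <pod> <project>" >&2
            return 2
            ;;
    esac
}
```

If the pod runs more than one container, add `-c <container>` to the `oc logs` call to select the one you need.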

Specific OpenShift Image Troubleshooting Examples