Troubleshooting OpenShift Container Platform 3.x: DNS

Solution Verified - Updated 17 Jun 2024

Environment

OpenShift Container Platform 3.x

Issue

I think that I am having issues resolving DNS names from within my OpenShift cluster.

Resolution

DNS in OpenShift 3.6+
DNS in OpenShift 3.2 - 3.5
DNS in OpenShift 3.1 and lower
DNS on an OpenShift Master
DNS on an OpenShift Node
DNS troubleshooting from the hosts
DNS troubleshooting from a container/pod

DNS in OpenShift 3.6+

Troubleshooting DNS 3.6+
See the following blog post for detailed information on how DNS works in 3.6+: OCP DNS deep dive
Starting in 3.6, dnsmasq is required and new installs will fail if it is disabled.
- In the Ansible hosts file, openshift_node_env_vars should not contain --disable=dns,proxy
- In the Ansible hosts file, openshift_use_dnsmasq must be set to true (this is the default)
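
The two inventory requirements above can be checked mechanically. The following is a minimal sketch, not part of the installer: `check_inventory_dns` is a hypothetical helper name, and it assumes an INI-style Ansible hosts file in which the settings appear on their own lines.

```shell
# check_inventory_dns FILE
# Hypothetical helper: scans an INI-style Ansible inventory for the two
# settings that break DNS on 3.6+. Prints one line per problem found and
# returns 1 if there is any problem, 0 otherwise.
check_inventory_dns() {
    inventory=$1
    rc=0
    # dnsmasq must stay enabled (true is the default when the variable is absent).
    # ".?" tolerates a single optional quote character before the value.
    if grep -Eiq '^[[:space:]]*openshift_use_dnsmasq[[:space:]]*=[[:space:]]*.?false' "$inventory"; then
        echo "openshift_use_dnsmasq is set to false"
        rc=1
    fi
    # Node options must not disable DNS; commented-out lines are ignored.
    if grep -v '^[[:space:]]*#' "$inventory" | grep -q -- '--disable=dns'; then
        echo "node options contain --disable=dns"
        rc=1
    fi
    return $rc
}
```

For example, `check_inventory_dns /etc/ansible/hosts` prints nothing and returns 0 when neither problem is present; the inventory path is an assumption and depends on where your hosts file lives.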

DNS in OpenShift 3.2 - 3.5

In OpenShift 3.2, [dnsmasq was introduced](https://docs.openshift.com/enterprise/3.2/release_notes/ose_3_2_release_notes.html#ose-32-dns-changes).

By default, new nodes installed with OpenShift Enterprise 3.2 will have Dnsmasq installed and configured as the default nameserver for both the host and pods.
By default, new masters installed with OpenShift Enterprise 3.2 will run SkyDNS on port 8053 rather than 53. Network access controls must allow nodes to connect to masters on port 8053. This is necessary so that Dnsmasq may be configured on all nodes.

DNS in OpenShift 3.1 and lower

With OpenShift 3.1, pods and containers running on a node get SkyDNS's IP added as the first nameserver in their resolv.conf when they start. For example, suppose a node has a resolv.conf that looks like the following.
- resolv.conf from a Node
```
# cat /etc/resolv.conf 
nameserver 8.8.8.8
nameserver  208.67.222.222
search redhat.com
```

The resolv.conf in a pod/container in OpenShift would look like the below.

resolv.conf from a container

# cat /etc/resolv.conf 
nameserver 172.30.0.1
nameserver 8.8.8.8
nameserver 208.67.222.222
search default.svc.cluster.local svc.cluster.local cluster.local redhat.com
options ndots:5
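
Because the resolver tries nameservers in the order listed, the SkyDNS address has to come first for service names to resolve. A minimal sketch for checking that order follows; the function names are hypothetical, and it assumes you have a copy of the pod's resolv.conf in a local file.

```shell
# list_nameservers FILE
# Prints the nameserver addresses from a resolv.conf-style file, in order.
list_nameservers() {
    awk '$1 == "nameserver" { print $2 }' "$1"
}

# first_nameserver_is FILE IP
# Succeeds when the first nameserver entry in FILE equals IP.
first_nameserver_is() {
    [ "$(list_nameservers "$1" | head -n 1)" = "$2" ]
}
```

With the container example above saved as `pod-resolv.conf` (a hypothetical file name), `first_nameserver_is pod-resolv.conf 172.30.0.1` succeeds.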

DNS on an OpenShift Master

SkyDNS runs on the master. By default it listens on port 8053 so as not to interfere with dnsmasq.
# grep dns -A1 /etc/origin/master/master-config.yaml
- If this value is changed, the master services will need to be restarted:
  # systemctl restart atomic-openshift-master*
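
To read the configured listen address without scanning the grep output by eye, something like the following sketch may help. It assumes the usual layout in which `bindAddress` is an indented key under a top-level `dnsConfig:` block; `master_dns_bind` is a hypothetical helper name.

```shell
# master_dns_bind FILE
# Prints the bindAddress value found under the top-level dnsConfig key,
# ignoring bindAddress entries that belong to other blocks.
master_dns_bind() {
    awk '
        /^dnsConfig:/   { in_dns = 1; next }   # entering the dnsConfig block
        /^[^[:space:]]/ { in_dns = 0 }          # any other top-level key ends it
        in_dns && $1 == "bindAddress:" { print $2 }
    ' "$1"
}
```

For example, `bind=$(master_dns_bind /etc/origin/master/master-config.yaml); echo "${bind##*:}"` prints only the port.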

DNS on an OpenShift Node

To confirm that dnsmasq is working and that the node is able to resolve service queries, try the following:
```
# dig @127.0.0.1  kubernetes.default.svc.cluster.local  
```
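
The same check can be wrapped so that it returns a pass or fail status, which is convenient when testing several names or servers. This is a sketch only: `resolves_via` is a hypothetical name, and the two-second timeout and single try are arbitrary choices.

```shell
# resolves_via SERVER NAME
# Succeeds when SERVER returns at least one answer for NAME.
resolves_via() {
    # A non-zero dig exit status (for example, no reply) counts as failure.
    answer=$(dig "@$1" "$2" +short +time=2 +tries=1) || return 1
    # dig writes diagnostics as lines starting with ";", so require at
    # least one non-empty line that is not a diagnostic.
    printf '%s\n' "$answer" | grep -q '^[^;]'
}
```

Running `resolves_via 127.0.0.1 kubernetes.default.svc.cluster.local` and then `resolves_via 127.0.0.1 access.redhat.com` shows whether cluster names, external names, or both fail through the local dnsmasq.
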
NetworkManager is required, and nameservers must be configured through NetworkManager:
# nmcli connection show eth0 | grep IP4.DNS
- You might see an error when referencing eth0. If you do, please collect the same information for whatever connections do exist on that system.
- If no DNS nameservers are shown, add your external DNS nameservers with the following commands, as described in "Using the NetworkManager Command Line Tool":
```
Set the nameservers (this replaces any existing list):
# nmcli con mod eth0 ipv4.dns "8.8.8.8 8.8.4.4"

Append a nameserver:
# nmcli con mod my-con-em1 +ipv4.dns "8.8.8.8"

To apply the changes after modifying a connection with nmcli, activate the connection again:
# nmcli con up con-name
```
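
To pull only the nameserver addresses out of the nmcli output, a small filter along these lines can be used. It is a sketch that assumes the `IP4.DNS[n]:` line format shown by `nmcli connection show`; `dns_from_nmcli` is a hypothetical name.

```shell
# dns_from_nmcli
# Reads `nmcli connection show <name>` output on stdin and prints the
# IPv4 DNS servers applied to the connection, one per line.
# Prints nothing when no IP4.DNS entries are present.
dns_from_nmcli() {
    awk -F': *' '/^IP4\.DNS\[[0-9]+\]:/ { print $2 }'
}
```

For example, `nmcli connection show eth0 | dns_from_nmcli`. If eth0 does not exist, `nmcli -t -f NAME,DEVICE connection show --active` lists the connection names that do.
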
A NetworkManager dispatcher script is added to the node during the OpenShift install. This script configures dnsmasq, setting the node's DNS nameservers as forwarding servers. It also overwrites the node's resolv.conf so that the node uses only itself as a nameserver.
# cat /etc/resolv.conf
# cat /etc/dnsmasq.d/origin-*
- If the node does not have the NetworkManager dispatcher script installed, please see the solution "OpenShift 3 manually adding NetworkManager DNSMASQ dispatcher script".
  # ls /etc/NetworkManager/dispatcher.d/99-origin-dns.sh
In node-config.yaml, OpenShift passes the value of dnsIP to the pod/container's resolv.conf. This value should be equal to the node's IP address when using dnsmasq.

# ip route get 8.8.8.8 | awk 'NR==1 {print $NF}'
# grep dnsIP -A1 /etc/origin/node/node-config.yaml
- If dnsmasq is not being used, either set dnsIP to the kubernetes service IP, which is usually 172.30.0.1, or remove it entirely, as it defaults to this value.
In some versions, the following variables must also be set in node-config.yaml:
dnsBindAddress: 127.0.0.1:53
dnsRecursiveResolvConf: /etc/origin/node/resolv.conf
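
The comparison between dnsIP and the node's IP address can be sketched as follows. Some iproute2 versions print extra fields after the source address (for example `uid 0`), in which case the last field is not the IP, so this sketch reads the value that follows the `src` keyword instead. Both function names are hypothetical.

```shell
# route_src_ip
# Reads `ip route get <addr>` output on stdin and prints the address
# that follows the "src" keyword on the first line.
route_src_ip() {
    awk 'NR == 1 { for (i = 1; i < NF; i++) if ($i == "src") { print $(i + 1); exit } }'
}

# node_dns_ip FILE
# Prints the first dnsIP value from a node-config.yaml-style file.
node_dns_ip() {
    awk '$1 == "dnsIP:" { print $2; exit }' "$1"
}
```

When dnsmasq is in use, `ip route get 8.8.8.8 | route_src_ip` and `node_dns_ip /etc/origin/node/node-config.yaml` would be expected to print the same address.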

Troubleshooting

If opening a support ticket, please provide or be prepared to provide the following:

Log in as system:admin on the master host:
# oc login -u system:admin --config=/etc/origin/master/admin.kubeconfig
Collect the following from the master:

# oadm diagnostics 

# netstat -tupln | grep -e 53 -e 8053
# cat /etc/resolv.conf
# oc describe endpoints kubernetes -n default
# grep dns -A1 /etc/origin/master/master-config.yaml 

# for ip in $(oc get endpoints kubernetes -n default \
-o 'jsonpath={.subsets[*].addresses[*].ip}'); \
do dig -p $(oc get endpoints kubernetes -n default \
-o 'jsonpath={.subsets[*].ports[?(@.name=="dns")].port}') \
@$ip kubernetes.default.svc.cluster.local +short ; done

Please also collect the debug script output, as described in our documentation.

Connect remotely to an OpenShift node and collect the following:

# systemctl status dnsmasq
# netstat -tupla | grep 53
# nmcli connection show eth0 | grep IP4.DNS
# cat /etc/resolv.conf
# cat /etc/dnsmasq.d/origin-*
# ls -la /etc/NetworkManager/dispatcher.d/99-origin-dns.sh
# grep dnsIP -A1 /etc/origin/node/node-config.yaml
# ip route get 8.8.8.8 | awk 'NR==1 {print $NF}'

You might see an error when referencing eth0. If you do, please collect the same information for whatever connections do exist on that system.
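
To gather the outputs above into a single file for a support ticket, a small wrapper such as the following can help. It is a sketch, not an official collection tool: `collect` is a hypothetical helper, and the output file name in the usage comments is an assumption.

```shell
# collect CMD [ARGS...]
# Prints a header naming the command, then its stdout and stderr, then
# its exit status. A failing command does not stop the collection.
collect() {
    printf '===== %s =====\n' "$*"
    if "$@" 2>&1; then rc=0; else rc=$?; fi
    printf '(exit status: %s)\n\n' "$rc"
}

# Example usage on a node (pipelines need to be wrapped in sh -c):
#   {
#       collect systemctl status dnsmasq
#       collect sh -c 'netstat -tupla | grep 53'
#       collect cat /etc/resolv.conf
#   } > node-dns-report.txt
```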

DNS troubleshooting from a container/pod

# docker ps
# cid=<docker-container-id>
# sudo nsenter -n -i -p -t  $(sudo docker inspect --format "{{ .State.Pid }}" "$cid")  <<NSEOF
dig kubernetes.default.svc.cluster.local +short
dig access.redhat.com  +short
cat /etc/resolv.conf
NSEOF

SBR

Shift

Product(s)

Red Hat OpenShift Container Platform

Components

kubernetes

Category

Troubleshoot

This solution is part of Red Hat’s fast-track publication program, providing a huge library of solutions that Red Hat engineers have created while supporting our customers. To give you the knowledge you need the instant it becomes available, these articles may be presented in a raw and unedited form.