Failed to access Quay

Solution Verified

Environment

  • Red Hat Quay
    • 3.x

Issue

  • Pulling images from or pushing images to the Quay registry fails.
  • The Quay web UI cannot be accessed.

Resolution

  • Increase the maximum number of database connections by raising the max_connections parameter in /var/lib/pgsql/data/postgresql.conf. The database must be restarted for the change to take effect. Consult your database team before making any changes.
max_connections = 100           # (change requires restart)
  • The recommended number of database connections is at least 1000 for a development cluster and at least 2000 for a production cluster. Increase it further if the load on Quay requires it.

  • As an immediate workaround, force a restart of the Quay container. A restart helps because Quay creates a new service key on every start. These keys have a lifespan of 2 hours and are rotated regularly.
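
The check against the recommended minimums above can be sketched in Python. This is only an illustration: the function names and the RECOMMENDED_MIN table are hypothetical, the thresholds are the ones quoted in this article, and the fallback of 100 assumes that PostgreSQL's built-in default applies when the parameter is commented out.

```python
import re

# Minimums quoted in the Resolution section of this article.
RECOMMENDED_MIN = {"development": 1000, "production": 2000}

# Assumption: PostgreSQL's built-in default when max_connections is not set.
POSTGRES_DEFAULT_MAX_CONNECTIONS = 100


def read_max_connections(conf_text):
    """Return the active max_connections value from postgresql.conf text, or None."""
    value = None
    for line in conf_text.splitlines():
        # Drop any trailing comment; a fully commented-out line becomes empty.
        setting = line.split("#", 1)[0].strip()
        match = re.match(r"^max_connections\s*=?\s*'?(\d+)'?$", setting)
        if match:
            # Keep the last occurrence in the file.
            value = int(match.group(1))
    return value


def below_recommendation(conf_text, cluster_type):
    """True if the configured limit is under this article's recommended minimum."""
    current = read_max_connections(conf_text)
    if current is None:
        current = POSTGRES_DEFAULT_MAX_CONNECTIONS
    return current < RECOMMENDED_MIN[cluster_type]


if __name__ == "__main__":
    sample = "max_connections = 100           # (change requires restart)\n"
    print(read_max_connections(sample), below_recommendation(sample, "production"))
```

To use it against a real system, pass in the contents of /var/lib/pgsql/data/postgresql.conf. The sketch only reads the file; any change to the value should still be agreed with the database team.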

Root Cause

  • When database connections are exhausted, renewal of the service key is disrupted. Quay uses this key for internal communication and for signing requests made to the Docker v2 API.
  • When this happens, the registry falls back to the Docker v1 API, which has long been deprecated and should not be used. As a result, requests to the Quay web UI return HTTP 500 errors.

Diagnostic Steps

  • Check whether the Quay debug logs show the following service key renewal errors:
gunicorn-registry stdout | 2022-11-08 06:23:38,284 [579] [ERROR] [auth.registry_jwt_auth] Invalid bearer token: Unknown service key
gunicorn-registry stdout | 2022-11-08 06:23:38,286 [579] [INFO] [gunicorn.access] 103.241.xx.xx - - [08/Nov/2022:06:23:38 +0000] "GET /v2/pidadmin/fluent-bit/manifests/2.1 HTTP/1.1" 401 32 "-" "cri-o/1.19.3-13.rhaos4.6.git99373fe.el8 go/go1.15.14 os/linux arch/amd64"
gunicorn-registry stdout | 2022-11-08 06:23:45,934 [578] [INFO] [gunicorn.access] 103.241.xx.xx - - [08/Nov/2022:06:23:45 +0000] "GET /v2/ HTTP/1.1" 401 4 "-" "cri-o/1.19.3-13.rhaos4.6.git99373fe.el8 go/go1.15.14 os/linux arch/amd64"
gunicorn-registry stdout | 2022-11-08 06:23:46,243 [579] [INFO] [gunicorn.access] 103.241.xx.xx - quay_root [08/Nov/2022:06:23:46 +0000] "GET /v2/auth?account=quay_root&scope=repository%3Apidadmin%2Ffluent-bit%3Apull&service=quay-prod HTTP/1.1" 200 1047 "-" "cri-o/1.19.3-13.rhaos4.6.git99373fe.el8 go/go1.15.14 os/linux arch/amd64"
gunicorn-registry stdout | 2022-11-08 06:23:46,268 [579] [ERROR] [util.security.registry_jwt] Could not find requested service key 12c9bf89c272b019d3849ef6621dc4a3c934d4719a563aa65efac262fe0a612f with encoded JWT: eyJhbGciOiJSUzI1NiIsInR5cCI6IkpXVCIsImtpZCI6IjEyYzliZjg5YzI3MmIwMTlkMzg0OWVmNjYyMWRjNGEzYzkzNGQ0NzE5YTU2M2FhNjVlZmFjMjYyZmUwYTYxMmm......
gunicorn-registry stdout | 2022-11-08 06:23:46,512 [579] [ERROR] [auth.registry_jwt_auth] Invalid bearer token: Unknown service key
gunicorn-registry stdout | 2022-11-08 06:23:46,513 [579] [ERROR] [util.http] Error 401: Unknown service key; Arguments: {'url': u'https://quay-prod/v2/library/alpine/manifests/latest', 'manifest_ref': u'latest', 'message': 'Unknown service key', 'repository': u'library/alpine', 'status_code': 401}
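  • The errors above are usually caused by database connections being exhausted. In the example below, the database rejects new connections (the pymysql frames show that this traceback comes from a MySQL-backed deployment):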
OperationalError: (1040, u'Too many connections')
buildlogsarchiver stdout | 2022-11-08 04:33:28,976 [170] [ERROR] [workers.worker] Operation raised exception
Traceback (most recent call last):
  File "workers/worker.py", line 81, in _operation_func
    return operation_func()
  File "/quay-registry/workers/buildlogsarchiver/buildlogsarchiver.py", line 28, in _archive_redis_buildlogs
    to_archive = model.get_archivable_build()
  File "workers/buildlogsarchiver/models_pre_oci.py", line 7, in get_archivable_build
.
.
    cursor = self.cursor()
  File "/opt/rh/python27/root/usr/lib/python2.7/site-packages/peewee.py", line 2987, in cursor
    self.connect()
  File "/opt/rh/python27/root/usr/lib/python2.7/site-packages/peewee.py", line 2947, in connect
    self._initialize_connection(self._state.conn)
  File "/opt/rh/python27/root/usr/lib/python2.7/site-packages/peewee.py", line 2783, in __exit__
    reraise(new_type, new_type(*exc_args), traceback)
  File "/opt/rh/python27/root/usr/lib/python2.7/site-packages/peewee.py", line 2944, in connect
    self._state.set_connection(self._connect())
  File "/opt/rh/python27/root/usr/lib/python2.7/site-packages/peewee.py", line 3828, in _connect
    conn = mysql.connect(db=self.database, **self.connect_params)
  File "/opt/rh/python27/root/usr/lib/python2.7/site-packages/pymysql/__init__.py", line 94, in Connect
    return Connection(*args, **kwargs)
  File "/opt/rh/python27/root/usr/lib/python2.7/site-packages/pymysql/connections.py", line 325, in __init__
    self.connect()
  File "/opt/rh/python27/root/usr/lib/python2.7/site-packages/pymysql/connections.py", line 599, in connect
    self._request_authentication()
  File "/opt/rh/python27/root/usr/lib/python2.7/site-packages/pymysql/connections.py", line 861, in _request_authentication
    auth_packet = self._read_packet()
  File "/opt/rh/python27/root/usr/lib/python2.7/site-packages/pymysql/connections.py", line 684, in _read_packet
    packet.check_error()
  File "/opt/rh/python27/root/usr/lib/python2.7/site-packages/pymysql/protocol.py", line 220, in check_error
    err.raise_mysql_exception(self._data)
  File "/opt/rh/python27/root/usr/lib/python2.7/site-packages/pymysql/err.py", line 109, in raise_mysql_exception
    raise errorclass(errno, errval)
OperationalError: (1040, u'Too many connections')
  • Check the Quay debug logs for requests to the v1 API and for 500 errors from the web UI worker:
gunicorn-registry stdout | 2022-11-08 06:53:01,081 [558] [INFO] [gunicorn.access] 172.16.100.129 - - [08/Nov/2022:06:53:01 +0000] "GET /v1/images/8eaef42f4d5685b892a33790487e1fef42d159a344a05bd7fdca493972eab7a1/json HTTP/1.1" 200 299 "-" "docker/1.13.1 go/go1.10.3 kernel/3.10.0-862.32.1.el7.x86_64 os/linux arch/amd64 UpstreamClient(Go-http-client/1.1)"
gunicorn-registry stdout | 2022-11-08 06:53:01,104 [575] [INFO] [gunicorn.access] 172.16.100.129 - - [08/Nov/2022:06:53:01 +0000] "GET /v1/images/c511a91a7a97c0f28c751cfbd007f7c1d79aaa7f2f1ae070f590987bb7218bda/json HTTP/1.1" 200 1297 "-" "docker/1.13.1 go/go1.10.3 kernel/3.10.0-862.32.1.el7.x86_64 os/linux arch/amd64 UpstreamClient(Go-http-client/1.1)"
gunicorn-registry stdout | 2022-11-08 06:53:01,129 [578] [INFO] [gunicorn.access] 172.16.100.129 - - [08/Nov/2022:06:53:01 +0000] "GET /v1/images/cd62eed07d6eb7c5066d2e5970766455a2bc463b2391a11fa671e0626dcb8f76/json HTTP/1.1" 200 523 "-" "docker/1.13.1 go/go1.10.3 kernel/3.10.0-862.32.1.el7.x86_64 os/linux arch/amd64 UpstreamClient(Go-http-client/1.1)"

----------------------------

gunicorn-web stdout | 2022-11-08 07:20:22,424 [546] [ERROR] [gunicorn.error] Error handling request /api/v1/user/
gunicorn-web stdout | 2022-11-08 07:20:22,424 [546] [INFO] [gunicorn.access]  - - [08/Nov/2022:07:20:22 +0000] "GET /api/v1/user/ HTTP/1.0" 500 0 "-" "-"
gunicorn-web stdout | 2022-11-08 07:20:22,474 [546] [ERROR] [gunicorn.error] Error handling request /api/v1/messages
gunicorn-web stdout | 2022-11-08 07:20:22,474 [546] [INFO] [gunicorn.access]  - - [08/Nov/2022:07:20:22 +0000] "GET /api/v1/messages HTTP/1.0" 500 0 "-" "-"
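
The diagnostic steps above can be partly automated. The following Python sketch counts how often each symptom appears in a saved copy of the Quay debug logs. The SIGNATURES names and both function names are hypothetical, and the patterns are derived only from the sample log lines in this article, with one exception: the "too many clients already" alternative is an assumption added to cover the message PostgreSQL reports when connections are exhausted, and it does not appear in the logs above.

```python
import re
from collections import Counter

# Patterns derived from the sample log lines in this article.
SIGNATURES = {
    "unknown_service_key": re.compile(
        r"Unknown service key|Could not find requested service key"
    ),
    # "Too many connections" is the MySQL message seen in the traceback above;
    # "too many clients already" is an assumed PostgreSQL equivalent.
    "db_connections_exhausted": re.compile(
        r"Too many connections|too many clients already"
    ),
    # Access-log requests whose path starts with /v1/ (not /api/v1/).
    "v1_api_request": re.compile(
        r'\[gunicorn\.access\].*"(?:GET|HEAD|PUT|POST|PATCH|DELETE) /v1/'
    ),
    # Access-log entries from the web UI worker that returned status 500.
    "web_ui_500": re.compile(r'^gunicorn-web .*\[gunicorn\.access\].*" 500 '),
}


def classify(line):
    """Return the names of all signatures that match one log line."""
    return [name for name, pattern in SIGNATURES.items() if pattern.search(line)]


def summarize(lines):
    """Count signature matches across an iterable of log lines."""
    counts = Counter()
    for line in lines:
        counts.update(classify(line))
    return counts


if __name__ == "__main__":
    import sys

    # Usage: python scan_quay_logs.py < quay-debug.log
    for name, count in sorted(summarize(sys.stdin).items()):
        print(f"{name}: {count}")
```

If unknown_service_key and db_connections_exhausted both appear in the same time window, that matches the root cause described in this article. A count for v1_api_request or web_ui_500 alone is weaker evidence and is worth checking against the database connection usage.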
