MetalLB Operator version 4.18 is incorrectly installed on OpenShift 4.17 or older clusters

Solution Verified - Updated

Environment

  • Red Hat OpenShift Container Platform (RHOCP)
    • < 4.18
  • MetalLB Operator 4.18

Issue

After a catalog update, the MetalLB Operator may have been upgraded to version 4.18, which is not supported on OpenShift Container Platform (OCP) versions 4.17 or older.

This article covers the specific remediation for the MetalLB Operator. For the general issue affecting multiple operators, refer to the article "Red Hat Operator has version higher than the cluster version".

Symptoms

Red Hat OpenShift Container Platform 4.17

Administrators can see that the 4.18 MetalLB operator is installed:

$ oc get csv -n metallb-system
NAME                                    DISPLAY            VERSION               REPLACES                                PHASE
metallb-operator.v4.18.0-202603040208   MetalLB Operator   4.18.0-202603040208   metallb-operator.v4.17.0-202602251541   Succeeded
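
The mismatch shown above can also be detected without reading the table by eye. The following is a minimal sketch, not part of the official procedure: the function names minor_of and check_metallb_version are hypothetical, and the sketch assumes that the CSV .spec.version and the ClusterVersion .status.desired.version fields carry the versions to compare.

```shell
# Print the "major.minor" prefix of a version string,
# for example 4.18.0-202603040208 -> 4.18 (a leading "v" is ignored).
minor_of() {
  printf '%s\n' "$1" | sed -E 's/^v?([0-9]+\.[0-9]+).*/\1/'
}

# Compare the cluster minor version with every MetalLB Operator CSV version.
# Prints one MISMATCH line per offending CSV and returns 1 if any was found.
check_metallb_version() {
  cluster_version=$(oc get clusterversion version \
    -o jsonpath='{.status.desired.version}')
  csv_versions=$(oc get csv -n metallb-system \
    -o jsonpath='{range .items[*]}{.metadata.name}{" "}{.spec.version}{"\n"}{end}' |
    awk '$1 ~ /^metallb-operator\./ {print $2}')
  rc=0
  for v in $csv_versions; do
    if [ "$(minor_of "$v")" != "$(minor_of "$cluster_version")" ]; then
      echo "MISMATCH: MetalLB Operator $v on cluster $cluster_version"
      rc=1
    fi
  done
  return $rc
}
```

Calling check_metallb_version on an affected cluster prints one line per mismatching CSV; on a healthy cluster it prints nothing and returns 0.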

Red Hat OpenShift Container Platform 4.16 and 4.14

Administrators can see that the 4.18 MetalLB operator is installed:

$ oc get csv -n metallb-system
NAME                                    DISPLAY            VERSION               REPLACES                                PHASE
metallb-operator.v4.18.0-202601302238   MetalLB Operator   4.18.0-202601302238   metallb-operator.v4.16.0-202601200116   Succeeded

Administrators will also see that frr-k8s-... pods are running:

$ oc get pods -n metallb-system 
NAME                                                              READY   STATUS      RESTARTS   AGE
controller-69d58c4b4c-tscrp                                       2/2     Running     0          3m55s
frr-k8s-64wkj                                                     6/6     Running     0          3m55s
frr-k8s-dx5w5                                                     6/6     Running     0          3m55s
frr-k8s-fpm9h                                                     6/6     Running     0          3m55s
frr-k8s-h6c6g                                                     6/6     Running     0          3m55s
frr-k8s-hxmlm                                                     6/6     Running     0          3m55s
frr-k8s-jbbh6                                                     6/6     Running     0          3m55s
frr-k8s-webhook-server-657f64d6-x8bcn                             1/1     Running     0          3m55s
metallb-operator-controller-manager-5d488bdd58-qtfv5              1/1     Running     0          4m3s
metallb-operator-webhook-server-7b577754b-rdnz6                   1/1     Running     0          4m3s
speaker-4lnbk                                                     2/2     Running     0          2m22s
speaker-54bmk                                                     2/2     Running     0          2m33s
speaker-kjz8j                                                     2/2     Running     0          2m53s
speaker-l7rzd                                                     2/2     Running     0          3m14s
speaker-np7q2                                                     2/2     Running     0          3m55s
speaker-ntmf5                                                     2/2     Running     0          3m34s

Red Hat OpenShift Container Platform 4.12

The rollout of version 4.18 of the MetalLB operator is stuck in an intermediate state. Administrators will see that the 4.18 MetalLB operator CSV is in the Pending phase and the 4.12 CSV is in the Replacing phase:

$ oc get csv,installplan -n metallb-system
NAME                                                                               DISPLAY            VERSION               REPLACES                               PHASE
clusterserviceversion.operators.coreos.com/metallb-operator.4.12.0-202602021306    MetalLB Operator   4.12.0-202602021306                                          Replacing
clusterserviceversion.operators.coreos.com/metallb-operator.v4.18.0-202601302238   MetalLB Operator   4.18.0-202601302238   metallb-operator.4.12.0-202602021306   Pending

NAME                                             CSV                                     APPROVAL    APPROVED
installplan.operators.coreos.com/install-5nt5m   metallb-operator.4.12.0-202602021306    Automatic   true
installplan.operators.coreos.com/install-mfcqm   metallb-operator.v4.18.0-202601302238   Automatic   true

However, the running pods should remain unaffected:

$ oc get pods -n metallb-system
NAME                                                              READY   STATUS      RESTARTS   AGE
a9b3ed1fe9273b725119dcfb777257f08e39bbefccdf592dce2d0dc2139llqw   0/1     Completed   0          12m
controller-6c44cdbb54-c8cvn                                       2/2     Running     0          15m
metallb-operator-controller-manager-5f9bf6b6d9-b5t78              1/1     Running     0          16m
metallb-operator-webhook-server-6dd5595446-jbvkl                  1/1     Running     0          16m
speaker-jsrdd                                                     6/6     Running     0          15m
speaker-krfsh                                                     6/6     Running     0          15m
speaker-n5ljf                                                     6/6     Running     0          15m
speaker-r84fj                                                     6/6     Running     0          15m
speaker-srrkh                                                     6/6     Running     0          15m
speaker-xn5fs                                                     6/6     Running     0          15m
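
The stuck Pending and Replacing phases described above can be listed with a small filter. This is a sketch under the assumption that the CSV .status.phase field holds the value shown in the PHASE column; the function name stuck_metallb_csvs is hypothetical.

```shell
# List MetalLB Operator CSVs whose phase is not Succeeded.
# Prints "<name> <phase>" per line and returns 0 if at least one was found,
# 1 otherwise.
stuck_metallb_csvs() {
  oc get csv -n metallb-system \
    -o jsonpath='{range .items[*]}{.metadata.name}{" "}{.status.phase}{"\n"}{end}' |
    awk '$1 ~ /^metallb-operator\./ && $2 != "Succeeded" {print; found=1}
         END {exit !found}'
}
```

On a cluster in the state shown above, stuck_metallb_csvs would be expected to print both the 4.12 and the 4.18 CSV.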

Resolution

The remediation strategy is to perform a targeted downgrade by uninstalling the 4.18 MetalLB Operator and reinstalling the version compatible with your cluster.

Note: This solution provides detailed steps for EUS versions 4.12, 4.14, and 4.16, as well as for version 4.17. For other versions, the resolution steps should be similar, but symptoms and system state may vary slightly.
CAUTION: This procedure involves at least one, but potentially two disruptive stages:

  1. When the 4.18 CSV is removed from the cluster, any resources owned by the CSV will be deleted, including the metallb-operator-controller-manager and metallb-operator-webhook-server Deployments, MetalLB ServiceAccounts, Roles, and RoleBindings. The speaker pods will remain running, maintaining active BGP and L2 sessions, but resources will no longer be reconciled.
  2. After recreating the original version subscription, all resources (Deployments, Secrets, Roles, RoleBindings, and the speaker DaemonSet) will be recreated. This recreation will bring all components back into a working state, but will cause a restart of the speaker pods and thus an interruption of BGP and L2 services.

Note: MetalLB deployed from 4.18 remains functional on OpenShift 4.14/4.16, but will not receive subsequent security updates, bug fixes, or feature enhancements. Downgrading MetalLB to match your OpenShift cluster version is mandatory for production use; otherwise, this configuration is not supported.
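
Because the stages above delete OLM objects, it can be useful to keep a copy of the current state for later comparison. The following sketch is an optional precaution and not part of the documented procedure; the function name backup_metallb_olm_state, its directory parameter, and the file naming scheme are assumptions.

```shell
# Save the current OLM objects of the metallb-system namespace to a YAML file.
# $1: target directory (hypothetical parameter, defaults to the current directory)
# Prints the path of the file that was written.
backup_metallb_olm_state() {
  backup_dir=${1:-.}
  backup_file="$backup_dir/metallb-olm-$(date +%Y%m%d%H%M%S).yaml"
  mkdir -p "$backup_dir" &&
    oc get sub,csv,installplan -n metallb-system -o yaml > "$backup_file" &&
    echo "$backup_file"
}
```

For example, backup_metallb_olm_state /tmp/metallb-backup writes one timestamped file into that directory.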

Remediation

Red Hat OpenShift Container Platform 4.17

Delete the current subscription and CSV (named metallb-operator.v4.18.0…). After this, the subscriptions, CSVs, and installplans should be removed for the metallb-operator:

$ oc delete subscription -n metallb-system metallb-operator-sub
$ oc delete csv -n metallb-system metallb-operator.v4.18.0-<...>
$ oc get sub,csv,installplan -n metallb-system
No resources found in metallb-system namespace.
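
The <...> placeholder in the delete command above stands for the build suffix of the installed CSV. As a sketch, the full name can be looked up instead of typed by hand; the function names find_metallb_418_csv and remove_metallb_418 are hypothetical, and the deletions follow the same order as the commands above.

```shell
# Print the names of all MetalLB Operator 4.18 CSVs in metallb-system.
find_metallb_418_csv() {
  oc get csv -n metallb-system -o name |
    sed -n 's|^clusterserviceversion\.operators\.coreos\.com/\(metallb-operator\.v4\.18\..*\)$|\1|p'
}

# Delete the subscription, then every 4.18 CSV that was found.
# Returns 1 without deleting anything if no 4.18 CSV exists.
remove_metallb_418() {
  csvs=$(find_metallb_418_csv)
  [ -n "$csvs" ] || { echo "no MetalLB Operator 4.18 CSV found" >&2; return 1; }
  oc delete subscription -n metallb-system metallb-operator-sub
  for csv in $csvs; do
    oc delete csv -n metallb-system "$csv"
  done
}
```

Running find_metallb_418_csv on its own first is a cheap way to confirm which CSV would be deleted before calling remove_metallb_418.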

Note: Disruptive stage 1.

The MetalLB controller-... and speaker-... pods should persist unchanged, and administrators will still see the frr-k8s-... pods. However, the metallb-operator-controller-manager-... and metallb-operator-webhook-server-... pods will be deleted:

$ oc get pods -n metallb-system
NAME                                    READY   STATUS    RESTARTS   AGE
controller-69d58c4b4c-tscrp             2/2     Running   0          6m16s
frr-k8s-64wkj                           6/6     Running   0          6m16s
frr-k8s-dx5w5                           6/6     Running   0          6m16s
frr-k8s-fpm9h                           6/6     Running   0          6m16s
frr-k8s-h6c6g                           6/6     Running   0          6m16s
frr-k8s-hxmlm                           6/6     Running   0          6m16s
frr-k8s-jbbh6                           6/6     Running   0          6m16s
frr-k8s-webhook-server-657f64d6-x8bcn   1/1     Running   0          6m16s
speaker-4lnbk                           2/2     Running   0          4m43s
speaker-54bmk                           2/2     Running   0          4m54s
speaker-kjz8j                           2/2     Running   0          5m14s
speaker-l7rzd                           2/2     Running   0          5m35s
speaker-np7q2                           2/2     Running   0          6m16s
speaker-ntmf5                           2/2     Running   0          5m55s

Now, reinstall the subscription:

$ cat <<'EOF' | oc apply -f - 
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: metallb-operator-sub
  namespace: metallb-system
spec:
  channel: stable
  name: metallb-operator
  source: redhat-operators
  sourceNamespace: openshift-marketplace
EOF

Note: Red Hat recommends creating subscriptions with .spec.installPlanApproval: Manual and manually patching the suggested install plan after a thorough review of the suggested operator upgrades with oc patch installplan -n metallb-system install-<suffix> --type merge --patch '{"spec":{"approved":true}}'
Note: You may want to set .spec.startingCSV, although this is not strictly necessary.
Note: The metallb-operator only has a single channel name, regardless of OpenShift version: stable
Note: Disruptive stage 2. After creating the original version subscription, all resources (Deployments, Secrets, Roles, RoleBindings, and the speaker DaemonSet) will be recreated. This recreation will bring all components back into a working state, but will cause a recreation of the speaker pods and thus may cause an interruption of BGP and L2 services.
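
The notes above on manual approval and startingCSV can be combined into one manifest. The following is a sketch only: the function names render_manual_subscription and approve_installplan are hypothetical, and any CSV name passed as startingCSV must be one that actually exists in the catalog for your cluster version.

```shell
# Print a Subscription manifest with manual install plan approval.
# $1: optional startingCSV (a CSV name matching the cluster version)
render_manual_subscription() {
  cat <<'EOF'
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: metallb-operator-sub
  namespace: metallb-system
spec:
  channel: stable
  name: metallb-operator
  source: redhat-operators
  sourceNamespace: openshift-marketplace
  installPlanApproval: Manual
EOF
  if [ -n "${1:-}" ]; then
    printf '  startingCSV: %s\n' "$1"
  fi
}

# Approve one install plan after it has been reviewed.
# $1: install plan name, for example install-<suffix>
approve_installplan() {
  oc patch installplan -n metallb-system "$1" --type merge \
    --patch '{"spec":{"approved":true}}'
}
```

The manifest can be piped into oc apply -f - in the same way as the example above. With manual approval, nothing is installed until the install plan listed by oc get installplan -n metallb-system has been reviewed and approved.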

Wait until the subscription and CSV indicate that version 4.17 has rolled out correctly.

$ oc get sub,csv,installplan -n metallb-system
NAME                                                     PACKAGE            SOURCE             CHANNEL
subscription.operators.coreos.com/metallb-operator-sub   metallb-operator   redhat-operators   stable

NAME                                                                               DISPLAY            VERSION               REPLACES   PHASE
clusterserviceversion.operators.coreos.com/metallb-operator.v4.17.0-202602251541   MetalLB Operator   4.17.0-202602251541              Succeeded

NAME                                             CSV                                     APPROVAL    APPROVED
installplan.operators.coreos.com/install-22bjz   metallb-operator.v4.17.0-202602251541   Automatic   true
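
Waiting for the rollout can be scripted as a simple polling loop. This is a minimal sketch, assuming that the CSV .spec.version and .status.phase fields match the VERSION and PHASE columns shown above; the function name wait_for_metallb_csv and the timeout and interval defaults are hypothetical.

```shell
# Poll until a MetalLB Operator CSV of the expected minor version is Succeeded.
# $1: expected minor version, for example 4.17
# $2: timeout in seconds (hypothetical default: 600)
# $3: poll interval in seconds (hypothetical default: 10)
wait_for_metallb_csv() {
  want=$1
  timeout=${2:-600}
  interval=${3:-10}
  elapsed=0
  while [ "$elapsed" -le "$timeout" ]; do
    if oc get csv -n metallb-system \
        -o jsonpath='{range .items[*]}{.metadata.name}{" "}{.spec.version}{" "}{.status.phase}{"\n"}{end}' |
        awk -v want="$want" '
          $1 ~ /^metallb-operator\./ && index($2, want ".") == 1 && $3 == "Succeeded" {ok=1}
          END {exit !ok}'; then
      return 0
    fi
    sleep "$interval"
    elapsed=$((elapsed + interval))
  done
  echo "timed out waiting for MetalLB Operator $want" >&2
  return 1
}
```

For example, wait_for_metallb_csv 4.17 returns 0 as soon as a 4.17 CSV reports Succeeded, and returns 1 once the timeout has passed.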

After downgrade:

$ oc get pods -n metallb-system
NAME                                                  READY   STATUS    RESTARTS   AGE
controller-5f85dbcbcc-hwnfd                           2/2     Running   0          2m48s
frr-k8s-27jms                                         6/6     Running   0          86s
frr-k8s-lrk8j                                         6/6     Running   0          2m48s
frr-k8s-pp4mk                                         6/6     Running   0          2m7s
frr-k8s-vtf7n                                         6/6     Running   0          66s
frr-k8s-webhook-server-64f54bfc9b-rl6kl               1/1     Running   0          2m48s
frr-k8s-wqwtm                                         6/6     Running   0          2m27s
frr-k8s-xkqw5                                         6/6     Running   0          106s
metallb-operator-controller-manager-7584859fc-p7szf   1/1     Running   0          3m5s
metallb-operator-webhook-server-7f4b998c85-k2ms6      1/1     Running   0          3m5s
speaker-9s6dx                                         2/2     Running   0          2m16s
speaker-fsq9x                                         2/2     Running   0          2m6s
speaker-h4gxh                                         2/2     Running   0          106s
speaker-jh7kd                                         2/2     Running   0          2m37s
speaker-r66bf                                         2/2     Running   0          2m48s
speaker-sx6hq                                         2/2     Running   0          2m27s

No further steps are necessary on Red Hat OpenShift Container Platform 4.17.

Red Hat OpenShift Container Platform 4.16

Delete the current subscription and CSV (named metallb-operator.v4.18.0…). After this, the subscriptions, CSVs, and installplans should be removed for the metallb-operator:

$ oc delete subscription -n metallb-system metallb-operator-sub
$ oc delete csv -n metallb-system metallb-operator.v4.18.0-<...>
$ oc get sub,csv,installplan -n metallb-system
No resources found in metallb-system namespace.

Note: Disruptive stage 1.

The MetalLB controller-... and speaker-... pods should persist unchanged, and administrators will still see the frr-k8s-... pods. However, the metallb-operator-controller-manager-... and metallb-operator-webhook-server-... pods will be deleted:

$ oc get pods -n metallb-system
NAME                                    READY   STATUS    RESTARTS   AGE
controller-69d58c4b4c-tscrp             2/2     Running   0          6m16s
frr-k8s-64wkj                           6/6     Running   0          6m16s
frr-k8s-dx5w5                           6/6     Running   0          6m16s
frr-k8s-fpm9h                           6/6     Running   0          6m16s
frr-k8s-h6c6g                           6/6     Running   0          6m16s
frr-k8s-hxmlm                           6/6     Running   0          6m16s
frr-k8s-jbbh6                           6/6     Running   0          6m16s
frr-k8s-webhook-server-657f64d6-x8bcn   1/1     Running   0          6m16s
speaker-4lnbk                           2/2     Running   0          4m43s
speaker-54bmk                           2/2     Running   0          4m54s
speaker-kjz8j                           2/2     Running   0          5m14s
speaker-l7rzd                           2/2     Running   0          5m35s
speaker-np7q2                           2/2     Running   0          6m16s
speaker-ntmf5                           2/2     Running   0          5m55s

Now, reinstall the subscription:

$ cat <<'EOF' | oc apply -f - 
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: metallb-operator-sub
  namespace: metallb-system
spec:
  channel: stable
  name: metallb-operator
  source: redhat-operators
  sourceNamespace: openshift-marketplace
EOF

Note: Red Hat recommends creating subscriptions with .spec.installPlanApproval: Manual and manually patching the suggested install plan after a thorough review of the suggested operator upgrades with oc patch installplan -n metallb-system install-<suffix> --type merge --patch '{"spec":{"approved":true}}'
Note: You may want to set .spec.startingCSV, although this is not strictly necessary.
Note: The metallb-operator only has a single channel name, regardless of OpenShift version: stable
Note: Disruptive stage 2. After creating the original version subscription, all resources (Deployments, Secrets, Roles, RoleBindings, and the speaker DaemonSet) will be recreated. This recreation will bring all components back into a working state, but will cause a recreation of the speaker pods and thus may cause an interruption of BGP and L2 services.

Wait until the subscription and CSV indicate that version 4.16 has rolled out correctly.

$ oc get sub,csv,installplan -n metallb-system
NAME                                                     PACKAGE            SOURCE             CHANNEL
subscription.operators.coreos.com/metallb-operator-sub   metallb-operator   redhat-operators   stable

NAME                                                                               DISPLAY            VERSION               REPLACES   PHASE
clusterserviceversion.operators.coreos.com/metallb-operator.v4.16.0-202601200116   MetalLB Operator   4.16.0-202601200116              Succeeded

NAME                                             CSV                                     APPROVAL    APPROVED
installplan.operators.coreos.com/install-4xtck   metallb-operator.v4.16.0-202601200116   Automatic   true

On OpenShift Container Platform Version 4.16, when the original 4.16 subscription is reinstalled, the OLM will take care of removing the superfluous frr-k8s components from the cluster. Make sure that the frr-k8s-... pods are removed and wait until the controller, metallb-operator-... and speaker-... pods are rolled out again. Before downgrade:

$ oc get pods -n metallb-system
NAME                                                              READY   STATUS      RESTARTS   AGE
controller-69d58c4b4c-7w7mk                                       2/2     Running     0          5m32s
frr-k8s-69tpl                                                     6/6     Running     0          5m32s
frr-k8s-7wz8g                                                     6/6     Running     0          5m32s
frr-k8s-b9ml4                                                     6/6     Running     0          5m32s
frr-k8s-fjp4r                                                     6/6     Running     0          5m32s
frr-k8s-mr6kv                                                     6/6     Running     0          5m32s
frr-k8s-qcrm6                                                     6/6     Running     0          5m32s
frr-k8s-webhook-server-657f64d6-jvkdp                             1/1     Running     0          5m32s
metallb-operator-controller-manager-7b7f994f99-6prlk              0/1     Running     0          10s
metallb-operator-webhook-server-67f8cdd58c-plf7l                  0/1     Running     0          9s
speaker-4j9j8                                                     2/2     Running     0          4m29s
speaker-95vqk                                                     2/2     Running     0          5m11s
speaker-cs7rv                                                     2/2     Running     0          4m50s
speaker-n8w6d                                                     2/2     Running     0          4m9s
speaker-sblmz                                                     2/2     Running     0          5m32s
speaker-xj2sk                                                     2/2     Running     0          3m48s

After downgrade:

$ oc get pods -n metallb-system
NAME                                                   READY   STATUS    RESTARTS   AGE
controller-5566d6c698-62pv9                            2/2     Running   0          2m55s
metallb-operator-controller-manager-7b7f994f99-6prlk   1/1     Running   0          3m14s
metallb-operator-webhook-server-67f8cdd58c-plf7l       1/1     Running   0          3m13s
speaker-8bh8p                                          6/6     Running   0          112s
speaker-8lkzd                                          6/6     Running   0          2m34s
speaker-crfgv                                          6/6     Running   0          2m14s
speaker-f8p4l                                          6/6     Running   0          92s
speaker-qttkj                                          6/6     Running   0          71s
speaker-wlr5j                                          6/6     Running   0          2m55s

Delete the superfluous servicel2statuses CRD:

$ oc delete crd servicel2statuses.metallb.io

Red Hat OpenShift Container Platform 4.14

Delete the current subscription and CSV (named metallb-operator.v4.18.0…). After this, the subscriptions, CSVs, and installplans should be removed for the metallb-operator:

$ oc delete subscription -n metallb-system metallb-operator-sub
$ oc delete csv -n metallb-system metallb-operator.v4.18.0-<...>
$ oc get sub,csv,installplan -n metallb-system
No resources found in metallb-system namespace.

Note: Disruptive stage 1.

The MetalLB controller-... and speaker-... pods should persist unchanged, and administrators will still see the frr-k8s-... pods. However, the metallb-operator-controller-manager-... and metallb-operator-webhook-server-... pods will be deleted:

$ oc get pods -n metallb-system
NAME                                    READY   STATUS    RESTARTS   AGE
controller-69d58c4b4c-tscrp             2/2     Running   0          6m16s
frr-k8s-64wkj                           6/6     Running   0          6m16s
frr-k8s-dx5w5                           6/6     Running   0          6m16s
frr-k8s-fpm9h                           6/6     Running   0          6m16s
frr-k8s-h6c6g                           6/6     Running   0          6m16s
frr-k8s-hxmlm                           6/6     Running   0          6m16s
frr-k8s-jbbh6                           6/6     Running   0          6m16s
frr-k8s-webhook-server-657f64d6-x8bcn   1/1     Running   0          6m16s
speaker-4lnbk                           2/2     Running   0          4m43s
speaker-54bmk                           2/2     Running   0          4m54s
speaker-kjz8j                           2/2     Running   0          5m14s
speaker-l7rzd                           2/2     Running   0          5m35s
speaker-np7q2                           2/2     Running   0          6m16s
speaker-ntmf5                           2/2     Running   0          5m55s

Now, reinstall the subscription:

$ cat <<'EOF' | oc apply -f - 
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: metallb-operator-sub
  namespace: metallb-system
spec:
  channel: stable
  name: metallb-operator
  source: redhat-operators
  sourceNamespace: openshift-marketplace
EOF

Note: Red Hat recommends creating subscriptions with .spec.installPlanApproval: Manual and manually patching the suggested install plan after a thorough review of the suggested operator upgrades with oc patch installplan -n metallb-system install-<suffix> --type merge --patch '{"spec":{"approved":true}}'
Note: You may want to set .spec.startingCSV, although this is not strictly necessary.
Note: The metallb-operator only has a single channel name, regardless of OpenShift version: stable
Note: Disruptive stage 2. After creating the original version subscription, all resources (Deployments, Secrets, Roles, RoleBindings, and the speaker DaemonSet) will be recreated. This recreation will bring all components back into a working state, but will cause a recreation of the speaker pods and thus may cause an interruption of BGP and L2 services.

Wait until the subscription and installplan indicate that version 4.14 has rolled out correctly.

$ oc get sub,csv,installplan -n metallb-system
NAME                                                     PACKAGE            SOURCE             CHANNEL
subscription.operators.coreos.com/metallb-operator-sub   metallb-operator   redhat-operators   stable

NAME                                                                               DISPLAY            VERSION               REPLACES   PHASE
clusterserviceversion.operators.coreos.com/metallb-operator.v4.14.0-202601210114   MetalLB Operator   4.14.0-202601210114              Succeeded

NAME                                             CSV                                     APPROVAL    APPROVED
installplan.operators.coreos.com/install-llc76   metallb-operator.v4.14.0-202601210114   Automatic   true

Wait until the controller, metallb-operator-... and speaker-... pods are rolled out again. Before downgrade:

$ oc get pods -n metallb-system
NAME                                                              READY   STATUS      RESTARTS   AGE
controller-666c6fc987-mdjqv                                       2/2     Running     0          9m49s
frr-k8s-5blkw                                                     6/6     Running     0          9m50s
frr-k8s-btb67                                                     6/6     Running     0          9m50s
frr-k8s-c2n79                                                     6/6     Running     0          9m50s
frr-k8s-n7xf9                                                     6/6     Running     0          9m50s
frr-k8s-twprl                                                     6/6     Running     0          9m50s
frr-k8s-vmbw5                                                     6/6     Running     0          9m50s
frr-k8s-webhook-server-6c4d448f8d-vncgk                           1/1     Running     0          9m50s
speaker-8mtsg                                                     2/2     Running     0          8m27s
speaker-8vdh2                                                     2/2     Running     0          8m7s
speaker-gjl56                                                     2/2     Running     0          8m48s
speaker-nwjb6                                                     2/2     Running     0          9m19s
speaker-vmrcz                                                     2/2     Running     0          9m49s
speaker-x6fxn                                                     2/2     Running     0          8m58s

After downgrade (note the lower AGE values of the recreated controller, metallb-operator-... and speaker-... pods):

$ oc get pods -n metallb-system
NAME                                                  READY   STATUS    RESTARTS   AGE
controller-788574f66d-h2v9c                           2/2     Running   0          2m30s
frr-k8s-5blkw                                         6/6     Running   0          13m
frr-k8s-btb67                                         6/6     Running   0          13m
frr-k8s-c2n79                                         6/6     Running   0          13m
frr-k8s-n7xf9                                         6/6     Running   0          13m
frr-k8s-twprl                                         6/6     Running   0          13m
frr-k8s-vmbw5                                         6/6     Running   0          13m
frr-k8s-webhook-server-6c4d448f8d-vncgk               1/1     Running   0          13m
metallb-operator-controller-manager-c64f699b4-hk2tm   1/1     Running   0          2m49s
metallb-operator-webhook-server-59cbfd9896-kbcbz      1/1     Running   0          2m49s
speaker-7bt5n                                         6/6     Running   0          109s
speaker-7pv2x                                         6/6     Running   0          2m9s
speaker-9zj22                                         6/6     Running   0          2m30s
speaker-h2xtj                                         6/6     Running   0          88s
speaker-j6knt                                         6/6     Running   0          47s
speaker-z4kfv                                         6/6     Running   0          68s

On OpenShift Container Platform Version 4.14, when the original 4.14 subscription is reinstalled, the OLM will not automatically remove the superfluous frr-k8s components from the cluster. Administrators will need to remove the frr-k8s components manually:

$ oc delete daemonset -n metallb-system frr-k8s
$ oc delete deployment -n metallb-system frr-k8s-webhook-server
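
To confirm that the manual removal above has completed, the remaining frr-k8s pods can be counted. This is a small sketch; the function name count_frr_k8s_pods is hypothetical.

```shell
# Count the pods in metallb-system whose name starts with frr-k8s.
# Always prints a number; "|| true" keeps the exit status at 0 when the count is 0.
count_frr_k8s_pods() {
  oc get pods -n metallb-system -o name | grep -c '^pod/frr-k8s' || true
}
```

A result of 0 indicates that both the frr-k8s DaemonSet pods and the frr-k8s-webhook-server pod are gone; pods that are still terminating continue to be counted until they disappear.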

Delete the superfluous CRDs:

$ oc delete crd servicel2statuses.metallb.io
$ oc delete crd frrnodestates.frrk8s.metallb.io
$ oc delete crd frrconfigurations.frrk8s.metallb.io
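
If the cleanup may be run more than once, the three deletions above can be wrapped so that an already removed CRD does not produce an error. This is a sketch; the function name delete_superfluous_crds is hypothetical. Keep in mind that deleting a CRD also deletes every custom resource of that type.

```shell
# Delete the CRDs named in the step above, skipping any that are already absent.
delete_superfluous_crds() {
  for crd in servicel2statuses.metallb.io \
             frrnodestates.frrk8s.metallb.io \
             frrconfigurations.frrk8s.metallb.io; do
    oc delete crd "$crd" --ignore-not-found
  done
}
```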

Red Hat OpenShift Container Platform 4.12

Delete the current subscription and CSV (named metallb-operator.v4.18.0…):

$ oc delete subscription -n metallb-system metallb-operator-sub
$ oc delete csv -n metallb-system metallb-operator.v4.18.0-<...>

The 4.12 CSV will still show as Replacing; therefore, delete this one as well:

$ oc get sub,csv,installplan -n metallb-system
NAME                                                                              DISPLAY            VERSION               REPLACES   PHASE
clusterserviceversion.operators.coreos.com/metallb-operator.4.12.0-202602021306   MetalLB Operator   4.12.0-202602021306              Replacing

$ oc delete csv -n metallb-system metallb-operator.4.12.0-202602021306
clusterserviceversion.operators.coreos.com "metallb-operator.4.12.0-202602021306" deleted

After this, the subscriptions, CSVs, and installplans should be removed for the metallb-operator.

$ oc get sub,csv,installplan -n metallb-system
No resources found in metallb-system namespace.

Note: Disruptive stage 1. When the 4.18 and 4.12 CSVs are removed from the cluster, any resources owned by the CSVs will be deleted, including the metallb-operator-controller-manager and metallb-operator-webhook-server Deployments, MetalLB ServiceAccounts, Roles, and RoleBindings. The speaker pods will remain running, maintaining active BGP and L2 sessions.

MetalLB controller-... and speaker-... resources should persist and should not change; note however that the metallb-operator-controller-manager-... and metallb-operator-webhook-server-... pods will be deleted.

$ oc get pods -n metallb-system
NAME                                                              READY   STATUS      RESTARTS   AGE
controller-6c44cdbb54-c8cvn                                       2/2     Running     0          18m
speaker-jsrdd                                                     6/6     Running     0          18m
speaker-krfsh                                                     6/6     Running     0          18m
speaker-n5ljf                                                     6/6     Running     0          18m
speaker-r84fj                                                     6/6     Running     0          18m
speaker-srrkh                                                     6/6     Running     0          18m
speaker-xn5fs                                                     6/6     Running     0          18m

Now, reinstall the subscription:

$ cat <<'EOF' | oc apply -f -
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: metallb-operator-sub
  namespace: metallb-system
spec:
  channel: stable
  name: metallb-operator
  source: redhat-operators
  sourceNamespace: openshift-marketplace
EOF

Note: Red Hat recommends creating subscriptions with .spec.installPlanApproval: Manual and manually patching the suggested install plan after a thorough review of the suggested operator upgrades with oc patch installplan -n metallb-system install-<suffix> --type merge --patch '{"spec":{"approved":true}}'
Note: You may want to set .spec.startingCSV, although this is not strictly necessary.
Note: The metallb-operator only has a single channel name, regardless of OpenShift version: stable
Note: Disruptive stage 2. After creating the original version subscription, resources (Deployments, Secrets, Roles, RoleBindings) will be recreated. The speaker DaemonSet will only be recreated if the new 4.12 version is newer than the previously installed 4.12 version. In such a scenario, BGP and L2 sessions may suffer a temporary disruption.

Wait until the subscription and installplan indicate that version 4.12 has rolled out correctly:

$ oc get sub,csv,installplan -n metallb-system
NAME                                                     PACKAGE            SOURCE             CHANNEL
subscription.operators.coreos.com/metallb-operator-sub   metallb-operator   redhat-operators   stable

NAME                                                                              DISPLAY            VERSION               REPLACES   PHASE
clusterserviceversion.operators.coreos.com/metallb-operator.4.12.0-202602021306   MetalLB Operator   4.12.0-202602021306              Succeeded

NAME                                             CSV                                    APPROVAL    APPROVED
installplan.operators.coreos.com/install-5nt5m   metallb-operator.4.12.0-202602021306   Automatic   true

Wait until the metallb-operator-controller-manager-... and metallb-operator-webhook-server-... pods roll out again.

Delete the superfluous CRDs:

$ oc delete crd servicel2statuses.metallb.io
$ oc delete crd frrnodestates.frrk8s.metallb.io
$ oc delete crd frrconfigurations.frrk8s.metallb.io
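
If the cleanup may be run more than once, the three deletions above can be wrapped in a loop that tolerates CRDs that are already gone. This is a sketch only; the function name is hypothetical, and --ignore-not-found is the standard oc delete flag that suppresses "not found" errors.

```shell
# Delete the three superfluous CRDs named above.
# --ignore-not-found makes the step safe to repeat.
delete_superfluous_crds() {
  for sc_crd in \
    servicel2statuses.metallb.io \
    frrnodestates.frrk8s.metallb.io \
    frrconfigurations.frrk8s.metallb.io
  do
    oc delete crd "$sc_crd" --ignore-not-found || return 1
  done
}
```

Deleting a CRD also deletes every custom resource of that kind, so run this only on the cluster versions for which this step is documented.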

Verification

  • Verify that subscriptions, CSVs, and installplans all eventually converge to the correct version and that the frr-k8s-... pods have been removed. On Red Hat OpenShift Container Platform 4.12, if the 4.12 CSV is at the same version as before the attempted upgrade to 4.18, the speaker-... pods are not redeployed because the DaemonSet is unchanged. If the 4.12 CSV version was upgraded, the speaker-... pods are redeployed as well:

    $ oc get sub,csv,installplan -n metallb-system
    NAME                                                     PACKAGE            SOURCE             CHANNEL
    subscription.operators.coreos.com/metallb-operator-sub   metallb-operator   redhat-operators   stable
    
    NAME                                                                               DISPLAY            VERSION               REPLACES   PHASE
    clusterserviceversion.operators.coreos.com/metallb-operator.v4.16.0-202601200116   MetalLB Operator   4.16.0-202601200116              Succeeded
    
    NAME                                             CSV                                     APPROVAL    APPROVED
    installplan.operators.coreos.com/install-4xtck   metallb-operator.v4.16.0-202601200116   Automatic   true
    $ oc get pods -n metallb-system
    NAME                                                   READY   STATUS    RESTARTS   AGE
    controller-5566d6c698-ccc6b                            2/2     Running   0          3m6s
    metallb-operator-controller-manager-7c86745fc9-4b72d   1/1     Running   0          3m25s
    metallb-operator-webhook-server-758b7f88c5-2x5rw       1/1     Running   0          3m25s
    speaker-2nfqp                                          6/6     Running   0          103s
    speaker-5wsvb                                          6/6     Running   0          2m24s
    speaker-64p6k                                          6/6     Running   0          2m3s
    speaker-fp4cd                                          6/6     Running   0          82s
    speaker-j7549                                          6/6     Running   0          3m6s
    speaker-s4mh9                                          6/6     Running   0          2m45s
    
  • Verify that the L2/L3 services managed by MetalLB work as expected.

  • Verify that MetalLB custom resources still exist.

  • If BGP is configured, verify that the FRR configuration is correct by connecting to one of the speaker pods that should announce BGP advertisements to peers, e.g.:

    $ oc exec -n metallb-system speaker-5wsvb -c frr -- vtysh -c 'show run'
    Building configuration...
    
    Current configuration:
    !
    frr version 8.5.3
    frr defaults traditional
    hostname ci-ln-qsx54qb-72292-vbb7z-worker-c-x2dk8
    log file /etc/frr/frr.log informational
    log timestamp precision 3
    no ip forwarding
    no ipv6 forwarding
    service integrated-vtysh-config
    !
    router bgp 64512
     no bgp ebgp-requires-policy
     no bgp suppress-duplicates
     no bgp hard-administrative-reset
     no bgp default ipv4-unicast
     no bgp graceful-restart notification
     no bgp network import-check
     neighbor 172.30.0.3 remote-as 64512
     neighbor 172.30.0.3 port 180
     neighbor 172.30.0.3 timers 30 90
     neighbor 2000::1 remote-as 64512
     neighbor 2000::1 timers 30 90
     !
     address-family ipv4 unicast
      network 198.51.100.0/32
      network 198.51.100.1/32
      neighbor 172.30.0.3 activate
      neighbor 172.30.0.3 route-map 172.30.0.3-in in
      neighbor 172.30.0.3 route-map 172.30.0.3-out out
      neighbor 2000::1 activate
      neighbor 2000::1 route-map 2000::1-in in
      neighbor 2000::1 route-map 2000::1-out out
     exit-address-family
     !
     address-family ipv6 unicast
      neighbor 172.30.0.3 activate
      neighbor 172.30.0.3 route-map 172.30.0.3-in in
      neighbor 172.30.0.3 route-map 172.30.0.3-out out
      neighbor 2000::1 activate
      neighbor 2000::1 route-map 2000::1-in in
      neighbor 2000::1 route-map 2000::1-out out
     exit-address-family
    exit
    (...)
    
  • Verify that the deleted CRDs no longer exist:

    • On 4.16, administrators should see:

      $ oc get crd | grep -iE 'metallb|frr'
      bfdprofiles.metallb.io                2026-02-16T10:16:54Z
      bgpadvertisements.metallb.io          2026-02-16T10:16:54Z
      bgppeers.metallb.io                   2026-02-16T10:16:54Z
      communities.metallb.io                2026-02-16T10:16:54Z
      frrconfigurations.frrk8s.metallb.io   2026-02-16T10:16:55Z
      frrnodestates.frrk8s.metallb.io       2026-02-16T10:16:55Z
      ipaddresspools.metallb.io             2026-02-16T10:16:54Z
      l2advertisements.metallb.io           2026-02-16T10:16:54Z
      metallbs.metallb.io                   2026-02-16T10:16:55Z
      servicel2statuses.metallb.io          2026-02-16T10:30:38Z
      
    • On 4.14 and 4.12, administrators should see:

      $ oc get crd | grep -iE 'metallb|frr'
      addresspools.metallb.io                                           2026-02-16T11:27:17Z
      bfdprofiles.metallb.io                                            2026-02-16T11:27:17Z
      bgpadvertisements.metallb.io                                      2026-02-16T11:27:18Z
      bgppeers.metallb.io                                               2026-02-16T11:27:17Z
      communities.metallb.io                                            2026-02-16T11:27:18Z
      ipaddresspools.metallb.io                                         2026-02-16T11:27:17Z
      l2advertisements.metallb.io                                       2026-02-16T11:27:17Z
      metallbs.metallb.io                                               2026-02-16T11:27:17Z
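
Two of the checks in this list can be automated by filtering the command output. The helpers below are a sketch with hypothetical function names. Each one reads command output on standard input and succeeds only if nothing unexpected is listed. The CRD check applies to 4.14 and 4.12 only, because the expected 4.16 listing above still contains those CRDs.

```shell
# Reads `oc get pods -n metallb-system` output on stdin.
# Succeeds if no pod name starts with frr-k8s-.
no_frr_k8s_pods() {
  ! grep -q '^frr-k8s-'
}

# Reads `oc get crd` output on stdin (4.14 and 4.12 only).
# Succeeds if none of the three deleted CRDs is still listed.
no_leftover_crds() {
  ! grep -qE '^(servicel2statuses\.metallb\.io|frrnodestates\.frrk8s\.metallb\.io|frrconfigurations\.frrk8s\.metallb\.io)([[:space:]]|$)'
}

# Example usage against a live cluster:
#   oc get pods -n metallb-system | no_frr_k8s_pods || echo "frr-k8s pods still present"
#   oc get crd | no_leftover_crds || echo "superfluous CRDs still present"
```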
      

Root Cause

For several hours on February 3, 2026, Red Hat released 4.18 Red Hat Operators catalog content into 4.12-4.17 clusters.

For more information, including Diagnostic Steps, refer to Red Hat Operator has version higher than the cluster version.

Diagnostic Steps

Refer to Red Hat Operator has version higher than the cluster version.
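
As a quick local check, the condition described in this article (an operator version higher than the cluster version) can be expressed as a comparison of the major and minor version numbers. This is a sketch and not a replacement for the referenced diagnostic steps. The function name is hypothetical, and the jsonpath expressions in the usage comments are one assumed way to obtain the two version strings.

```shell
# Succeeds when the operator's major.minor version is higher than the
# cluster's major.minor version.
# $1: cluster version, for example 4.16.30
# $2: operator CSV version, for example 4.18.0-202601302238
operator_newer_than_cluster() {
  c_major=$(echo "$1" | cut -d. -f1)
  c_minor=$(echo "$1" | cut -d. -f2)
  o_major=$(echo "$2" | cut -d. -f1)
  o_minor=$(echo "$2" | cut -d. -f2)
  [ "$o_major" -gt "$c_major" ] ||
    { [ "$o_major" -eq "$c_major" ] && [ "$o_minor" -gt "$c_minor" ]; }
}

# Example usage against a live cluster (replace <csv-name> with the CSV
# shown by `oc get csv -n metallb-system`):
#   cluster=$(oc get clusterversion version -o jsonpath='{.status.desired.version}')
#   operator=$(oc get csv -n metallb-system <csv-name> -o jsonpath='{.spec.version}')
#   operator_newer_than_cluster "$cluster" "$operator" && echo "operator is newer than the cluster"
```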


This solution is part of Red Hat’s fast-track publication program, providing a huge library of solutions that Red Hat engineers have created while supporting our customers. To give you the knowledge you need the instant it becomes available, these articles may be presented in a raw and unedited form.