
Recreation of DB clusters due to changing nodeAffinity term order #1996

Open
@ljcesca

Description

  • Which image of the operator are you using? registry.opensource.zalan.do/acid/postgres-operator:v1.8.2
  • Where do you run it - cloud or metal? Kubernetes or OpenShift? AWS EKS
  • Are you running Postgres Operator in production? yes
  • Type of issue? Bug report

We experienced an issue similar to #924 due to changes in ordering of nodeAffinity.requiredDuringSchedulingIgnoredDuringExecution.nodeSelectorTerms.matchExpressions across syncs of clusters.

Our operator configuration was setting two node_readiness_labels via:

node_readiness_label:
  kubernetes.io/arch: amd64
  postgres-cluster: "1"

And an additional label via the Cluster spec:

nodeAffinity:
  requiredDuringSchedulingIgnoredDuringExecution:
    nodeSelectorTerms:
    - matchExpressions:
      - key: postgres-plan-small
        operator: In
        values:
        - "1"

We've worked around this for now by removing the node_readiness_label configuration, but would like to be able to use it again in the future.

We were able to capture the StatefulSet before and after the sync and confirmed that the order of nodeAffinity.requiredDuringSchedulingIgnoredDuringExecution.nodeSelectorTerms.matchExpressions was changing, which caused the cluster to be re-synced due to Cluster.compareStatefulSetWith.

I'm happy to work on a PR that fixes this if you agree that changing Cluster.compareStatefulSetWith is the appropriate approach. Thanks!
