Decreasing numberOfInstances and editing any parameters hangs operator in Updating status #1556

Open
@telepenin

I'm trying to decrease numberOfInstances (2 -> 1) and change resources.limits.cpu (500m -> 600m) in the postgresql resource after provisioning the minimal configuration with the latest operator (v1.6.3).

How to reproduce the issue:

  • install the latest postgres operator (v1.6.3 at the moment)
  • use postgres-operator/manifests/minimal-postgres-manifest.yaml, extended with postgresql.parameters and resources sections that are edited in the next steps:
apiVersion: "acid.zalan.do/v1"
kind: postgresql
metadata:
  name: acid-minimal-cluster
  namespace: default
spec:
  teamId: "acid"
  volume:
    size: 1Gi
  numberOfInstances: 2
  users:
    zalando:  # database owner
    - superuser
    - createdb
    foo_user: []  # role for application foo
  databases:
    foo: zalando  # dbname: owner
  preparedDatabases:
    bar: {}
  postgresql:
    version: "13"
    parameters:  # Expert section
      shared_buffers: "32MB"
      max_connections: "10"
      log_statement: "all"

  resources:
    requests:
      cpu: 10m
      memory: 100Mi
    limits:
      cpu: 500m
      memory: 500Mi
  • kubectl apply -f pg-manifest.yaml
  • once the cluster status is Running, decrease numberOfInstances to 1 and increase resources.limits.cpu to 600m (I also hit this issue when editing anything in the postgresql.parameters section)
  • the cluster hangs in the Updating status
root@ntelepenin:~# k get pg -w
NAME                   TEAM   VERSION   PODS   VOLUME   CPU-REQUEST   MEMORY-REQUEST   AGE   STATUS
acid-minimal-cluster   acid   13        2      1Gi      10m           100Mi            0s    
acid-minimal-cluster   acid   13        2      1Gi      10m           100Mi            0s    Creating
acid-minimal-cluster   acid   13        2      1Gi      10m           100Mi            44s   Running
acid-minimal-cluster   acid   13        1      1Gi      10m           100Mi            3m    Running
acid-minimal-cluster   acid   13        1      1Gi      10m           100Mi            3m    Updating
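
The two edits from the steps above can also be sent as a single request. A minimal sketch, assuming kubectl access to the default namespace; the PATCH variable and the apply_patch function are hypothetical names introduced for illustration, while the resource name and values come from the manifest above:

```shell
# JSON merge patch: scale down to one instance and raise the CPU limit.
PATCH='{"spec":{"numberOfInstances":1,"resources":{"limits":{"cpu":"600m"}}}}'

# Hypothetical helper; call it once the cluster status is Running.
apply_patch() {
  kubectl patch postgresql acid-minimal-cluster \
    --namespace default --type merge --patch "$PATCH"
}
```

Calling apply_patch while watching with k get pg -w is one way to trigger both changes at the same moment instead of editing the manifest by hand.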

operator log at this time - https://pastebin.com/zTHRa2YS

Status after 10 minutes:

root@ntelepenin:~# k get pg -w
NAME                   TEAM   VERSION   PODS   VOLUME   CPU-REQUEST   MEMORY-REQUEST   AGE   STATUS
acid-minimal-cluster   acid   13        2      1Gi      10m           100Mi            0s    
acid-minimal-cluster   acid   13        2      1Gi      10m           100Mi            0s    Creating
acid-minimal-cluster   acid   13        2      1Gi      10m           100Mi            44s   Running
acid-minimal-cluster   acid   13        1      1Gi      10m           100Mi            3m    Running
acid-minimal-cluster   acid   13        1      1Gi      10m           100Mi            3m    Updating
acid-minimal-cluster   acid   13        1      1Gi      10m           100Mi            13m   UpdateFailed
acid-minimal-cluster   acid   13        1      1Gi      10m           100Mi            13m   Running

operator log - https://pastebin.com/TrAg9CtJ

time="2021-07-08T07:56:03Z" level=debug msg="unsubscribing from pod \"default/acid-minimal-cluster-1\" events" cluster-name=default/acid-minimal-cluster pkg=cluster worker=1
time="2021-07-08T07:56:03Z" level=error msg="could not sync statefulsets: could not recreate pods: could not recreate replica pod \"default/acid-minimal-cluster-1\": pod label wait timeout" cluster-name=default/acid-minimal-cluster pkg=cluster worker=1
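
The error above says the operator timed out waiting for a pod label. A sketch for inspecting the labels on the cluster's pods, assuming the operator's default label names (cluster-name and spilo-role, both configurable, so they may differ in other setups); SELECTOR and show_pod_roles are hypothetical names used only for illustration:

```shell
# Select all pods that belong to the cluster (assumes the default cluster-name label).
SELECTOR='cluster-name=acid-minimal-cluster'

# Hypothetical helper: list the pods with their spilo-role label as an extra column.
show_pod_roles() {
  kubectl get pods --namespace default \
    --selector "$SELECTOR" --label-columns spilo-role
}
```

An empty SPILO-ROLE column for a recreated pod would be consistent with the timeout in the log, but that is an assumption, not something the logs above confirm.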

Could be relevant - #1362

Which image of the operator are you using? Using the latest master's configuration
Where do you run it: cloud/kvm - minikube, cloud/kvm - k3s
Are you running Postgres Operator in production? no, but faced it on the k3s deploy
Type of issue? Bug report
