Update ceph-with-rook.md #11120

Open · suse-coder wants to merge 1 commit into main

Conversation

suse-coder (Contributor)

Add the nbd module that one has to enable, and single-node setup instructions

Pull Request

What? (description)

Why? (reasoning)

Acceptance

Please use the following checklist:

  • you linked an issue (if applicable)
  • you included tests (if applicable)
  • you ran conformance (make conformance)
  • you formatted your code (make fmt)
  • you linted your code (make lint)
  • you generated documentation (make docs)
  • you ran unit-tests (make unit-tests)

See make help for a description of the available targets.

add nbd module one has to enable and single node setup

Signed-off-by: suse-coder <[email protected]>
Comment on lines +121 to +135
### 1. Enable the Ceph Kernel Module

Talos includes the `nbd` kernel module, but it needs to be explicitly enabled.

**Create a patch file** (`patch.values.yaml`):

```yaml
machine:
  kernel:
    modules:
      - name: nbd
```

**Apply the kernel module patch**:

```shell
talosctl -n 192.168.178.79 patch mc --patch @./terraform/talos/patch/patch.yaml
```
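
After the patch is applied, one quick way to confirm the module is actually loaded is to read `/proc/modules` through `talosctl`; a sketch, reusing the example node IP from above:

```shell
# Check that the nbd module shows up among the node's loaded kernel modules
# (node IP taken from the example above).
talosctl -n 192.168.178.79 read /proc/modules | grep nbd
```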


The krbd kernel module works by default. Is there a reason to switch to rbd-nbd, and does the Rook Ceph cluster choose the nbd module over krbd if it's available?

suse-coder (Contributor, Author)

When I installed Ceph it reported an error that nbd was not enabled, so I enabled it. nbd is more modern and offers a much better way to snapshot volumes without ending up with inconsistent journals: https://engineering.salesforce.com/mapping-kubernetes-ceph-volumes-the-rbd-nbd-way-21f7c4161f04/
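
On the reviewer's question of whether Rook actually picks nbd: as far as I understand ceph-csi, krbd is used unless the RBD StorageClass explicitly opts in via the `mounter` parameter. A rough sketch (the StorageClass name and pool are made-up examples; the secret names are the usual Rook defaults):

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: ceph-rbd-nbd                    # hypothetical name
provisioner: rook-ceph.rbd.csi.ceph.com
parameters:
  clusterID: rook-ceph
  pool: replicapool                     # assumption: replace with your RBD pool
  mounter: rbd-nbd                      # opt in to rbd-nbd instead of the default krbd mapping
  csi.storage.k8s.io/provisioner-secret-name: rook-csi-rbd-provisioner
  csi.storage.k8s.io/provisioner-secret-namespace: rook-ceph
  csi.storage.k8s.io/node-stage-secret-name: rook-csi-rbd-node
  csi.storage.k8s.io/node-stage-secret-namespace: rook-ceph
  csi.storage.k8s.io/fstype: ext4
reclaimPolicy: Delete
```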


That post seems fairly old, since it references Ceph Jewel and CentOS 7. Although rbd-nbd could provide earlier access to newer features, I believe there might be a performance impact versus using the krbd module. I've provisioned multiple Talos servers (1.9.x - 1.10.x) with Rook Ceph and did not need any additional kernel modules loaded. What versions of Talos and Rook are you using? Maybe I can spin up a quick test to confirm this.

suse-coder (Contributor, Author)

Talos: v1.10.3
rook-ceph (operator): 1.17.2
rook-ceph-cluster: 1.17.2

suse-coder (Contributor, Author) · May 30, 2025

values.yaml:

storage:
  useAllNodes: false
  useAllDevices: true
  config:
    allowMultiplePerNode: true
  nodes:
    - name: talos-mec-lba


placement:
  all:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
          - matchExpressions:
              - key: kubernetes.io/hostname
                operator: In
                values:
                  - talos-mec-lba
    tolerations:
      - key: "node-role.kubernetes.io/control-plane"
        operator: "Exists"
        effect: "NoSchedule"

cephClusterSpec:
  mon:
    count: 1
    allowMultiplePerNode: true
  mgr:
    count: 1
    allowMultiplePerNode: true
  mds:
    count: 0
    allowMultiplePerNode: true
  rgw:
    count: 0
    allowMultiplePerNode: true
  crashCollector:
    disable: true
  dashboard:
    enabled: true
  pool:
    replicated:
      size: 1
      minSize: 1

cephCSI:
  csiCephFS:
    provisionerReplicas: 1
    pluginReplicas: 1
    placement:
      podAntiAffinity: null
  csiRBD:
    provisionerReplicas: 1
    pluginReplicas: 1
    placement:
      podAntiAffinity: null
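
For context, values like these are normally passed to the rook-ceph-cluster Helm chart; a sketch of how that install might look, assuming the standard Rook chart repository (not something stated in this thread):

```shell
# Assumed standard Rook Helm repo: operator chart first, then the cluster chart
# with the single-node values shown above.
helm repo add rook-release https://charts.rook.io/release
helm upgrade --install rook-ceph rook-release/rook-ceph \
  --namespace rook-ceph --create-namespace
helm upgrade --install rook-ceph-cluster rook-release/rook-ceph-cluster \
  --namespace rook-ceph -f values.yaml
```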


snapshotclass.yaml:
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshotClass
metadata:
  name: csi-ceph-rbd-snapclass
  annotations:
    k10.kasten.io/is-snapshot-class: "true"
driver: rook-ceph.rbd.csi.ceph.com
deletionPolicy: Delete
parameters:
  csi.storage.k8s.io/snapshotter-secret-name: rook-csi-rbd-provisioner
  csi.storage.k8s.io/snapshotter-secret-namespace: rook-ceph
  clusterID: rook-ceph
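
Once that class exists, a snapshot request referencing it would look roughly like this (the PVC and snapshot names are hypothetical):

```yaml
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
  name: my-pvc-snapshot                 # hypothetical name
  namespace: default
spec:
  volumeSnapshotClassName: csi-ceph-rbd-snapclass
  source:
    persistentVolumeClaimName: my-pvc   # hypothetical PVC provisioned by the RBD StorageClass
```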

suse-coder (Contributor, Author)

Where I am stuck is this:

2025-06-02 21:55:00.685887 E | clusterdisruption-controller: failed to get OSD status: failed to get osd metadata: exit status 1
2025-06-02 21:55:13.396848 I | op-mon: mons running: [a b]
2025-06-02 21:55:15.791891 E | clusterdisruption-controller: failed to get OSD status: failed to get osd metadata: exit status 1

I have a /dev/sdb attached to one worker node.

Do I need to wipe that, or do something else?

# ------------------------------------------------------------------------------
cephClusterSpec:
  mon:
    count: 1
    allowMultiplePerNode: true
  dashboard:
    enabled: true
    ssl: false                    # easier for a lab; switch to true in prod

  # ---------- Storage (OSDs) ----------
  storage:
    useAllNodes: false            # we will list the single node explicitly
    useAllDevices: false          # don’t blindly grab every block dev
    nodes:
      - name: talos-mec-lba       # MUST match `kubectl get nodes -o wide`
        devices:
          - name: /dev/sdb


Yes, you need to wipe the disks. You can see that the disk probably still belongs to a previous ceph cluster by looking at the osd-prepare logs.

Follow these instructions to wipe them: https://rook.io/docs/rook/latest-release/Getting-Started/ceph-teardown/#zapping-devices
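
For reference, the zapping steps on that page boil down to roughly the following; on Talos (no SSH) they would have to be run from a privileged pod or debug container on the node, and the device is the /dev/sdb mentioned above:

```shell
DISK="/dev/sdb"
# Wipe the GPT/MBR partition tables left over from the previous Ceph cluster.
sgdisk --zap-all "$DISK"
# Zero the start of the disk so leftover Ceph/LVM metadata is no longer detected.
dd if=/dev/zero of="$DISK" bs=1M count=100 oflag=direct,dsync
# Ask the kernel to re-read the (now empty) partition table.
partprobe "$DISK"
```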

suse-coder (Contributor, Author)

I am stuck even before that phase, in "configuring MONs":

2552906598] state: up:replay) since 675.073
debug 2025-06-03T14:10:53.096+0000 7f1c2880d640  0 cephx server client.admin:  unexpected key: req.key=8973fbb8653eb28e expected_key=1c5ab51d2f84e6bc
debug 2025-06-03T14:10:53.356+0000 7f1c2880d640  0 cephx server client.admin:  unexpected key: req.key=4a4cd93568452069 expected_key=93f905b60d3c2bbf
debug 2025-06-03T14:10:54.764+0000 7f1c2a9cf640  0 log_channel(audit) log [DBG] : from='admin socket' entity='admin socket' cmd='mon_status' args=[]: dispatch
debug 2025-06-03T14:10:54.768+0000 7f1c2a9cf640  0 log_channel(audit) log [DBG] : from='admin socket' entity='admin socket' cmd=mon_status args=[]: finished
audit 2025-06-03T14:10:54.771317+0000 mon.a (mon.0) 144 : audit [DBG] from='admin socket' entity='admin socket' cmd='mon_status' args=[]: dispatch
audit 2025-06-03T14:10:54.771543+0000 mon.a (mon.0) 145 : audit [DBG] from='admin socket' entity='admin socket' cmd=mon_status args=[]: finished


Please read the documentation link provided above (scroll to the top); you need to clean up the /var/lib/rook directory before creating a new cluster.

suse-coder (Contributor, Author)

Thanks. This was it:
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: disk-clean
spec:
  restartPolicy: Never
  nodeName: talos-mec-lba
  volumes:
    - name: rook-data-dir
      hostPath:
        path: /var/lib/rook
  containers:
    - name: disk-clean
      image: busybox
      securityContext:
        privileged: true
      volumeMounts:
        - name: rook-data-dir
          mountPath: /node/rook-data
      command: ["/bin/sh", "-c", "rm -rf /node/rook-data/*"]
EOF

Maybe the docs should mention more explicitly that when no new OSDs are created and the cluster is stuck in the "configuring mons" phase, one needs to do this cleanup.

@@ -100,6 +100,188 @@ ceph-bucket rook-ceph.ceph.rook.io/bucket Delete Immediate
ceph-filesystem rook-ceph.cephfs.csi.ceph.com Delete Immediate true 77m
```

## 🔧 Single Node Setup Instructions
Member

We don't usually support/modify old Talos documentation; you should probably take this to website/content/v1.11.

Member

Also, this is a very niche use case (a single-node setup) which shouldn't be used in general. I would probably move it to a separate document, likely in the advanced/ folder, and link to it from here.

suse-coder (Contributor, Author)

What would also be great: when one copies a section from the docs, it shouldn't copy the CLI output along with the commands (as it currently does). Or don't include the output in the copyable block at all.
