What happened:
Importing a containerdisk onto a block volume loses sparseness. When I imported the centos-stream:9 containerdisk, which contains only about 2 GB of non-zero data, onto an empty 10 GB block volume, all 10 GB were written by CDI. Preallocation was not enabled.
What you expected to happen:
Only the non-zero data should be written to the block volume. This saves space on the underlying storage.
How to reproduce it (as minimally and precisely as possible):
Create a DataVolume from the YAML below and observe the amount of storage allocated. I used KubeSAN as the CSI driver, so the LVM `lvs` command can be used to see the thin-provisioned storage usage. If you don't have thin-provisioned storage, you could use I/O stats or tracing to determine how much data is being written.
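As a rough illustration of what "losing sparseness" means in allocation numbers, here is a minimal sketch using an ordinary file rather than a block volume (filenames are mine, not from CDI): the apparent size stays at 10 MiB while only the 1 MiB actually written is allocated on disk.

```shell
# Create a 10 MiB sparse file: the hole allocates (almost) no blocks.
truncate -s 10M sparse-demo.img
# Write 1 MiB of real data into it without truncating the rest.
dd if=/dev/urandom of=sparse-demo.img bs=1M count=1 conv=notrunc status=none
apparent=$(stat -c %s sparse-demo.img)               # logical size in bytes
allocated=$(( $(stat -c %b sparse-demo.img) * 512 )) # 512-byte blocks actually on disk
echo "apparent=$apparent allocated=$allocated"
```

An import that writes every block, as described above, would drive `allocated` up to the full apparent size even though most of it is zeroes.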
Additional context:
I discussed this with @aglitke and we looked at the qemu-img command that is invoked:
```
Running qemu-img with args: [convert -t writeback -p -O raw /scratch/disk/disk.img /dev/cdi-block-volume]
```
Adding the `--target-is-zero` option should avoid writing every block of the target block volume.

If there are concerns that some new block volumes come uninitialized (blocks not zeroed), then it should be possible to run `blkdiscard --zeroout /path/to/block/device` before invoking qemu-img with `--target-is-zero`. I have not tested this, but blkdiscard should zero the device efficiently and fall back to writing zero buffers on old hardware. On modern devices this would still be faster and preserve sparseness compared to writing all zeroes. On old devices it would be slower, depending on how much non-zero data the input disk image has.
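The two-step flow above could be sketched as follows. This is untested; the paths are the placeholders from the log line above, not a proposed CDI patch. One assumption worth noting: per the qemu-img documentation, `--target-is-zero` must be combined with `-n` (skip target creation), which is the natural mode for an existing block device anyway.

```shell
# Hypothetical sketch, not current CDI behavior. DEV and SRC are placeholders.
DEV=${DEV:-/dev/cdi-block-volume}   # target block device
SRC=${SRC:-/scratch/disk/disk.img}  # source raw image staged by CDI

if [ -b "$DEV" ]; then
    # Step 1 (optional): guarantee the device reads back as zeroes.
    # blkdiscard --zeroout uses efficient device-side zeroing where supported
    # and falls back to writing zero buffers otherwise.
    blkdiscard --zeroout "$DEV"

    # Step 2: convert, telling qemu-img the destination is already zeroed so
    # zero blocks in the source are skipped instead of written out.
    qemu-img convert -t writeback -p -n -O raw --target-is-zero "$SRC" "$DEV"
else
    echo "skipping: $DEV is not a block device" >&2
fi
```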
Environment:
- CDI version (use `kubectl get deployments cdi-deployment -o yaml`): 4.17.3
- Kubernetes version (use `kubectl version`): v1.30.5
- DV specification:
```yaml
apiVersion: cdi.kubevirt.io/v1beta1
kind: DataVolume
metadata:
  annotations:
    cdi.kubevirt.io/storage.bind.immediate.requested: "true"
    cdi.kubevirt.io/storage.import.lastUseTime: "2025-01-22T20:19:34.821435785Z"
    cdi.kubevirt.io/storage.usePopulator: "true"
  creationTimestamp: "2025-01-22T20:19:34Z"
  generation: 1
  labels:
    app.kubernetes.io/component: storage
    app.kubernetes.io/managed-by: cdi-controller
    app.kubernetes.io/part-of: hyperconverged-cluster
    app.kubernetes.io/version: 4.17.3
    cdi.kubevirt.io/dataImportCron: centos-stream-9-test-import-cron-3vs0wc
    instancetype.kubevirt.io/default-instancetype: u1.medium
    instancetype.kubevirt.io/default-preference: centos.stream9
  name: centos-stream-9-test-9515270a7f3d
  namespace: default
  resourceVersion: "217802868"
  uid: d8d56487-5686-47d0-93d0-79de02a8c2c3
spec:
  source:
    registry:
      url: docker://quay.io/containerdisks/centos-stream@sha256:9515270a7f3d3fd053732c15232071cb544d847e56aa2005f27002014b5becaa
  storage:
    resources:
      requests:
        storage: 10Gi
    storageClassName: kubesan
```
- Cloud provider or hardware configuration: N/A
- OS (e.g. from /etc/os-release): N/A
- Kernel (e.g. `uname -a`): N/A
- Install tools: N/A
- Others: N/A