Raise VIF limit from 7 to 16 by calculating max_grant_frames on domain creation #6577
Conversation
…n creation

max_maptrack_frames should only be >0 if there are reasons for other domains to grant pages to the domain being created, which should only happen for Dom0 (not handled by the toolstack) and driver domains (currently none exist).

Signed-off-by: Andrii Sultanov <[email protected]>
Suggested-by: Andrew Cooper <[email protected]>
Plumb the expected number of VBDs and VIFs to xenopsd's VM_create micro-op, allowing the domain creation code to estimate how big the max_grant_frames value needs to be.

Previously it was hardcoded to 64, but that was found lacking in situations with a lot of VBDs and VIFs under load, as they'd run out of grant table entries and the VM would crash. Now, when fewer VIFs and VBDs are expected, fewer grant table entries are needed, and the other way around. An expectation of 1 hotplugged VIF and 1 hotplugged VBD is accounted for as well; this can possibly be incremented in the future if we decide that, say, 2 hotplugged devices are common enough.

The VBD calculation is complex and approximate, as explained in the code. The size of the ring was calculated with the following program (important because struct sizes can change, and the code comment also outlines how this can be improved in the future with feature detection, in which case more constants would need to be considered). Attaching it here because the BLK_RING_SIZE macro is otherwise hard to decipher by hand:

```
// Can be compiled with just: gcc test.c
// # ./test
// 64
// 112
// 32
#include <stdint.h>
#include <stdio.h>
#include <stddef.h>

struct tester {
    unsigned int req_prod, req_event;
    unsigned int rsp_prod, rsp_event;
    uint8_t __pad[48];
    int ring;
};

typedef uint32_t grant_ref_t;
typedef uint16_t blkif_vdev_t;
typedef uint64_t blkif_sector_t;

#define BLKIF_MAX_SEGMENTS_PER_REQUEST 11
#define BLKIF_MAX_INDIRECT_PAGES_PER_REQUEST 8

struct blkif_request_segment {
    grant_ref_t gref;
    uint8_t first_sect, last_sect;
};

struct blkif_request_rw {
    uint8_t nr_segments;
    blkif_vdev_t handle;
    uint32_t _pad1;
    uint64_t id;
    blkif_sector_t sector_number;
    struct blkif_request_segment seg[BLKIF_MAX_SEGMENTS_PER_REQUEST];
} __attribute__((__packed__));

struct blkif_request_discard {
    uint8_t flag;
    blkif_vdev_t _pad1;
    uint32_t _pad2;
    uint64_t id;
    blkif_sector_t sector_number;
    uint64_t nr_sectors;
    uint8_t _pad3;
} __attribute__((__packed__));

struct blkif_request_other {
    uint8_t _pad1;
    blkif_vdev_t _pad2;
    uint32_t _pad3;
    uint64_t id;
} __attribute__((__packed__));

struct blkif_request_indirect {
    uint8_t indirect_op;
    uint16_t nr_segments;
    uint32_t _pad1;
    uint64_t id;
    blkif_sector_t sector_number;
    blkif_vdev_t handle;
    uint16_t _pad2;
    grant_ref_t indirect_grefs[BLKIF_MAX_INDIRECT_PAGES_PER_REQUEST];
    uint32_t _pad3;
} __attribute__((__packed__));

struct blkif_request {
    uint8_t operation;
    union {
        struct blkif_request_rw rw;
        struct blkif_request_discard discard;
        struct blkif_request_other other;
        struct blkif_request_indirect indirect;
    } u;
} __attribute__((__packed__));

union blkif_sring_entry {
    struct blkif_request req;
    struct blkif_request rsp;
};

#define __RD2(_x)  (((_x) & 0x00000002) ? 0x2 : ((_x) & 0x1))
#define __RD4(_x)  (((_x) & 0x0000000c) ? __RD2((_x)>>2)<<2 : __RD2(_x))
#define __RD8(_x)  (((_x) & 0x000000f0) ? __RD4((_x)>>4)<<4 : __RD4(_x))
#define __RD16(_x) (((_x) & 0x0000ff00) ? __RD8((_x)>>8)<<8 : __RD8(_x))
#define __RD32(_x) (((_x) & 0xffff0000) ? __RD16((_x)>>16)<<16 : __RD16(_x))

int main(void)
{
    size_t offset = offsetof(struct tester, ring);
    size_t size = sizeof(union blkif_sring_entry);
    printf("%zu\n", offset);
    printf("%zu\n", size);
    printf("%zu\n", (size_t)__RD32((4096 - offset) / size));
}
```

Signed-off-by: Andrii Sultanov <[email protected]>
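For context, here is a rough sketch of the kind of estimate the commit message describes - the function name, parameters, and per-device grant costs are illustrative assumptions, not the exact code added to xenopsd:

```ocaml
(* Sketch only: turn expected device counts into a max_grant_frames value.
   Per-device grant costs are taken as parameters, since the real calculation
   (see the commit above) is approximate and driver-dependent. *)
let grants_per_frame = 4096 / 8   (* a 4 KiB frame of 8-byte v1 grant entries *)

let estimate_max_grant_frames ~vifs ~vbds ~grants_per_vif ~grants_per_vbd =
  (* Budget for one hotplugged VIF and one hotplugged VBD, as described. *)
  let vifs = vifs + 1 and vbds = vbds + 1 in
  let grants = (vifs * grants_per_vif) + (vbds * grants_per_vbd) in
  (* Round up to whole frames and keep the old hard-coded 64 as a floor. *)
  max 64 ((grants + grants_per_frame - 1) / grants_per_frame)
```

Plugging in per-VBD costs derived from the ring layout above (32 ring entries of up to 11 segments each, plus indirect descriptors) and per-VIF costs that scale with the number of queues yields a value that grows with the device count instead of staying pinned at 64.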
64 was the old hard-coded value for max_grant_frames, so play it safe here and keep it as the lower bound - users might be used to being able to hotplug several VIFs when a VM has fewer than 7, for example, which would otherwise have been broken, as we only estimate for a single hotplug with the current algorithm.

Signed-off-by: Andrii Sultanov <[email protected]>
Signed-off-by: Andrii Sultanov <[email protected]>
With the previously hardcoded value of max_grant_frames, we could only support at most 7 VIFs (and fewer if there were also many VBDs), since having 9 or more VIFs would result in grant allocation errors. Now that max_grant_frames is dynamically estimated on domain creation given the number of VIFs and VBDs a VM has, we can easily support 16 VIFs.

Given the current behaviour of the XenServer/XCP-ng system (hypervisor + drivers), where more VIFs allow for higher overall networking throughput, this is highly beneficial - in testing, overall throughput with 16 VIFs was 18-27% higher than with 8 VIFs (tested with multiple iperf3 instances running on all interfaces simultaneously).

Moreover, some users coming from VMware are used to networking setups with dozens of VIFs, and this is a step towards allowing that without encountering any other bottlenecks in the system.

NOTE: We are currently only allocating enough grants for 1 hotplugged VIF above 7. Therefore, technically we shouldn't allow creating more than one VIF while the VM is running (or paused), but there is currently no way in xapi to check how many and which VIFs were hotplugged, as far as I know, and allowed_VIFs is honoured by clients on VIF creation, not hotplug. Since this is in keeping with the previous behaviour of this field (if a VM has many VBDs, it wouldn't have been able to allocate 7 VIFs before) as an "advice" rather than a "guarantee" or "limit", I've decided to keep it as-is. A more detailed technical explanation of the supported limit should be described elsewhere in support statements.

Signed-off-by: Andrii Sultanov <[email protected]>
I think this needs to be reflected in the estimated memory overhead per VM that xapi calculates, to keep it pessimistic: https://github.com/xapi-project/xen-api/blob/master/ocaml/xapi/memory_check.ml#L241
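A minimal sketch of the kind of accounting this comment suggests, assuming one 4 KiB page per grant/maptrack frame; the names below are hypothetical and not xapi's actual memory_check.ml code:

```ocaml
(* Hypothetical sketch only: fold grant-table memory into a per-VM overhead
   estimate. Names are illustrative, not from memory_check.ml. *)
let bytes_per_frame = 4096L  (* one grant or maptrack frame is a 4 KiB page *)

let grant_table_overhead_bytes ~max_grant_frames ~max_maptrack_frames =
  Int64.mul (Int64.of_int (max_grant_frames + max_maptrack_frames)) bytes_per_frame

(* With the old fixed value of 64 grant frames and no maptrack frames this
   comes to 256 KiB per VM; a dynamic estimate would have to assume the
   pessimistic (largest) device count instead. *)
let _ = grant_table_overhead_bytes ~max_grant_frames:64 ~max_maptrack_frames:0
```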
I think there was also a problem in general even with the current limits: if you had too many VBDs, then your VIFs stopped working, or the other way around. We tried to work around that by calculating the number of grant frames dynamically, but that ran into a problem with a particular GFS2 test that kept failing (for seemingly unrelated reasons). I forget where we left off with that test though - was it basically working?
That's true - I note a similar case in the last commit.
It was basically working, but xenopsd timed out on plugging VBDs and we never got to the bottom of it.
There is a limit of 8 in netback, but checking the source code, that is a limit on the number of queues (on the same interface), not the number of interfaces. So I assume in your tests the guests had <16 vCPUs? It'd be useful if your table contained the effective number of network queues (you can see this in xenstore, there are entries like …).
I tested with 8 vCPUs, yes
There are 8 queues (0-7) for every VIF in my tests.
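As an illustration of why the queue count matters here (a sketch under assumed constants, not code from this PR): each netfront queue has its own TX and RX ring, and every ring slot can hold a granted page, so per-VIF grant usage scales with min(vCPUs, netback's per-interface queue limit).

```ocaml
(* Sketch with assumed constants: per-VIF grant usage as a function of vCPUs. *)
let ring_slots = 256          (* slots in one 4 KiB netif ring, assumed *)
let netback_queue_limit = 8   (* the per-interface queue limit mentioned above *)

let grants_per_vif ~vcpus =
  let queues = min vcpus netback_queue_limit in
  (* per queue: one grant each for the TX and RX ring pages,
     plus up to ring_slots granted buffer pages on each ring *)
  queues * (2 + (2 * ring_slots))
```

Under these assumptions, an 8-queue VIF can consume on the order of 4,000 grant entries, so a table fixed at 64 frames (32,768 v1 entries) runs out quickly as the VIF count grows.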
I'm not quite sure how that number was calculated initially (…). With the new approach, there is no inherent limit: the more VBDs/VIFs a domain has, the more it would be able to allocate, so we can't get a pessimistic estimate here …
I would start by adding in the assumption of 16 VIFs used per domain, and account for the grant frames needed to use them. I agree with your point about the current memory model not being fit for purpose. In the long term, an extension to make it aware of the grant frames that each domain will be able to use would be best.
Although - the granted pages should be counted towards the domain's normal memory usage, not the overhead, unless I'm confusing things here.
It is? Then disregard my comment, I thought it was counted as overhead.
These constants come from the upstream …
xenopsd currently hardcodes 64 as the value for `max_grant_frames` for all domains. This limits how many grants a domain can allocate, and thus the number of VIFs and VBDs a domain can have. `max_grant_frames` can be changed for individual VMs by specifying a particular value in `platform`, but it's much better to be able to estimate how many grant frames a VM would need based on its number of VBDs and VIFs. Implement this and add some notes on the difficulties and imprecisions of such an estimation.

This allows raising the number of supported VIFs from the current limit of 7 - we've picked 16 as a sweet spot, though this could be discussed further. (Note that it's the `allowed_vifs` limit that's changed; VIFs can be created in arbitrary numbers on the CLI and by clients not honouring the `allowed_vifs` advice.)

Given the current behaviour of the XenServer/XCP-ng system (hypervisor + drivers), where more VIFs allow for higher overall networking throughput, this is highly beneficial - in testing, overall throughput with 16 VIFs was 18-27% higher than with 8 VIFs (tested with multiple iperf3 instances running on all interfaces simultaneously). The following table shows the performance per-VIF and per-VM on a host with 16 pCPUs, with 8 vCPUs each for domU and dom0:
Moreover, some users coming from VMware are used to networking setups with dozens of VIFs, and this is a step towards allowing that without encountering any other bottlenecks in the system.
Most of this work (except the last commit) was tested very thoroughly at XenServer, since it was initially intended to raise the supported limit of VBDs (it passed Ring3 BST+BVT multiple times, and ran through the config limits test several times). Raising the number of supported VBDs has other issues, so that effort was abandoned, but this PR solves all the issues for VIFs.
The new limit was tested with XCP-ng, with a new test being added to our CI (xcp-ng/xcp-ng-tests#338), which will verify the ability to hit the new limit and perform well at it.