The overall size of memory allocatable in DART-MPI through `dart_memalloc` is currently limited to 16 MB. The limit is imposed by the size of the window backing these dynamic allocations, which is allocated during initialization and then chunked up using the infamous buddy allocator. Although no one could have expected anyone to ever require more than these 16 MB, we have come to witness the unexpected: @tuxamito requires larger chunks of dynamic memory for testing his sparse matrix implementation.
I see two possible ways to deal with this:
- Get rid of the buddy allocator and use dynamic windows to which `malloc`ed memory is attached. To me this seems to be the cleaner solution, but it has two major downsides: we lose the ability to use shared-memory optimization on these allocations and (more importantly) we have to rely on dynamic windows, which (as we have seen) may have a significant performance impact, mainly because the memory is not registered with the network device, which inhibits the use of the RDMA capabilities of some modern interconnects.
- Increase the size of the backing window and make it configurable through an environment variable. This seems like a less clean fix, but it retains our ability to use RDMA capabilities.
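To make the second option a bit more concrete, here is a minimal sketch of how the pool size could be read at initialization. The variable name `DART_POOL_SIZE_MB` and both helper functions are hypothetical and not part of DART today. The sketch also assumes the size is given in MB and that we silently fall back to the current 16 MB on malformed input.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Current fixed size of the backing window: 16 MB. */
#define DEFAULT_POOL_SIZE_MB 16

/* Hypothetical helper: convert a size given in MB as a string into bytes.
 * Returns default_mb (in bytes) if the value is missing or malformed. */
size_t pool_size_from_string(const char *value, size_t default_mb)
{
  const size_t mb = (size_t)1 << 20;
  assert(default_mb > 0 && default_mb <= SIZE_MAX / mb);

  if (value == NULL || *value == '\0') {
    return default_mb * mb;
  }

  char *end = NULL;
  errno = 0;
  unsigned long long parsed = strtoull(value, &end, 10);

  /* Reject trailing garbage, range errors, zero, and sizes that would
   * overflow size_t once converted to bytes. */
  if (errno != 0 || *end != '\0' || parsed == 0 || parsed > SIZE_MAX / mb) {
    return default_mb * mb;
  }
  return (size_t)parsed * mb;
}

/* Hypothetical entry point: DART_POOL_SIZE_MB is an assumed name. */
size_t pool_size_from_env(void)
{
  return pool_size_from_string(getenv("DART_POOL_SIZE_MB"),
                               DEFAULT_POOL_SIZE_MB);
}
```

The result of `pool_size_from_env()` would then replace the hard-coded size when the backing window is allocated. If the buddy allocator requires a power-of-two pool, the value would additionally have to be rounded or validated, which this sketch does not do.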
While I would like to favor the first option, I am concerned about its performance impact, a sacrifice we should not make lightly.
Any input is appreciated. I think we should