Conversation

ElliottjPierce
Contributor

Objective

Like the other five versions, the goal is to allow entity reservation from any thread asynchronously. This is primarily important for assets as entities. Fixes #18003.

Solution

This differs from #18440 (v5) by reserving entities one by one instead of in bulk. This will probably have worse performance, but it's definitely passable, especially given that async reservation is a relatively slow path. In fact, with a high batch size, this may even be faster than v5. See the v5 benchmarks for the impact on the rest of the ECS, as the changes there are almost identical.

Testing

The test from v5 has been ported over.

Questions for review

If we are ok with not exposing the batch size for remote, we could make this use a bounded queue, which would improve the performance of remote reservation.

@alice-i-cecile alice-i-cecile added C-Feature A new feature, making something new possible A-ECS Entities, components, systems, and events X-Controversial There is active debate or serious implications around merging this PR labels Mar 25, 2025
@alice-i-cecile alice-i-cecile added this to the 0.16 milestone Mar 25, 2025
@alice-i-cecile alice-i-cecile added the S-Needs-Review Needs reviewer attention (from anyone!) to move forward label Mar 25, 2025
@alice-i-cecile alice-i-cecile modified the milestones: 0.16, 0.17 Mar 25, 2025
@andriyDev andriyDev self-requested a review March 25, 2025 18:18
source: Arc<RemoteEntitiesInner>,
}

impl Future for RemoteReservedEntities {
Contributor

If I'm reading this right, the manual Future impl and the ConcurrentQueue<Waker> are here to wake up readers when new entities are pushed to the queue. Is that right?

I think that's essentially what a Channel does over a Queue, and you can simplify the code if you switch to using a Channel. async_channel seems to already be in our dependency tree and no_std compatible.

If you're able to lock down remote_batch_size at creation time, then I think you can handle both requests and reserved with a single async_channel::bounded(remote_batch_size) by having fn fulfill() keep calling try_send() until it fails because the channel is full. You could even use Receiver<Entity> as the public type and skip making a RemoteEntities wrapper.

(try_send() looks like it just does some Relaxed and Acquire loads on failure, so it should be almost free. is_full() and len() seem to do SeqCst loads, which I believe do a full fence and are expensive, so you wouldn't want to use those.)
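
A minimal sketch of that suggestion, assuming a hypothetical Entity stand-in and a caller-supplied reservation closure (the real Entities API differs):

use async_channel::{bounded, Sender, TrySendError};

#[derive(Clone, Copy, Debug)]
struct Entity(u64); // hypothetical stand-in for bevy_ecs's Entity

/// Top up the bounded channel with freshly reserved entities until it is full.
fn fulfill(sender: &Sender<Entity>, mut reserve_one: impl FnMut() -> Entity) {
    loop {
        match sender.try_send(reserve_one()) {
            Ok(()) => {}
            // The channel already holds remote_batch_size entities; a real
            // implementation would return the rejected entity to the allocator.
            Err(TrySendError::Full(_rejected)) => break,
            Err(TrySendError::Closed(_rejected)) => break, // all receivers dropped
        }
    }
}

fn main() {
    let (tx, rx) = bounded::<Entity>(4); // remote_batch_size = 4
    let mut next = 0u64;
    fulfill(&tx, || { next += 1; Entity(next) });
    // Remote callers would `rx.recv().await`; here we just drain synchronously.
    while let Ok(entity) = rx.try_recv() {
        println!("reserved {entity:?}");
    }
}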

Contributor Author

If I'm reading this right, the manual Future impl and the ConcurrentQueue<Waker> are here to wake up readers when new entities are pushed to the queue. Is that right?

Yup. That's right.

I think that's essentially what a Channel does over a Queue, and you can simplify the code if you switch to using a Channel. async_channel seems to already be in our dependency tree and no_std compatible.

Yeah, I wanted to use a channel, but I couldn't find one that was no_std. async_channel looks like it will work. I'll try that out in a bit.

If you're able to lock down remote_batch_size at creation time, then I think you can handle both requests and reserved with a single async_channel::bounded(remote_batch_size) by having fn fulfill() keep calling try_send() until it fails because the channel is full.

That is my intention, but I'm not sure about the "If you're able to lock down remote_batch_size at creation time" part. But if that's a possibility, we're on the same page.

You could even use Receiver<Entity> as the public type and skip making a RemoteEntities wrapper.

Technically. But I see the backend for this changing potentially. So I'd like to keep the wrapper for now.

(try_send() looks like it just does some Relaxed and Acquire loads on failure, so it should be almost free. is_full() and len() seem to do SeqCst loads, which I believe do a full fence and are expensive, so you wouldn't want to use those.)

Yeah. This is a good end game. I didn't do that because reserve_entities needs a definite amount ahead of time. As you point out, we could do alloc for now, but that would prevent it from being flushed. Plus, IDK if we'll always have &mut access to Entities. I mean, probably, but we may need to try_fulfill more often than we flush. Maybe. I'll try some stuff out. We can always make a &Entities version later.


impl RemoteEntitiesInner {
    fn fulfill(&self, entities: &Entities, batch_size: NonZero<u32>) {
        self.is_waiting.store(0, Ordering::Relaxed);
Contributor

This reserves exactly batch_size entities per flush, regardless of how many requests there were. If we make enough async requests, it could take a while for them all to get fulfilled!

Rather than counting waiters, it might make sense to have a request_count, increment it before popping, and do reserve_entities(request_count.swap(0, Relaxed)). If you pre-fill the queue with batch_size then that will refill the original buffer every time.

(My proposal above has exactly the same problem, so maybe we can't use async_channel::bounded.)
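
Roughly what that could look like, as a sketch with hypothetical names (the real type also holds the queue and wakers):

use std::sync::atomic::{AtomicU32, Ordering};

struct RemoteEntitiesInner {
    request_count: AtomicU32,
    // ... queue of pre-reserved entities, wakers, etc.
}

impl RemoteEntitiesInner {
    /// Called by a remote reserver just before popping from the queue.
    fn note_request(&self) {
        self.request_count.fetch_add(1, Ordering::Relaxed);
    }

    /// Called on flush: reserve exactly as many entities as were requested
    /// since the last flush, restoring the pre-filled batch_size buffer.
    fn fulfill(&self, mut reserve_entities: impl FnMut(u32)) {
        let requested = self.request_count.swap(0, Ordering::Relaxed);
        if requested > 0 {
            reserve_entities(requested);
        }
    }
}

fn main() {
    let inner = RemoteEntitiesInner { request_count: AtomicU32::new(0) };
    inner.note_request();
    inner.note_request();
    inner.fulfill(|n| println!("reserving {n} entities to refill the queue")); // prints 2
}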

Contributor Author

This reserves exactly batch_size entities per flush, regardless of how many requests there were. If we make enough async requests, it could take a while for them all to get fulfilled!

Yeah. It's not ideal. I don't anticipate this being a real issue though (with a good batch_size), but it's bugging me too.

Rather than counting waiters, it might make sense to have a request_count, increment it before popping, and do reserve_entities(request_count.swap(0, Relaxed)). If you pre-fill the queue with batch_size then that will refill the original buffer every time.

That's an excellent idea! Brilliant actually. It would keep the number reserved stable, and no reserve would be waiting for more than a single flush. I'll need to split reserve_entities(request_count.swap(0, Relaxed)) up to help with branch prediction, but I really like this idea.

(My proposal above has exactly the same problem, so maybe we can't use async_channel::bounded.)

Pros and cons. Might be possible to do both: a fast bounded channel for standard operations, but if it runs out, we increment request_count and fulfill on an unbounded one. That might be slightly over-engineered, but it would work. I'll hold off for now, but I'll keep this in mind for a follow up.

fn fulfill(&self, entities: &Entities, batch_size: NonZero<u32>) {
    self.is_waiting.store(0, Ordering::Relaxed);
    if self.reserved.is_empty() {
        for reserved in entities.reserve_entities(batch_size.get()) {
Contributor

Since you have &mut Entities when calling try_fulfill(), you can use alloc() instead of reserve_entities() to avoid all the atomic operations. (You'd have to move the try_fulfill() call to the end of flush(), of course.)

Contributor Author

alloc prevents them from being flushed. Maybe we just let the asset system make them valid, but I'd kinda like to hand the asset server a blank entity rather than an invalid one. I have an idea here though. I'll get back to you.

@ElliottjPierce ElliottjPierce left a comment

Thanks for the review @chescock. Brilliant as always!

/// This is the number of entities we keep ready for remote reservations via [`RemoteEntities::reserve_entity`].
/// A value too high can cause excess memory to be used, but a value too low can cause additional waiting.
pub entities_hot_for_remote: u32,
remote: Arc<RemoteEntitiesInner>,
Contributor

Nit: Could we just make this store a RemoteEntities? That way we don't have to worry about constructing one in get_remote and can just blindly clone. Also means if we change the details of RemoteEntities we don't have to worry about how we construct it each time.

Contributor Author

We could do this, and if v6 doesn't end up evolving much, I'm on board with it. However, many of the ways we can improve this involve caching some atomic results in a non-atomic field, which needs &mut. More specifically, we'd need RemoteEntities to have additional per-instance state in addition to the shared state in Arc<RemoteEntitiesInner>. But Entities isn't a remote instance, so it doesn't make sense to include that per-instance state in Entities.

I used this a lot in v4. If v6 is merged, I'll start slowly moving concepts from v4 into it to try to improve performance. If none of those changes are merged, then I can follow up by simplifying this per your suggestion. It's preemptive future-proofing, if that makes sense.

    keep_hot: u32,
) {
    let to_fulfill = entities.remote.recent_requests.swap(0, Ordering::Relaxed);
    let current_hot = entities.remote.keep_hot.load(Ordering::Relaxed);
Contributor

I don't understand why entities.remote.keep_hot needs to be an Atomic. Could we just move it into Entities and then handle it with just regular u32s?

Similarly, is keep_hot the correct name for this? Maybe we should call it something like in_channel? (They may have already been claimed by recent_requests, but I think it's close enough?)

Contributor Author

I don't understand why entities.remote.keep_hot needs to be an Atomic. Could we just move it into Entities and then handle it with just regular u32s?

For now, it doesn't. However, the extra 2 atomic ops are nothing compared to the roughly 10 atomic ops per entity to push onto the queue. (This is something I think a custom queue implementation can do much better. I did this for v4.)

Given that it's not a performance concern, I like to keep the state about remote entities on the type itself rather than somewhere on Entities. We can change this later, but this could still come in handy if we want to do fulfillment with only &Entities in the future (maybe).

Similarly is keep_hot the correct name for this? Maybe we should call it like in_channel or something? (they may have already been claimed by recent_request but I think it's close enough?)

That's a more precise name, so I changed the internals to use that. But the field we expose to users I've kept as "hot" just because the channel detail might change later.

Contributor

For now, it doesn't. However, the extra 2 atomic ops are nothing compared to the like 10 atomic ops per entity to push onto the queue. (This is something I think a custom queue implementation can do much better. I did this for v4).

Given that it's not a performance concern, I like to keep the state about remote entities on the type itself rather than somewhere on Entities. We can change this later, but this could still come in handy if we want to do fulfillment with only &Entities in the future (maybe).

Another way to approach this is to have separate structs for each side, like Sender and Receiver on channels. That keeps the remote reservation logic together, but still keeps the provider and client data separate. That would also remove the need to close() the channel explicitly, since it automatically closes when the last Sender is dropped.

I think the issue with using an atomic here is that the ordinary load followed by a store looks like a race condition, while if you had &mut then it would be obvious that nothing could change it in between. Performance shouldn't be an issue; relaxed load and store calls like this are no more expensive than non-atomic ones.
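
For concreteness, a sketch of that split, with hypothetical names and async_channel standing in for the real backing store:

use async_channel::{bounded, Receiver, Sender};

#[derive(Clone, Copy, Debug)]
struct Entity(u64); // hypothetical stand-in

/// Provider side, owned by Entities. Dropping it closes the channel,
/// so remote handles start failing without an explicit close().
struct RemoteEntitiesProvider {
    sender: Sender<Entity>,
}

/// Client side, cloned freely onto other threads.
#[derive(Clone)]
struct RemoteEntities {
    receiver: Receiver<Entity>,
}

fn remote_pair(batch_size: usize) -> (RemoteEntitiesProvider, RemoteEntities) {
    let (sender, receiver) = bounded(batch_size);
    (RemoteEntitiesProvider { sender }, RemoteEntities { receiver })
}

fn main() {
    let (provider, remote) = remote_pair(16);
    provider.sender.try_send(Entity(1)).unwrap();
    assert!(remote.receiver.try_recv().is_ok());
    drop(provider); // the channel closes automatically
    assert!(remote.receiver.try_recv().is_err());
}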

@chescock chescock left a comment

Looks good! I really like how small this wound up being!

I left a few more nits, but nothing blocking (assuming async_channel gets updated, which seems promising).

self.meta.clear();
self.pending.clear();
*self.free_cursor.get_mut() = 0;
self.remote.close();
Contributor

This will disconnect any outstanding RemoteEntities values so that they start failing to reserve. Is that really what we want to do in this case? It might be less disruptive to simply drain the queue.

Although it probably doesn't matter, since nobody will call clear() on a real application.

Contributor Author

I prefer this behavior. What if a clear happens while an asset is loading? Now we have to make sure every single included entity is valid instead of just making sure one is. Right now, the minute it clears, all current remote reservers fail, which I think is a more transparent way of error handling.

(But like you said, it doesn't really matter.)

let to_fulfill = entities.remote.recent_requests.swap(0, Ordering::Relaxed);
let current_in_channel = entities.remote.in_channel.load(Ordering::Relaxed);
let should_reserve = (to_fulfill + in_channel).saturating_sub(current_in_channel); // should_reserve = to_fulfill + (in_channel - current_in_channel)
let new_in_channel = (current_in_channel + should_reserve).saturating_sub(to_fulfill); // new_in_channel = current_in_channel + (should_reserve - to_fulfill).
Contributor

I think the second subtraction can never saturate, and could be an ordinary -. new_in_channel will be more than in_channel if the first subtraction saturates, and equal to in_channel if it doesn't, but it can never be less than in_channel, so it can never be less than zero.

(It's possible that this would be more clear if should_reserve were calculated using checked_sub and an if, especially since we never need to allocate entities if it overflows.)

Contributor Author

I think the second subtraction can never saturate, and could be an ordinary -. new_in_channel will be more than in_channel if the first subtraction saturates, and equal to in_channel if it doesn't, but it can never be less than in_channel, so it can never be less than zero.

I agree. I thought this was the case when I made it, but I was too lazy to do the proof lol. But I also like the redundancy.

(It's possible that this would be more clear if should_reserve were calculated using checked_sub and an if, especially since we never need to allocate entities if it overflows.)

I agreed with this when I read it, but when I tried it, it felt more confusing.
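
Spelling that argument out (ignoring the u32 overflow case mentioned above), with t = to_fulfill, h = in_channel (the target), and c = current_in_channel:

should_reserve  s = max(0, t + h - c)   // the first saturating_sub
new_in_channel    = (c + s) - t         // the second subtraction

If c <= t + h, then s = t + h - c, so (c + s) - t = h, which is >= 0.
If c > t + h, then s = 0, so (c + s) - t = c - t > h >= 0.

Either way new_in_channel is at least in_channel, so the second subtraction never goes below zero.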

@ElliottjPierce
Contributor Author

[Screenshot 2025-03-27 at 1:07 PM]

I'm not sure why this keeps happening, where GitHub doesn't let me reply.

Another way to approach this is to have separate structs for each side, like Sender and Receiver on channels. That keeps the remote reservation logic together, but still keeps the provider and client data separate. That would also remove the need to close() the channel explicitly, since it automatically closes when the last Sender is dropped.

I can see the appeal here, but I opted not to do this for a few reasons: 1) we might move away from the channel in the near future; 2) we don't necessarily want that closing behavior; and 3) we may want to be able to push to the channel apart from Entities. (Ex: I reserved more entities than I need, so I'll push them instead of flushing and freeing.)

I think the issue with using an atomic here is that the ordinary load followed by a store looks like a race condition, while if you had &mut then it would be obvious that nothing could change it in between. Performance shouldn't be an issue; relaxed load and store calls like this are no more expensive than non-atomic ones.

Agreed. Hopefully this is more clear now that fulfillment is in &mut Entities.

let in_channel = self.entities_hot_for_remote;
let mut init_allocated = init_allocated;
let to_fulfill = self.remote.recent_requests.swap(0, Ordering::Relaxed);
let current_in_channel = self.remote.in_channel.load(Ordering::Relaxed);
Contributor

Another option here is to store the (signed) difference between in_channel and recent_requests. Then you only have one value to update! You can use the existing IdCursor type for signed numbers. This would reduce to something like:

if self.remote.net_in_channel.load(Ordering::Relaxed) < self.entities_hot_for_remote {
    // ...
    let old_net_in_channel = self.remote.net_in_channel
        .fetch_max(self.entities_hot_for_remote, Ordering::Relaxed);
    let to_fulfill = self.entities_hot_for_remote - old_net_in_channel;
    if to_fulfill > 0 {
        // ...
    }
}
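
A self-contained rendering of that idea, with AtomicI64 standing in for IdCursor and hypothetical names:

use std::sync::atomic::{AtomicI64, Ordering};

struct Remote {
    /// Entities in the channel minus outstanding requests; requests
    /// decrement it, and fulfillment raises it back up to the target.
    net_in_channel: AtomicI64,
}

impl Remote {
    /// Called by a remote reserver before popping from the queue.
    fn note_request(&self) {
        self.net_in_channel.fetch_sub(1, Ordering::Relaxed);
    }

    /// Returns how many entities a flush should reserve to restore `target`.
    fn to_fulfill(&self, target: i64) -> i64 {
        let old = self.net_in_channel.fetch_max(target, Ordering::Relaxed);
        (target - old).max(0)
    }
}

fn main() {
    let remote = Remote { net_in_channel: AtomicI64::new(8) }; // pre-filled, target = 8
    for _ in 0..3 {
        remote.note_request();
    }
    assert_eq!(remote.to_fulfill(8), 3); // three requests since the last flush
    assert_eq!(remote.to_fulfill(8), 0); // nothing outstanding now
}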

Contributor Author

I suppose we could, but I'm not sure what the motivation would be. IIUC, it's the same number of atomic ops and the same amount of data stored. Actually, it's 8 bytes instead of 4, but still. I'm not opposed to it though, if there's a reason to do this that I'm just missing.

Contributor

Oh, I was just trying to simplify the saturating_sub arithmetic by having one counter to modify instead of two.

@andriyDev andriyDev left a comment

At this point I'm fine with the code itself. Whether we take this route is another discussion. IOW, if we decide this is the PR we want to get merged, I think my review is valid, but that doesn't mean this is a "vote" for this PR :P My ideal (and I think yours as well) is fully parallelized entity reservation. But this is a good second.

@ElliottjPierce
Contributor Author

Heads up that the async channel issue is fixed now. See here. IDK when the next release will be, but I can fix CI here when it's out. (Unless we do v9, which I tend to prefer.)

@atlv24 atlv24 modified the milestones: 0.17, 0.18 Jul 8, 2025