WebGPURenderer: Introduce dispatchWorkgroupsIndirect #31488

Open: wants to merge 4 commits into base: dev


Spiri0
Contributor

@Spiri0 Spiri0 commented Jul 23, 2025

Related issue: #30982

This small extension lets the GPU control the dispatchSize.
GPU-side control of the dispatchSize pays off for recursive algorithms.

True recursion isn't feasible on the GPU; it has to be unrolled into multiple compute passes. My voxelizer is a clear example. Here I'm voxelizing the BlackPearl, which is incredibly fast with compute shaders. Surface voxels are green. I also voxelize the volume with yellow voxels (important for buoyancy), and the flood-fill mechanism requires 52 iterations for this. With an adaptive dispatchSize, each iteration can be dispatched for exactly the number of voxels that remain.

image image
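
To illustrate why an adaptive dispatchSize helps here, the following is a minimal CPU-side sketch of an iterative flood fill that records how many cells are active in each pass. It is not the voxelizer from this PR: the grid layout, the `floodFillPasses` name, and the workgroup size of 64 are assumptions made for the example.

```javascript
// Assumed workgroup size; the real value depends on the compute shader.
const WORKGROUP_SIZE = 64;

// Flood-fills a width x height grid from `seed` (a linear cell index),
// skipping cells listed in the `blocked` Set. Returns one entry per pass
// with the number of active cells and the workgroups needed to cover them.
function floodFillPasses( width, height, blocked, seed ) {
  const visited = new Uint8Array( width * height );
  visited[ seed ] = 1;
  let frontier = [ seed ];
  const passes = [];

  while ( frontier.length > 0 ) {
    passes.push( {
      active: frontier.length,
      workgroups: Math.ceil( frontier.length / WORKGROUP_SIZE )
    } );

    const next = [];
    for ( const cell of frontier ) {
      const x = cell % width;
      const y = Math.floor( cell / width );
      const neighbors = [];
      if ( x > 0 ) neighbors.push( cell - 1 );
      if ( x < width - 1 ) neighbors.push( cell + 1 );
      if ( y > 0 ) neighbors.push( cell - width );
      if ( y < height - 1 ) neighbors.push( cell + width );

      for ( const n of neighbors ) {
        if ( visited[ n ] === 0 && ! blocked.has( n ) ) {
          visited[ n ] = 1; // mark on push so a cell is queued only once
          next.push( n );
        }
      }
    }
    frontier = next;
  }
  return passes;
}
```

A fixed dispatch would launch enough workgroups for the whole grid in every pass, while the `workgroups` value recorded here follows the size of the current frontier.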

This is also important for building a BVH on the GPU, because that requires recursion as well.

The extension was pleasingly simple. You just have to follow the WebGPU guidelines:
No read/write of the same buffer in the same pass. It follows that a compute shader can't compute its own dispatchSize. That requires a separate compute that runs with a normal dispatch. Since it usually only needs a count of 1, it's definitely worth it for recursive algorithms. This is how it works in Unreal and Unity as well.

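// Setup
// Three u32 values: the workgroup counts in x, y and z
const dispatchBuffer = new IndirectStorageBufferAttribute( new Uint32Array( 3 ), 1 );

// computeDispatchShader is created with Fn() or wgslFn() and writes the dispatch size into dispatchBuffer

const computeDispatchSize = computeDispatchShader( computeDispatchShaderParams ).compute( 1 );


// In the render loop
renderer.compute( computeDispatchSize ); // normal dispatch: computes the next dispatch size
renderer.compute( someComputeNode, dispatchBuffer ); // indirect dispatch: size is read from dispatchBuffer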
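
As a rough illustration of what the dispatch-size compute has to produce, here is a CPU-side sketch of the arithmetic. In the PR this runs on the GPU; the `writeIndirectArgs` name and the workgroup size of 64 are assumptions for the example. WebGPU reads the indirect arguments as three consecutive 32-bit unsigned integers (workgroup counts in x, y and z), and 65535 is the default value of the `maxComputeWorkgroupsPerDimension` limit.

```javascript
// Assumed workgroup size; must match the shader that is dispatched indirectly.
const WORKGROUP_SIZE = 64;
// Default WebGPU limit for maxComputeWorkgroupsPerDimension.
const MAX_WORKGROUPS_PER_DIMENSION = 65535;

// Writes [ x, y, z ] workgroup counts for `remaining` work items into `args`,
// a Uint32Array of length 3. Work that does not fit into x spills into y.
function writeIndirectArgs( args, remaining ) {
  const groups = Math.ceil( remaining / WORKGROUP_SIZE );

  if ( groups === 0 ) {
    // Zero workgroups is a valid dispatch that does nothing.
    args[ 0 ] = 0;
    args[ 1 ] = 1;
    args[ 2 ] = 1;
    return args;
  }

  const x = Math.min( groups, MAX_WORKGROUPS_PER_DIMENSION );
  args[ 0 ] = x;
  args[ 1 ] = Math.ceil( groups / x );
  args[ 2 ] = 1;
  return args;
}
```

Because x times y can slightly exceed the number of groups actually needed, a shader dispatched this way would still need a bounds check on its linear index.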


github-actions bot commented Jul 23, 2025

📦 Bundle size

Full ESM build, minified and gzipped.

| | Before | After | Diff |
|---|---|---|---|
| WebGL | 338.37 / 78.9 | 338.37 / 78.9 | +0 B / +0 B |
| WebGPU | 561.4 / 155.29 | 561.52 / 155.31 | +120 B / +22 B |
| WebGPU Nodes | 560.01 / 155.05 | 560.13 / 155.07 | +120 B / +21 B |

🌳 Bundle size after tree-shaking

Minimal build including a renderer, camera, empty scene, and dependencies.

| | Before | After | Diff |
|---|---|---|---|
| WebGL | 469.75 / 113.6 | 469.75 / 113.6 | +0 B / +0 B |
| WebGPU | 636.14 / 172.14 | 636.26 / 172.17 | +120 B / +34 B |
| WebGPU Nodes | 590.79 / 161.38 | 590.91 / 161.4 | +120 B / +22 B |

@sunag
Collaborator

sunag commented Jul 24, 2025

The dispatchSizeOrCount property no longer seems compatible with its parameter overloading. It would be better to rename it, perhaps to just dispatchSize, and update the JSDoc.

Could you include for compute( dispatchSize) too?

@sunag sunag changed the title Introduce dispatchWorkgroupsIndirect WebGPURenderer: Introduce dispatchWorkgroupsIndirect Jul 24, 2025
@Spiri0
Contributor Author

Spiri0 commented Jul 25, 2025

The idea of calling it dispatchSize again also occurred to me. It's simply more general and covers everything that contains the size, whether an array or a buffer.

Could you include for compute( dispatchSize) too?

Please excuse me, sometimes I'm a little slow on the uptake.
Do you mean whether I can implement it so that a compute writes its own dispatchSize?
In conjunction with dispatchWorkgroupsIndirect, which reads the buffer in the same pass, that would lead to a read/write conflict.
So a compute cannot write its own dispatch size. However, it could write the dispatch size for its next iteration into a separate buffer, which can then be copied into its indirect dispatch buffer using copyBufferToBuffer before the next compute iteration starts.
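
The copy-based idea can be sketched against the raw WebGPU API as follows. This is only an illustration under assumptions, not code from this PR: the `encodeIteration` function and the `nextSizeBuffer` name are hypothetical. It assumes `indirectBuffer` was created with `INDIRECT | COPY_DST` usage, `nextSizeBuffer` with `STORAGE | COPY_SRC` usage, and that the bind group exposes only `nextSizeBuffer` as writable storage.

```javascript
// Size in bytes of the indirect arguments: three u32 workgroup counts.
const INDIRECT_ARGS_BYTES = 12;

// Encodes one iteration: run the compute with the current indirect size,
// then copy the size it wrote for the next iteration into the indirect buffer.
function encodeIteration( device, { pipeline, bindGroup, indirectBuffer, nextSizeBuffer } ) {
  const encoder = device.createCommandEncoder();

  const pass = encoder.beginComputePass();
  pass.setPipeline( pipeline );
  pass.setBindGroup( 0, bindGroup );
  // Reads the workgroup counts from indirectBuffer at byte offset 0.
  pass.dispatchWorkgroupsIndirect( indirectBuffer, 0 );
  pass.end();

  // Outside the pass, so the shader's write and the indirect read never touch
  // the same buffer within one dispatch.
  encoder.copyBufferToBuffer( nextSizeBuffer, 0, indirectBuffer, 0, INDIRECT_ARGS_BYTES );

  device.queue.submit( [ encoder.finish() ] );
}
```

Calling `encodeIteration` once per iteration keeps the whole loop on the GPU, at the cost of one small buffer copy per iteration.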

@sunag
Collaborator

sunag commented Jul 25, 2025

Hmm... I was referring to this: #31026

Maybe we can replace .compute( count ) with .compute( dispatchSize ) too, since the parameter is overloaded.
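
One way to picture a single overloaded dispatchSize parameter is a small resolver that branches on the argument type. The sketch below is purely hypothetical and is not the three.js implementation: the `resolveDispatch` name, the returned object shape, and the default workgroup size of 64 are all assumptions made for illustration.

```javascript
// Hypothetical helper, not part of three.js.
// Accepts a plain count, an [ x, y, z ] array of workgroup counts,
// or any other object, which is treated as an indirect dispatch buffer.
function resolveDispatch( dispatchSize, workgroupSize = [ 64 ] ) {
  if ( typeof dispatchSize === 'number' ) {
    // A count of work items: divide by the threads per workgroup.
    const threadsPerGroup = workgroupSize.reduce( ( a, b ) => a * b, 1 );
    return { kind: 'direct', size: [ Math.ceil( dispatchSize / threadsPerGroup ), 1, 1 ] };
  }

  if ( Array.isArray( dispatchSize ) ) {
    // Explicit workgroup counts; missing dimensions default to 1.
    const [ x = 1, y = 1, z = 1 ] = dispatchSize;
    return { kind: 'direct', size: [ x, y, z ] };
  }

  // Anything else is assumed to be a GPU-side buffer read at dispatch time.
  return { kind: 'indirect', buffer: dispatchSize };
}
```

With a resolver like this, the same parameter name can cover a count, an explicit size, and a buffer, which is the generality argued for above.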

@Spiri0
Contributor Author

Spiri0 commented Jul 29, 2025

The red line in the left image is the intersection of the dynamic ocean geometry with the camera's near plane. This intersection cannot be determined in a single compute. However, it can be computed precisely and efficiently using 4 different computes. You can, of course, also work with early returns in the individual shaders, but 10,000 early returns when there's nothing left to do isn't as elegant as a shader that knows from the start exactly how often it needs to run, thanks to a dispatchBuffer written by its predecessor compute shader.

image image

I've been wanting to create an underwater world for my ocean repo for a while, but I've always put it off because I simply didn't know how to efficiently implement the transition from above water to underwater. It's easy with the depth shader, but if you're exactly on the waterline in calm water, you won't see any depth. You can use tricks, or you can calculate it precisely by intersecting the ocean triangles with the camera's near plane. dispatchWorkgroupsIndirect is perfect for compute shader dispatch dependency chains and recursive algorithms.


2 participants