christiangnrd
Member

This may have to wait for KernelAbstractions (KA) 0.10, depending on how much `cpu=true` affects performance.

@christiangnrd
Member Author

It seems that, at least with CUDA.jl, using dynamic workgroup sizes recovers ~50% of the performance lost by switching over to KernelAbstractions. Could KA have some overhead that is smaller with dynamic workgroup sizes?
