
Conversation

moukle commented Aug 22, 2025

  • Introduces CuScalar{T} = CuArray{T,0} for convenience.
  • Generalizes AnyCuDevice{Array,Scalar,Vector,VecOrMat} so they also match SubArrays.

This is useful when working with zero-dimensional CUDA arrays and makes type matching with AnyCuDevice* more ergonomic.

I haven’t added tests yet because I’m not sure if this direction is welcome.
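
For illustration, here is a minimal sketch of the dispatch ergonomics this is aiming at (hypothetical function g; CuScalar is the alias added in this PR, and the SubArray behavior of the host-side Any* aliases is assumed):

using CUDA

g(x::CuScalar) = "0-d CuArray"
g(x::AnyCuMatrix) = "matrix, or a matrix-shaped view"

A = CUDA.zeros(Float32, 4, 4)
g(cu(fill(0f0)))     # matches CuScalar
g(@view A[1:2, 1:2]) # a SubArray of a CuArray matches AnyCuMatrix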

Contributor

Your PR requires formatting changes to meet the project's style guidelines.
Please consider running Runic (git runic master) to apply these changes.

Suggested changes:
diff --git a/src/array.jl b/src/array.jl
index 03078e002..19a9e7b7c 100644
--- a/src/array.jl
+++ b/src/array.jl
@@ -123,7 +123,7 @@ end
 
 ## convenience constructors
 
-const CuScalar{T} = CuArray{T,0}
+const CuScalar{T} = CuArray{T, 0}
 const CuVector{T} = CuArray{T,1}
 const CuMatrix{T} = CuArray{T,2}
 const CuVecOrMat{T} = Union{CuVector{T},CuMatrix{T}}
@@ -372,7 +372,7 @@ is_host(a::CuArray) = memory_type(a) == HostMemory
 
 export DenseCuArray, DenseCuVector, DenseCuMatrix, DenseCuVecOrMat,
        StridedCuArray, StridedCuVector, StridedCuMatrix, StridedCuVecOrMat,
-       AnyCuArray, AnyCuScalar, AnyCuVector, AnyCuMatrix, AnyCuVecOrMat
+    AnyCuArray, AnyCuScalar, AnyCuVector, AnyCuMatrix, AnyCuVecOrMat
 
 # dense arrays: stored contiguously in memory
 #
@@ -427,7 +427,7 @@ end
 
 # anything that's (secretly) backed by a CuArray
 const AnyCuArray{T,N} = Union{CuArray{T,N}, WrappedArray{T,N,CuArray,CuArray{T,N}}}
-const AnyCuScalar{T} = AnyCuArray{T,0}
+const AnyCuScalar{T} = AnyCuArray{T, 0}
 const AnyCuVector{T} = AnyCuArray{T,1}
 const AnyCuMatrix{T} = AnyCuArray{T,2}
 const AnyCuVecOrMat{T} = Union{AnyCuVector{T}, AnyCuMatrix{T}}
diff --git a/src/device/array.jl b/src/device/array.jl
index 2d3f4e350..82c123a52 100644
--- a/src/device/array.jl
+++ b/src/device/array.jl
@@ -33,18 +33,18 @@ struct CuDeviceArray{T,N,A} <: DenseArray{T,N}
         new(ptr, maxsize, dims, prod(dims))
 end
 
-const CuDeviceScalar{T} = CuDeviceArray{T,0,A} where {A}
-const CuDeviceVector{T} = CuDeviceArray{T,1,A} where {A}
-const CuDeviceMatrix{T} = CuDeviceArray{T,2,A} where {A}
+const CuDeviceScalar{T} = CuDeviceArray{T, 0, A} where {A}
+const CuDeviceVector{T} = CuDeviceArray{T, 1, A} where {A}
+const CuDeviceMatrix{T} = CuDeviceArray{T, 2, A} where {A}
 
 # anything that's (secretly) backed by a CuDeviceArray
 export AnyCuDeviceArray, AnyCuDeviceScalar, AnyCuDeviceVector, AnyCuDeviceMatrix, AnyCuDeviceVecOrMat
 
-const AnyCuDeviceArray{T,N} = Union{CuDeviceArray{T,N},WrappedArray{T,N,CuDeviceArray,CuDeviceArray{T,N,A}}} where {A}
-const AnyCuDeviceScalar{T} = AnyCuDeviceArray{T,0}
-const AnyCuDeviceVector{T} = AnyCuDeviceArray{T,1}
-const AnyCuDeviceMatrix{T} = AnyCuDeviceArray{T,2}
-const AnyCuDeviceVecOrMat{T} = Union{AnyCuDeviceVector{T},AnyCuDeviceMatrix{T}}
+const AnyCuDeviceArray{T, N} = Union{CuDeviceArray{T, N}, WrappedArray{T, N, CuDeviceArray, CuDeviceArray{T, N, A}}} where {A}
+const AnyCuDeviceScalar{T} = AnyCuDeviceArray{T, 0}
+const AnyCuDeviceVector{T} = AnyCuDeviceArray{T, 1}
+const AnyCuDeviceMatrix{T} = AnyCuDeviceArray{T, 2}
+const AnyCuDeviceVecOrMat{T} = Union{AnyCuDeviceVector{T}, AnyCuDeviceMatrix{T}}
 
 ## array interface
 


codecov bot commented Aug 22, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 89.82%. Comparing base (c8c2142) to head (d4fb56f).
⚠️ Report is 15 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #2849      +/-   ##
==========================================
+ Coverage   89.64%   89.82%   +0.18%     
==========================================
  Files         150      150              
  Lines       13229    13232       +3     
==========================================
+ Hits        11859    11886      +27     
+ Misses       1370     1346      -24     


maleadt (Member) commented Sep 2, 2025

Introduces CuScalar{T} = CuArray{T,0} for convenience.

Is there precedent for this in the Julia ecosystem? AFAIK we normally use Ref for these, and we have CuRef/CuRefValue (which are also a bit cheaper than a full-fledged Array). 0-dimensional arrays aren't that common in my experience, so I'm not sure if they're worth their own alias.
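
For comparison, a minimal host-side sketch of the two options (illustration only; assumes CuRefValue supports reading the stored value back via getindex, analogous to Base.Ref):

using CUDA

a = cu(fill(1f0))       # 0-dimensional CuArray: a full-fledged array object
r = CuRef{Float32}(1f0) # CuRefValue: a lighter single-value device box

CUDA.@allowscalar a[]   # scalar read from the 0-d array
r[]                     # read the value back from the ref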

Generalizes AnyCuDevice{Array,Scalar,Vector,VecOrMat} so they also match SubArrays.

I guess that's fine, but beware that the Any* aliases are problematic and something I'd like to get rid of if possible. Personally, I haven't had the need to accurately type device-side functions like that. Could you share your motivation/use case?

maleadt added the labels cuda array (Stuff about CuArray) and speculative (Not sure about this one yet) on Sep 2, 2025
moukle (Author) commented Sep 2, 2025

Is there precedent for this in the Julia ecosystem?

There is ZeroDimensionalArrays.jl.

AFAIK we normally use Ref for these, and we have CuRef/CuRefValue (which are also a bit cheaper than a full-fledged Array). 0-dimensional arrays aren't that common in my experience, so I'm not sure if they're worth their own alias.

I tried to get this working, but couldn't figure out how:

using CUDA

function kernel(x)
    x[] += 1
    return
end

arr = fill(0) |> cu   # 0-dimensional CuArray
ref = CuRef{Int64}(0) # CuRefValue

@cuda kernel(arr) # works
@cuda kernel(ref) # fails:

Argument 2 to your kernel function is of type CUDA.CuRefValue{Int64}, which is not a bitstype:
  .buf is of type CUDA.Managed{CUDA.DeviceMemory} which is not isbits.
    .stream is of type CuStream which is not isbits.
      .ctx is of type Union{Nothing, CuContext} which is not isbits.

And when adding the Adapt adaptations:

using Adapt
Adapt.@adapt_structure CUDA.CuRefValue

@cuda kernel(ref)

ERROR: LoadError: CuRef only supports element types that are allocated inline.
CUDA.Managed{CUDA.DeviceMemory} is a mutable type

I guess that's fine, but beware that the Any* aliases are problematic and something I'd like to get rid of if possible. Personally, I haven't had the need to accurately type device-side functions like that. Could you share your motivation/use case?

I am doing Simulated Annealing and Parallel Tempering with multiple replicas.
Thus I sometimes pass a view of the spins and sometimes the full array (when working with only one replica).

That's also where I use the scalar dispatch. It looks something like this:

function sweep!(spins::AnyCuDeviceMatrix, energies::AnyCuDeviceVector, ...)
    replica = blockIdx().x
    x = @view spins[:, replica]
    e = @view energies[replica]

    sweep!(x, e, ...)
end

function sweep!(spins::AnyCuDeviceVector, energy::AnyCuDeviceScalar, ...)
    ...
end
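
For reference, a self-contained toy version of this pattern (a hypothetical demo_sweep! with one thread per block, not the actual sweep! above; with this PR's aliases, the views below would also match the AnyCuDeviceVector/AnyCuDeviceScalar methods):

using CUDA

function demo_sweep!(spins, energies)
    replica = blockIdx().x
    x = @view spins[:, replica] # SubArray backed by a CuDeviceMatrix
    e = @view energies[replica] # 0-d SubArray of a CuDeviceVector
    for i in eachindex(x)
        e[] += x[i]
    end
    return
end

spins    = CUDA.ones(Float32, 16, 4)
energies = CUDA.zeros(Float32, 4)
@cuda blocks=4 demo_sweep!(spins, energies)
@assert Array(energies) == fill(16f0, 4)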
