Error in inferring input dimensions during mapCube of a Dataset #516

@danlooo

I want to call mapCube on all variables of a Dataset within the same Zarr store at once, e.g. to convert the bands red, green, and blue in parallel. One can apply mapCube to each array separately. However, the variables share some input and output dimensions, so I want to put them into the same Zarr Dataset store and write the data directly to outdims, skipping the additional copy made by savedataset.
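For reference, the per-array approach mentioned above can be sketched roughly as follows. This is only an illustration under assumptions: the array sizes, the NamedTuple `(a=a, b=b)` and the temporary output paths are made up for the example, and each call writes to its own store, which is exactly the extra bookkeeping I would like to avoid.

```julia
using YAXArrays, Zarr
using DimensionalData

# Two example variables with the same shape (illustrative data only).
a = YAXArray(rand(10, 5))
b = YAXArray(rand(10, 5))

# One mapCube call per variable. Each call gets its own temporary Zarr
# store, so the results do NOT end up in one shared Dataset store.
results = map((a=a, b=b)) do arr
    mapCube(arr;
        indims=InDims(),
        outdims=OutDims(Ti(1:10); path=tempname(), backend=:zarr),
    ) do xout, xin
        # The output buffer comes first, the input(s) after it.
        xout .= 42
    end
end
```

`results` is a NamedTuple with one output cube per variable; combining them into a single store would still require a separate save step.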

Unfortunately, the following call

using YAXArrays
using DimensionalData
a = rand(X(1:10), Y(1:5)) |> x -> YAXArray(x.data)
b = rand(X(1:10), Y(1:5)) |> x -> YAXArray(x.data)
ds = Dataset(a=a, b=b)
res = mapCube(
    ds;
    indims=(InDims(), InDims()),
    outdims=OutDims(Ti(1:10); path=tempname(), backend=:zarr),
) do xout, xin_a, xin_b  # the output buffer comes first, then one argument per InDims
    xout .= 42
end

results in the following error:

ERROR: type Tuple has no field axisdesc
Stacktrace:
 [1] getproperty
   @ ./Base.jl:49 [inlined]
 [2] mapCube(::Function, ::Dataset; indims::Tuple{…}, outdims::OutDims, inplace::Bool, kwargs::@Kwargs{})
   @ YAXArrays.DAT ~/prj/YAXArrays.jl/src/DAT/DAT.jl:339
 [3] top-level scope
   @ REPL[12]:1

Notably, we get the same error after converting the Dataset into a tuple of YAXArrays:

using YAXArrays, Zarr
using YAXArrays: YAXArrays as YAX
using Dates

f(lo, la, t) = (lo + la + Dates.dayofyear(t))

function g(xout, lo, la, t)
    xout .= f.(lo, la, t)
end

lat_yax = YAXArray(lat(range(1, 10)))
lon_yax = YAXArray(lon(range(1, 15)))
tspan = Date("2022-01-01"):Day(1):Date("2022-01-30")
time_yax = YAXArray(YAX.time(tspan))

gen_cube = mapCube(g, (lon_yax, lat_yax, time_yax);
    indims = (InDims(), InDims(), InDims("time")),
    outdims = OutDims("time"; overwrite=true, path="my_gen_cube.zarr", backend=:zarr,
        outtype=Float32),
)
ds_t = Dataset(; r = lat_yax, g = lon_yax, t = time_yax)
gen_cube_ds = mapCube(g, ds_t;
    indims = (InDims(), InDims(), InDims("time")),
    outdims = OutDims("time"; overwrite=true, path="my_gen_cube.zarr", backend=:zarr,
        outtype=Float32),
)

The corresponding method does not have unit tests.

Workaround

Create and save a skeleton of the dataset, then fill it later in parallel by assigning to index ranges (setindex!); see the YAXArrays and xarray documentation.
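A minimal sketch of this workaround, assuming that `savedataset` accepts `skeleton=true` as described in the YAXArrays documentation and that the store can then be reopened with Zarr's `zopen` in write mode. The path, variable names, sizes and the fill value 42 are placeholders for illustration, and the loop runs sequentially here.

```julia
using YAXArrays, Zarr

# Hypothetical store location; the ".zarr" extension selects the Zarr backend.
path = tempname() * ".zarr"

# Template arrays: only shape, element type and axes matter for the skeleton.
a = YAXArray(zeros(Union{Missing, Float32}, 10, 5))
b = YAXArray(zeros(Union{Missing, Float32}, 10, 5))
ds = Dataset(a=a, b=b)

# Write only the structure of the dataset, not the data itself.
savedataset(ds; path=path, skeleton=true, overwrite=true)

# Reopen the store in write mode and fill each variable in place.
store = zopen(path, "w")
for name in ("a", "b")
    store[name][:, :] = fill(42f0, 10, 5)
end
```

Since each variable is a separate array in the store, the iterations are independent of each other and could be handed to separate tasks or workers instead of a plain loop.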
