CUDA generator emits invalid COO pack/unpack code

The command-line parameter `-write-source=<filename>` writes several functions to the specified file, including utility functions and macros. I think it is intended to write everything needed to run the compute kernel. It is the only way I know of to get the `TACO_MIN` and `TACO_MAX` macros, for example.

When a GPU schedule is specified, the generated pack and unpack functions do not compile: they reference identifiers that are never declared.

```sh
$ bin/taco 'C(i,j) = A(i,k) * B(k,j)' -f=b:ds -f=b:sd -s="split(i,i0,i1,2),split(j,j0,j1,2),split(k,k0,k1,2),reorder(i0,j0,k0,i1,j1,k1),parallelize(i0,GPUBlock,IgnoreRaces),parallelize(j0,GPUThread,Atomics)" -write-source=source.cu
$ nvcc --gpu-architecture=compute_61 -c -o source.o source.cu
source.cu(295): error: identifier "A_COO1_pos" is undefined

source.cu(299): error: identifier "A_COO1_crd" is undefined

source.cu(307): error: identifier "A_COO2_crd" is undefined

source.cu(308): error: identifier "A_COO_vals" is undefined

source.cu(336): error: identifier "B_COO1_pos" is undefined

source.cu(340): error: identifier "B_COO1_crd" is undefined

source.cu(348): error: identifier "B_COO2_crd" is undefined

source.cu(349): error: identifier "B_COO_vals" is undefined

source.cu(422): error: identifier "C_COO1_pos_ptr" is undefined

source.cu(423): error: identifier "C_COO1_crd_ptr" is undefined

source.cu(424): error: identifier "C_COO2_crd_ptr" is undefined

source.cu(425): error: identifier "C_COO_vals_ptr" is undefined

12 errors detected in the compilation of "source.cu".

```


The generated pack functions from the C codegen are correct.

```sh
$ bin/taco 'C(i,j) = A(i,k) * B(k,j)' -f=b:ds -f=b:sd -s="split(i,i0,i1,2),split(j,j0,j1,2),split(k,k0,k1,2),reorder(i0,j0,k0,i1,j1,k1),parallelize(i0,CPUThread,Atomics)" -write-source=source.c 
$ gcc -c -o source.o source.c  
```

The difference is that, in `source.cu`, the `pack_A`, `pack_B` and `unpack` functions do not take the COO array parameters that `nvcc` reports as undefined.

```sh
$ grep pack source.c source.cu | grep "int "
source.c:int pack_A(taco_tensor_t *A, int* A_COO1_pos, int* A_COO1_crd, int* A_COO2_crd, double* A_COO_vals) {
source.c:int pack_B(taco_tensor_t *B, int* B_COO1_pos, int* B_COO1_crd, int* B_COO2_crd, double* B_COO_vals) {
source.c:int unpack(int** C_COO1_pos_ptr, int** C_COO1_crd_ptr, int** C_COO2_crd_ptr, double** C_COO_vals_ptr, taco_tensor_t *C) {
source.cu:int pack_A(taco_tensor_t *A) {
source.cu:int pack_B(taco_tensor_t *B) {
source.cu:int unpack(taco_tensor_t *C) {
```

If I copy the parameter lists over from `source.c` to `source.cu`, the CUDA version builds successfully.
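For anyone who needs to apply the same workaround repeatedly, here is a rough sketch of how the manual copy could be automated. It is not part of taco: the helper name `copy_parameter_lists` and the regular expression are my own assumptions, and the pattern only covers the one-line `int pack_X(...) {` / `int unpack(...) {` signature shape shown in the `grep` output above.

```python
import re

# Assumption: signatures look like the grep output above, i.e. a single line
# of the form `int pack_A(...) {` or `int unpack(...) {`.
SIGNATURE = re.compile(r"^int (pack_\w+|unpack)\(([^)]*)\) \{$", re.MULTILINE)


def copy_parameter_lists(c_source, cuda_source):
    """Return cuda_source with pack/unpack parameter lists taken from c_source."""
    # Map each function name to the parameter list found in the C output.
    c_params = {m.group(1): m.group(2) for m in SIGNATURE.finditer(c_source)}

    def patch(match):
        name = match.group(1)
        # Keep the original parameters if the C output has no such function.
        params = c_params.get(name, match.group(2))
        return "int " + name + "(" + params + ") {"

    return SIGNATURE.sub(patch, cuda_source)


# Small demonstration using two of the signatures from the grep output.
c_text = (
    "int pack_A(taco_tensor_t *A, int* A_COO1_pos, int* A_COO1_crd, "
    "int* A_COO2_crd, double* A_COO_vals) {\n"
    "int unpack(int** C_COO1_pos_ptr, int** C_COO1_crd_ptr, "
    "int** C_COO2_crd_ptr, double** C_COO_vals_ptr, taco_tensor_t *C) {\n"
)
cuda_text = "int pack_A(taco_tensor_t *A) {\nint unpack(taco_tensor_t *C) {\n"
patched = copy_parameter_lists(c_text, cuda_text)
print(patched)
```

In practice the two strings would be the contents of `source.c` and `source.cu`, and the result would be written back to `source.cu` before running `nvcc`. This only papers over the symptom; the real fix belongs in the CUDA codegen.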

The full generated code can be found here: https://gist.github.com/Infinoid/a3f64f5b2c6a291f381d3274dd567d53

The schedules used in these examples come from [the scheduling.lowerSparseMatrixMul test case](https://github.com/tensor-compiler/taco/blob/master/test/tests-scheduling.cpp#L344-L352).

