Skip to content

RFC: Generating code to filter out explicit zeros in results #289

Open
@stephenchouca

Description

@stephenchouca

Summary

This RFC proposes adding a new padded property to mode formats, which describes whether or not explicit zeros are stored. The lowering machinery can use this property to determine whether/how to generate code that filters out explicit zeros in the result.

Motivation

Currently, an expression such as a(i) = b(i), where a is compressed and b is dense, is always lowered to the following code:

// Algorithm 1
int32_t pa1 = 0;
for (int32_t i = 0; i < b1_dimension; i++) {
    a_vals[pa1] = b_vals[i];
    pa1++;
}

which stores all components of b into a, including all the zero components. This is often not desirable, since the whole point of having compressed formats in the first place is to avoid needing to store zeros. Instead, we might want to generate the following code:

// Algorithm 2
int32_t pa1 = 0;
for (int32_t i = 0; i < b1_dimension; i++) {
    if (b_vals[i] != 0.0) {
        a_vals[pa1] = b_vals[i];
        pa1++;
    }
}

which only stores non-zero components of b into a. This is clearly beneficial if most components of b are zeros. However, having to filter out zeros incurs additional performance overhead, so if we know that b contains only (or even mostly) nonzeros then Algorithm 1 would actually be preferable. Thus, we want TACO to be able to generate both Algorithms 1 and 2.

Proposed Changes

I propose adding a new padded property to mode formats. A padded mode format may store explicit zero values (or subtensors), whereas an unpadded mode format only stores nonzero values (or subtensors). The lowering machinery can then use this property to determine whether (and how) to generate code that filters out explicit zeros in the result. For instance:

  • If a from the earlier example is stored in a padded version of the compressed mode format, then the result is allowed to store explicit zeros.
    Thus, the lowering machinery can emit Algorithm 1 for the vector assignment, since explicit zeros do not need to be filtered out.
  • If a is stored in an unpadded version of the compressed mode format and b is stored in a padded version of the dense mode format, then the input may contain explicit zeros but the result must not store explicit zeros.
    Thus, the lowering machinery must emit Algorithm 2 for the vector assignment in order to filter out explicit zeros.
  • If b is also stored in an unpadded version of the dense mode format though, then the lowering machinery can infer that the result will not contain zeros.
    Thus, the lowering machinery can again emit Algorithm 1 for the vector assignment, since there will not be explicit zeros that need to be filtered out.

For matrices and higher-order tensors, arbitrary combinations of padded and unpadded mode formats may be used to store tensor dimensions. CSR without explicit zeros, for instance, can be expressed as (dense, compressed) where the dense row dimension is padded (to indicate that empty rows may still be stored in the pos array) and the compressed column dimension is unpadded. By contrast, using padded compressed to store the column dimension describes CSR with explicit zeros, while using unpadded dense to store the row dimension describes a CSR matrix with no empty row.

Metadata

Metadata

Assignees

Labels

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions