-
-
Notifications
You must be signed in to change notification settings - Fork 209
Introduce BlockSize
#3716
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Introduce BlockSize
#3716
Conversation
bd90307
to
10cc79c
Compare
Looks very nice. Could we review the basic approach before you spend lots more time on it? |
Sure thing. Should be good to go as is and can be extended further when approved. One neat byproduct, that these changes would allow for, are non compile time sized operations on the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking good.
Looks really neat.
|
For points 1 and 2 that should be no problem - how about: Regarding 3: the interface to retrieve the value (here |
I don't like relying on the compiler to inline things that we know are known at compile time. We have avoided this in the past and preferred being explicit over relying on the compiler and then not knowing what the compiler does. |
It would be best if the |
It think I have a fix: |
I have reviewed the PR and the above discussion and all comments have been addressed. |
I’d like to discuss this one more. Not quite convinced that it’s a simplification. |
cpp/dolfinx/fem/pack.h
Outdated
for (int k = 0; k < _bs; ++k) | ||
coeffs[pos_c + k] = v[pos_v + k]; | ||
} | ||
int bs = block_size(_bs); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the point of passing BlockSize auto _bs
if the implementation using int bs
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It allows for inovocations with both a compile time block size, of std::integral_constant
type, or a runtime one, of int
type. In both cases however we then extract an integral block size bs
which gets used for the computations.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My point is that int bs;
isn't constexpr
const auto [dmap1, _bs1, cells1] = dofmap1; | ||
|
||
int bs0 = block_size(_bs0); | ||
int bs1 = block_size(_bs1); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This isn't constexpr
, so we lose the compile-time 'const-ness'.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We loose the constexpr int = ...
version. However in the case of a compile time value the block_size
callback has signature constexpr int block_size()
. Meaning the compiler will see int bs = constexpr int
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The block size get assigned to the runtime int
- it's not clear or guaranteed that the compiler can exploit that the block size is a compile time known.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Type of the unpacked value is now changed to auto
. Therefore the type is inferred from the return type of the block_size
function, which is int
for the runtime case and constexpr int
for a compile time value.
This reverts commit 6900e88.
In performance critical parts some block sizes are optimized for by compiling explicit versions with the block size being provided as a compile time constant. At the same time general runtime block sizes are supported through an argument to these functions.
This causes
Introduces a
BlockSize
concept that either holds a runtimeint
or a compile timestd::integral_constant<int, bs>
which allows to generate code paths explicitly for certain sizes, while maintaining a shared code path in both cases.