-
Notifications
You must be signed in to change notification settings - Fork 13
allow hash()
to take multiple arguments similar to pack()
#1526
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
ensure existing hash(const char* d, uint32_t dlen) takes precedence
inline void pack( Stream& s, T (&v)[N]) { | ||
fc::raw::pack( s, unsigned_int((uint32_t)N) ); | ||
for (uint64_t i = 0; i < N; ++i) | ||
fc::raw::pack(s, v[i]); | ||
} | ||
|
||
template<typename Stream, typename T, size_t N> | ||
requires (!std::is_same_v<std::remove_cv_t<T>, char>) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When changing from raw_fwd.hpp
to raw.hpp
(a required change afaict), I would get ambiguous call errors between the T(&v)[N]
(only present in raw.hpp
) and char*
overloads, due to a call such as sha256::hash("foo")
. This seems like an appropriate change to keep C strings using the char*
overload.
template<typename T> | ||
static sha256 hash( const T& t ) | ||
template<typename... T> | ||
requires NotTwoArgsCharUint32<T...> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The various hash types have some existing overloads of hash()
such as
spring/libraries/libfc/include/fc/crypto/sha256.hpp
Lines 35 to 37 in ac0abff
static sha256 hash( const char* d, uint32_t dlen ); | |
static sha256 hash( const std::string& ); | |
static sha256 hash( const sha256& ); |
This is pretty gnarly on its own because the
std::string
overload doesn't follow the same rules as if sending a std::string
through the encoder
. i.e.
sha256 a = sha256::hash(std::string("banana"));
sha256::encoded enc;
fc::raw::pack(enc, std::string("banana"));
sha256 b = enc.result();
// a != b
but the real problem is the hash( const char* d, uint32_t dlen )
overload that will be used, for example,
std::vector<char> foo;
sha256::hash(foo.data(), foo.size())
I need to ensure that calls like sha256::hash(foo.data(), foo.size())
continue to go to the existing overload, so I need to remove the variadic function from consideration in such cases (it is picked without this).
template <typename T> | ||
static sha3 hash(const T &t, bool is_nist=true) | ||
struct keccak {}; | ||
struct nist {}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There didn't appear to be any existing usages of sha3::hash()
. I opted to use tag dispatching to select between the two sha3 options and still have multiple arguments. The downside with this is that can't use sha3's hash() somewhere the hash type itself is templated. e.g.
template <typename HashType>
HashType do_something(unsigned x) {
return HashType::hash(x);
}
@@ -20,11 +22,12 @@ class sha1 | |||
static sha1 hash( const char* d, uint32_t dlen ); | |||
static sha1 hash( const std::string& ); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Personally I think we should rename these to hash_str
and remove the complicated template concepts.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe safest to rename the variadic one to hash_values
so it is clear which one is in-use.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
could be a viable option; I was reluctant to make changes across existing code but I can try and see just how pervasive those changes would be. (many tests did fail without the complicated concepts though it's possible only due to a small number root causes)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alternatively, you could just add:
template<typename T1, typename T2, typename... T>
static sha256 hash( const T1& t1, const T2& t2, const T&... t )
{
sha256::encoder e;
fc::raw::pack(e, t1, t2, t...);
return e.result();
}
to support the case a hash
called with two or more parameters. I believe it would require no change to existing code or new concepts.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think that solves the problem though. Need std::vector .data() and .size() to go through the 'old' overload. e.g.
#include <iostream>
#include <vector>
#include <string>
#include <cstddef>
#include <cstdint>
#include <typeinfo>
struct sha256 {};
struct hasher {
static sha256 hash(const char* d, uint32_t dlen) {
std::cout << "CALLED: hash(const char*, uint32_t)" << std::endl;
return {};
}
static sha256 hash(const std::string& s) {
std::cout << "CALLED: hash(const std::string&)" << std::endl;
return {};
}
static sha256 hash(const sha256& h) {
std::cout << "CALLED: hash(const sha256&)" << std::endl;
return {};
}
template<typename T1, typename T2, typename... T>
static sha256 hash( const T1& t1, const T2& t2, const T&... t )
{
std::cout << "CALLED: Variadic hash"<< std::endl;
return {};
}
};
int main() {
std::vector<char> wif_bytes = {'t', 'e', 's', 't'};
hasher::hash(wif_bytes.data(), wif_bytes.size());
return 0;
}
when running this get CALLED: Variadic hash
but we need CALLED: hash(const char*, uint32_t)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Possibly the std::is_convertible_v<std::decay_t<T>, size_t>
is too loose, since even an fc::unsigned_int
will match it (due to it having operator uint32_t()
), so you can end up with a nasty surprise.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a valid concern for the CStyleStringPtr
version.
In my comment I was more worried about the variadic hash version (hash( const T&... t )
) catching unexpected calls. One way to check that it doesn't happen would be to add an assert(0)
in its implementation, and run the ci/cd on code that is not yet using the variadic version. If the asserton
build passes, it will be a good indication that existing uses of hash
didn't migrate to using the variadic version.
Upon reflection, I think the pattern of hash
accepting a char *
+ length
is not a good one, it probably would have been better to always have a single parameter to specify an object to be hashed, whether it is
- a
char*
( expected to be null-terminated) - a
std::string
orstd::string_view
- a
vector<T>
where T is a hashable type (or maybe any iterable container where thevalue_type
is hashable) - a
std::span<T>
where T is a hashable type - etc...
which would avoid the ambiguity of a two argument hash
call specifying one or two values to be hashed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That can all be avoided by renaming the existing string ones to hash_chars()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It does increasingly feel like renaming the old overloads is the best option, provided there is not too much fallout
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So the number of calls to the non-templated hash() was substantially more than may have guessed. I prepared a commit where I replace all them to (I hope; didn't audit yet) 1:1 equivalent of new hash_raw()
, c893000
But a substantial number of these don't need to be 1:1 exact as they were before (like in the benchmark, or many/most of tests), so I guess I can build on top of the commit a new commit that migrates some back to plain hash()
to limit the number of calls hash_raw()
This change allows passing
hash()
multiple arguments to concatenate together similar to howpack()
already does. This can allow some more concise usage ofhash()
without resorting tomake_tuple/pair()
and remembering to avoid the trap of needing to usecref
along withmake_tuple/pair()
to avoid copies.I updated
generate_action_digest()
as a good demonstration of the improved conciseness. Of course there are other locations we could potentially change too.Unfortunately there were a few gnarly quirks that came out of adding just this "simple" variadic template that I will describe below. If it is too gnarly (i.e. too risky) we can explore other approaches that aren't so invasive. For example I have a different branch main...vardigest that just creates a new variadic
digest()
call so as not to conflict with any existinghash()
usages at all.