allow hash() to take multiple arguments similar to pack() #1526

Closed
wants to merge 4 commits into from
14 changes: 1 addition & 13 deletions libraries/chain/include/eosio/chain/action.hpp
@@ -96,20 +96,8 @@ namespace eosio { namespace chain {
    };

    inline digest_type generate_action_digest(const action& act, const vector<char>& action_output) {
-      std::array<digest_type,2> hashes;
       const action_base& base = act;
-
-      fc::sha256::encoder enc;
-      fc::raw::pack(enc, base);
-      hashes[0] = enc.result();
-
-      enc.reset();
-      fc::raw::pack(enc, act.data, action_output);
-      hashes[1] = enc.result();
-
-      enc.reset();
-      fc::raw::pack(enc, hashes);
-      return enc.result();
+      return fc::sha256::hash(fc::sha256::hash(base), fc::sha256::hash(act.data, action_output));
    }

 } } /// namespace eosio::chain
29 changes: 29 additions & 0 deletions libraries/libfc/include/fc/crypto/hash_concepts.hpp
@@ -0,0 +1,29 @@
#pragma once

#include <tuple>
#include <type_traits>
#include <cstdint>

namespace fc {

template<typename Arg0>
concept FirstArgDecaysToCharPtr =
std::is_same_v<std::decay_t<Arg0>, char*> ||
std::is_same_v<std::decay_t<Arg0>, const char*>;

template<typename Arg1>
concept SecondArgIsConvertibleToUint32 =
std::is_convertible_v<std::decay_t<Arg1>, uint32_t>;

/* Used on template<typename... T> hash_type hash(const T&... t); to prevent calls such as
* std::vector<char> foo;
* hash(foo.data(), foo.size())
* calling the variadic hash instead of the existing hash(const char* d, uint32_t dlen)
*/
template<typename... T>
concept NotTwoArgsCharUint32 =
sizeof...(T) != 2 ||
!FirstArgDecaysToCharPtr<std::tuple_element_t<0, std::tuple<T...>>> ||
!SecondArgIsConvertibleToUint32<std::tuple_element_t<1, std::tuple<T...>>>;

}
10 changes: 6 additions & 4 deletions libraries/libfc/include/fc/crypto/ripemd160.hpp
@@ -1,7 +1,8 @@
 #pragma once

 #include <fc/fwd.hpp>
-#include <fc/io/raw_fwd.hpp>
+#include <fc/crypto/hash_concepts.hpp>
+#include <fc/io/raw.hpp>
 #include <fc/reflect/typename.hpp>

 namespace fc{
@@ -25,11 +26,12 @@ class ripemd160
     static ripemd160 hash( const char* d, uint32_t dlen );
     static ripemd160 hash( const std::string& );

-    template<typename T>
-    static ripemd160 hash( const T& t )
+    template<typename... T>
+    requires NotTwoArgsCharUint32<T...>
+    static ripemd160 hash( const T&... t )
     {
       ripemd160::encoder e;
-      fc::raw::pack(e,t);
+      fc::raw::pack(e,t...);
       return e.result();
     }

9 changes: 6 additions & 3 deletions libraries/libfc/include/fc/crypto/sha1.hpp
@@ -1,6 +1,8 @@
 #pragma once
 #include <fc/fwd.hpp>
 #include <fc/string.hpp>
+#include <fc/crypto/hash_concepts.hpp>
+#include <fc/io/raw.hpp>

 namespace fc{

@@ -20,11 +22,12 @@ class sha1
     static sha1 hash( const char* d, uint32_t dlen );
     static sha1 hash( const std::string& );
Member

Personally I think we should rename these to hash_str and remove the complicated template concepts.

Member

Maybe safest to rename the variadic one to hash_values so it is clear which one is in use.

Member Author

Could be a viable option. I was reluctant to make changes across existing code, but I can try and see just how pervasive those changes would be. (Many tests did fail without the complicated concepts, though possibly only due to a small number of root causes.)

Contributor

Alternatively, you could just add:

   template<typename T1, typename T2, typename... T>
   static sha256 hash( const T1& t1, const T2& t2, const T&... t )
    {
      sha256::encoder e;
      fc::raw::pack(e, t1, t2, t...);
      return e.result();
    }

to support the case of hash being called with two or more parameters. I believe it would require no change to existing code and no new concepts.

Member Author

I don't think that solves the problem though. Calls that pass a std::vector's .data() and .size() need to go through the 'old' overload, e.g.

#include <iostream>
#include <vector>
#include <string>
#include <cstddef>
#include <cstdint>
#include <typeinfo>

struct sha256 {};

struct hasher {
    static sha256 hash(const char* d, uint32_t dlen) {
        std::cout << "CALLED: hash(const char*, uint32_t)" << std::endl;
        return {};
    }

    static sha256 hash(const std::string& s) {
        std::cout << "CALLED: hash(const std::string&)" << std::endl;
        return {};
    }

    static sha256 hash(const sha256& h) {
        std::cout << "CALLED: hash(const sha256&)" << std::endl;
        return {};
    }

    template<typename T1, typename T2, typename... T>
    static sha256 hash( const T1& t1, const T2& t2, const T&... t )
    {
        std::cout << "CALLED: Variadic hash"<< std::endl;
        return {};
    }
};

int main() {
    std::vector<char> wif_bytes = {'t', 'e', 's', 't'};
    hasher::hash(wif_bytes.data(), wif_bytes.size());
    return 0;
}

Running this prints CALLED: Variadic hash, but we need CALLED: hash(const char*, uint32_t).
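
A sketch of why the unconstrained template wins in that example, and of how the outcome depends on the argument types. The `route` enum is hypothetical and replaces the printouts; the overload shapes follow the example above. When the arguments already have the exact parameter types, both candidates match exactly and the non-template function is preferred; with `char*` and `std::size_t` the template is the better match.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical marker for which overload was selected.
enum class route { raw_bytes, variadic };

struct hasher {
   static route hash(const char*, std::uint32_t) { return route::raw_bytes; }

   template<typename T1, typename T2, typename... T>
   static route hash(const T1&, const T2&, const T&...) { return route::variadic; }
};

// data() yields char*, which needs a qualification conversion to const char*,
// and on 64-bit platforms size() needs an integral conversion to uint32_t.
// The template binds both arguments exactly, so it is the better candidate.
inline route call_with_vector() {
   std::vector<char> wif_bytes{'t', 'e', 's', 't'};
   return hasher::hash(wif_bytes.data(), wif_bytes.size());
}

// With arguments that already have the parameter types, both candidates are
// exact matches and the non-template function is preferred.
inline route call_with_exact_types() {
   const char* d = "test";
   std::uint32_t dlen = 4;
   return hasher::hash(d, dlen);
}
```

This is why existing call sites cannot be assumed to keep their old behaviour: whether they do depends on the static types at each call.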

Member Author

Possibly the std::is_convertible_v<std::decay_t<T>, size_t> is too loose, since even an fc::unsigned_int will match it (due to it having operator uint32_t()), so you can end up with a nasty surprise.

Contributor

This is a valid concern for the CStyleStringPtr version.

In my comment I was more worried about the variadic hash version (hash( const T&... t )) catching unexpected calls. One way to check that this doesn't happen would be to add an assert(0) in its implementation and run CI/CD on code that is not yet using the variadic version. If the assert-enabled build passes, that is a good indication that existing uses of hash did not migrate to the variadic version.

Upon reflection, I think the pattern of hash accepting a char* plus a length is not a good one. It probably would have been better to always have a single parameter to specify the object to be hashed, whether it is

  • a char* (expected to be null-terminated)
  • a std::string or std::string_view
  • a vector<T> where T is a hashable type (or maybe any iterable container where the value_type is hashable)
  • a std::span<T> where T is a hashable type
  • etc...

which would avoid the ambiguity of a two argument hash call specifying one or two values to be hashed.

Member

That can all be avoided by renaming the existing string ones to hash_chars()

Member Author

It does increasingly feel like renaming the old overloads is the best option, provided there is not too much fallout

Member Author

So the number of calls to the non-templated hash() was substantially more than I would have guessed. I prepared a commit where I replace all of them with a (I hope; I didn't audit yet) 1:1 equivalent, the new hash_raw(): c893000

But a substantial number of these don't need to stay exactly 1:1 with what they were before (like in the benchmark, or many/most of the tests), so I guess I can build a new commit on top of that one that migrates some back to plain hash() to limit the number of calls to hash_raw().


-    template<typename T>
-    static sha1 hash( const T& t )
+    template<typename... T>
+    requires NotTwoArgsCharUint32<T...>
+    static sha1 hash( const T&... t )
     {
       sha1::encoder e;
-      e << t;
+      fc::raw::pack(e,t...);
       return e.result();
     }

10 changes: 6 additions & 4 deletions libraries/libfc/include/fc/crypto/sha224.hpp
@@ -1,7 +1,8 @@
 #pragma once
 #include <unordered_map>
 #include <fc/fwd.hpp>
-#include <fc/io/raw_fwd.hpp>
+#include <fc/crypto/hash_concepts.hpp>
+#include <fc/io/raw.hpp>
 #include <fc/string.hpp>

 namespace fc
@@ -23,11 +24,12 @@ class sha224
     static sha224 hash( const char* d, uint32_t dlen );
     static sha224 hash( const std::string& );

-    template<typename T>
-    static sha224 hash( const T& t )
+    template<typename... T>
+    requires NotTwoArgsCharUint32<T...>
+    static sha224 hash( const T&... t )
     {
       sha224::encoder e;
-      fc::raw::pack(e,t);
+      fc::raw::pack(e,t...);
       return e.result();
     }

12 changes: 7 additions & 5 deletions libraries/libfc/include/fc/crypto/sha256.hpp
@@ -4,13 +4,14 @@
 #include <fc/fwd.hpp>
 #include <fc/string.hpp>
 #include <fc/platform_independence.hpp>
-#include <fc/io/raw_fwd.hpp>
+#include <fc/crypto/hash_concepts.hpp>
+#include <fc/io/raw.hpp>
 #include <boost/functional/hash.hpp>

 namespace fc
 {

-class sha256
+class sha256
 {
   public:
     sha256();
@@ -36,11 +37,12 @@ class sha256
     static sha256 hash( const std::string& );
     static sha256 hash( const sha256& );

-    template<typename T>
-    static sha256 hash( const T& t )
+    template<typename... T>
+    requires NotTwoArgsCharUint32<T...>
Member Author

The various hash types have some existing overloads of hash() such as

static sha256 hash( const char* d, uint32_t dlen );
static sha256 hash( const std::string& );
static sha256 hash( const sha256& );

This is pretty gnarly on its own because the std::string overload doesn't follow the same rules as sending a std::string through the encoder, i.e.

   sha256 a = sha256::hash(std::string("banana"));
   sha256::encoder enc;
   fc::raw::pack(enc, std::string("banana"));
   sha256 b = enc.result();
   // a != b

But the real problem is the hash( const char* d, uint32_t dlen ) overload, which will be used by, for example,

std::vector<char> foo;
sha256::hash(foo.data(), foo.size())

I need to ensure that calls like sha256::hash(foo.data(), foo.size()) continue to go to the existing overload, so I need to remove the variadic function from consideration in such cases (it is picked without this).
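
A sketch of why `a != b` in the banana example, under two assumptions that are not confirmed by the diff itself: that the std::string overload hashes only the characters, and that serializing a string writes a length prefix before the characters. The `byte_recorder` type and both helper functions are hypothetical; they record the bytes an encoder would be fed instead of hashing them.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Toy encoder that records the bytes it receives instead of hashing them.
struct byte_recorder {
   std::vector<char> bytes;
   void write(const char* d, std::size_t n) { bytes.insert(bytes.end(), d, d + n); }
};

// Assumed behaviour of a raw-bytes overload: feed the characters only.
inline std::vector<char> bytes_for_raw(const std::string& s) {
   byte_recorder e;
   e.write(s.data(), s.size());
   return e.bytes;
}

// Assumed behaviour of serialization: a length prefix, then the characters.
// A single-byte prefix is used here, which only covers lengths below 128.
inline std::vector<char> bytes_for_packed(const std::string& s) {
   byte_recorder e;
   const char len = static_cast<char>(s.size());
   e.write(&len, 1);
   e.write(s.data(), s.size());
   return e.bytes;
}
```

Because the two byte streams differ, any digest computed over them would differ as well, which is the mismatch the comment describes.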

+    static sha256 hash( const T&... t )
     {
       sha256::encoder e;
-      fc::raw::pack(e,t);
+      fc::raw::pack(e,t...);
       return e.result();
     }

21 changes: 16 additions & 5 deletions libraries/libfc/include/fc/crypto/sha3.hpp
@@ -2,7 +2,7 @@
 #include <fc/fwd.hpp>
 #include <fc/string.hpp>
 #include <fc/platform_independence.hpp>
-#include <fc/io/raw_fwd.hpp>
+#include <fc/io/raw.hpp>
 #include <boost/functional/hash.hpp>

 namespace fc
@@ -32,12 +32,23 @@ class sha3
     static sha3 hash(const std::string& s, bool is_nist=true) { return hash(s.c_str(), s.size(), is_nist); }
     static sha3 hash(const sha3& s, bool is_nist=true) { return hash(s.data(), sizeof(s._hash), is_nist); }

-    template <typename T>
-    static sha3 hash(const T &t, bool is_nist=true)
+    struct keccak {};
+    struct nist {};
Member Author

There didn't appear to be any existing usages of sha3::hash(). I opted to use tag dispatching to select between the two sha3 options and still allow multiple arguments. The downside is that you can't use sha3's hash() somewhere the hash type itself is templated, e.g.

template <typename HashType>
HashType do_something(unsigned x) {
   return HashType::hash(x);
}
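
A minimal sketch of the tag-dispatch shape described here. Only the `keccak` and `nist` tag names come from the diff; the `sha3_like` type, its fields, and the two-parameter `do_something` are hypothetical. The toy hash() just reports which variant was selected and how many values it received.

```cpp
#include <cstddef>

struct sha3_like {
   struct keccak {};
   struct nist {};

   bool        is_nist = false;
   std::size_t value_count = 0;

   // The first argument is an empty tag object that only selects the overload.
   template<typename... T>
   static sha3_like hash(keccak, const T&...) { return {false, sizeof...(T)}; }

   template<typename... T>
   static sha3_like hash(nist, const T&...) { return {true, sizeof...(T)}; }
};

// Generic code has to be told about the tag, which is the downside noted
// above: HashType::hash(x) alone matches neither tagged overload of this type.
template<typename HashType, typename Tag>
HashType do_something(Tag tag, unsigned x) {
   return HashType::hash(tag, x);
}
```

A call then reads `sha3_like::hash(sha3_like::nist{}, a, b)`, at the cost of every generic caller needing a way to supply the tag.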


+    template<typename... T>
+    static sha3 hash( keccak, const T&... t )
+    {
+      sha3::encoder e;
+      fc::raw::pack(e,t...);
+      return e.result(false);
+    }
+
+    template<typename... T>
+    static sha3 hash( nist, const T&... t )
     {
       sha3::encoder e;
-      fc::raw::pack(e, t);
-      return e.result(is_nist);
+      fc::raw::pack(e,t...);
+      return e.result(true);
     }

     class encoder
9 changes: 6 additions & 3 deletions libraries/libfc/include/fc/crypto/sha512.hpp
@@ -1,6 +1,8 @@
 #pragma once
 #include <fc/fwd.hpp>
 #include <fc/string.hpp>
+#include <fc/crypto/hash_concepts.hpp>
+#include <fc/io/raw.hpp>

 namespace fc
 {
@@ -21,11 +23,12 @@ class sha512
     static sha512 hash( const char* d, uint32_t dlen );
     static sha512 hash( const std::string& );

-    template<typename T>
-    static sha512 hash( const T& t )
+    template<typename... T>
+    requires NotTwoArgsCharUint32<T...>
+    static sha512 hash( const T&... t )
     {
       sha512::encoder e;
-      e << t;
+      fc::raw::pack(e,t...);
       return e.result();
     }

2 changes: 2 additions & 0 deletions libraries/libfc/include/fc/io/raw.hpp
@@ -176,13 +176,15 @@ namespace fc {
     } FC_RETHROW_EXCEPTIONS( warn, "fc::array<${type},${length}>", ("type",fc::get_typename<T>::name())("length",N) ) }

     template<typename Stream, typename T, size_t N>
+    requires (!std::is_same_v<std::remove_cv_t<T>, char>)
     inline void pack( Stream& s, T (&v)[N]) {
       fc::raw::pack( s, unsigned_int((uint32_t)N) );
       for (uint64_t i = 0; i < N; ++i)
         fc::raw::pack(s, v[i]);
     }

     template<typename Stream, typename T, size_t N>
+    requires (!std::is_same_v<std::remove_cv_t<T>, char>)
Member Author

When changing from raw_fwd.hpp to raw.hpp (a required change afaict), I would get ambiguous call errors between the T(&v)[N] (only present in raw.hpp) and char* overloads, due to a call such as sha256::hash("foo"). This seems like an appropriate change to keep C strings using the char* overload.

     inline void unpack( Stream& s, T (&v)[N])
     { try {
       unsigned_int size; fc::raw::unpack( s, size );