
Conversation

bmuddha
Collaborator

@bmuddha bmuddha commented Mar 11, 2025

Now we make use of tag-based (de)serialization, which allows us to evolve the account schema with various new field types while preserving backwards compatibility.

Contributor

@taco-paco taco-paco left a comment

Overall I think this is a cool effort, and I see a case for tag-based de/serialization in the case of the registry, but the more I read the implementation, the more it reminds me of protobuf. That resemblance also highlights some parts missing from the implementation.

The important missing piece can be seen in this quote from the protobuf encoding spec:

When a message is encoded, each key-value pair is turned into a record consisting of the field number, a wire type and a payload. The wire type tells the parser how big the payload after it is. This allows old parsers to skip over new fields they don’t understand. This type of scheme is sometimes called Tag-Length-Value, or TLV.

Consider the following example; for the sake of simplicity I will use the proto DSL.

message Origin {
   string country = 1;
   string city = 2;
}

message Person {
   int32 age = 1;
   Origin origin = 2;
   string name = 3;
}

Now we update our registry contract, and Origin becomes:

message Origin {
   string country = 1; 
   string city = 2;
   string hood = 3;
}

The issue here is that an older client, once it encounters the (to it) unknown tag Origin::hood, can't advance by the proper number of bytes, since it doesn't know the size/length of hood. And if it doesn't, then Person::name will end up containing the hood value, since their tags and types are the same.

The above assumes that we skip unknown tags, which we should; otherwise there's no backwards compatibility anyway.
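
To illustrate the skipping that TLV makes possible, here is a minimal Rust sketch; the record layout (a 1-byte tag plus a 2-byte little-endian length prefix) is an assumption for illustration, not the PR's actual encoding:

fn read_fields(mut buf: &[u8]) {
    while buf.len() >= 3 {
        let tag = buf[0];
        let len = u16::from_le_bytes([buf[1], buf[2]]) as usize;
        if buf.len() < 3 + len {
            break; // truncated record
        }
        let _payload = &buf[3..3 + len];
        match tag {
            1 => { /* decode country */ }
            2 => { /* decode city */ }
            _ => { /* unknown tag: the length prefix tells us exactly how far to skip */ }
        }
        buf = &buf[3 + len..];
    }
}

Without the length component, the reader has no way to compute the 3 + len jump for a tag it doesn't recognize.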

If we really want to go with tag-based serialization, then taking the above into account, our protocol would end up pretty much identical to protobuf, hence I would suggest just using it, since it's a well-documented and battle-tested solution.

We don't even need to introduce .proto files, since we could just use prost syntax to describe our types:

#[derive(Clone, PartialEq, prost::Message)]
pub struct ValidatorInfo {
    #[prost(message, tag = "1")]
    pub identity: Option<PublicKey>,
    #[prost(message, tag = "2")]
    pub addr: Option<Url>,
    // prost has no u16 scalar, so uint32 is the closest wire type
    #[prost(uint32, tag = "3")]
    pub block_time_ms: u32,
    // bla bla .....
}

    // length-prefixed encoding: 2-byte little-endian length, then the raw bytes
    slice[..2].copy_from_slice(&(self.0.len() as u16).to_le_bytes());
    slice[2..2 + self.0.len()].copy_from_slice(self.0.as_ref());
}

fn deserialize(slice: &[u8]) -> Option<Self> {

Contributor

I would suggest using Result; Option doesn't carry enough information about where deserialization broke down.
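
A minimal sketch of what that could look like; the error type and its variants are hypothetical, not part of the PR:

/// Hypothetical error type; the variants are illustrative only.
#[derive(Debug)]
pub enum DeserializeError {
    LengthMismatch { expected: usize, found: usize },
    InvalidUtf8,
    UnknownTag(u8),
}

trait Deserialize: Sized {
    // Returning Result instead of Option pinpoints why decoding failed.
    fn deserialize(slice: &[u8]) -> Result<Self, DeserializeError>;
}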

Collaborator Author

Well, there's basically one point where deserialization might go wrong, a length mismatch, and that's it; creating an error type just for that is redundant. I've added extra logging for debug purposes anyway.

Contributor

There could be various errors associated with the data format. Say, wrong encodings, or a SocketAddrV4 that was sent as a string and not as its memory representation, DateTime, and various others.

Collaborator Author

Let's change the signature once we have those types; for now it doesn't affect anything, imho.

fn size(&self) -> usize {
    std::mem::size_of::<Self>()
}

fn serialize(&self, slice: &mut [u8]);

Contributor

I would suggest having serialize return a Result as well, since the data format may have restrictions on a particular data type. Say, as was discussed, there should be a restriction on the addr field size to 150.
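
For instance, a sketch of a fallible serialize with that addr restriction; the error type, function name, and the byte interpretation of the 150 limit are illustrative assumptions:

/// Hypothetical error type for fallible serialization.
#[derive(Debug)]
pub enum SerializeError {
    FieldTooLarge { limit: usize, actual: usize },
    BufferTooSmall,
}

const MAX_ADDR_LEN: usize = 150; // the limit discussed in this review

fn serialize_addr(addr: &[u8], out: &mut [u8]) -> Result<usize, SerializeError> {
    if addr.len() > MAX_ADDR_LEN {
        return Err(SerializeError::FieldTooLarge { limit: MAX_ADDR_LEN, actual: addr.len() });
    }
    if out.len() < 2 + addr.len() {
        return Err(SerializeError::BufferTooSmall);
    }
    // same 2-byte little-endian length prefix as the snippet above
    out[..2].copy_from_slice(&(addr.len() as u16).to_le_bytes());
    out[2..2 + addr.len()].copy_from_slice(addr);
    Ok(2 + addr.len())
}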

Collaborator Author

With the tag-based approach and Solana's built-in incentive to keep accounts small, we can allow any length for fields; if you are willing to pay for huge accounts, so be it, we won't stop you.

Contributor

Generally there could be various issues with encoding during serialization; for example, a filesystem Path may refuse to serialize itself if it contains invalid UTF-8 data.

Collaborator Author

But we are working with raw bytes anyway; you can send us an invalid URL, for example, and we will happily store it for you, because when clients try to get that URL from the bytes, something like from_utf8 will fail and they won't use your endpoint, so it's your loss.
All in all you have a point, but for now let's not introduce extra complexity for future hypothetical issues which we might never encounter.


macro_rules! extract {
    ($field: ident) => {
        builder.$field.ok_or(ProgramError::InvalidAccountData)?

Contributor

This doesn't allow for compatibility with an older client. It will return an error if a field is missing, whereas it should choose a default value.

Collaborator Author

The idea is that new fields are always serialized after old ones, i.e. if we add a new field it will be laid out at the end of the byte string, so older clients can stop deserializing once they hit an unknown field, as everything afterwards is new stuff; hence the break statement in the builder loop: https://github.com/magicblock-labs/magic-domain-program/pull/6/files#diff-4ee356d91730bc0561e18e949120c3eb25918ed5b7bf1bc89e2f7ed4c0f13503R119

Basically, by the time you encounter a field you can't deserialize, you already have all the fields you know of and can proceed with your older logic.
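
A condensed, self-contained sketch of that append-only scheme; the tag constants, field types, and the 2-byte length prefix are illustrative stand-ins for the PR's actual definitions:

use solana_program::program_error::ProgramError;

const IDENTITY_TAG: u8 = 1; // hypothetical tag values
const ADDR_TAG: u8 = 2;

#[derive(Default)]
struct RecordBuilder {
    identity: Option<Vec<u8>>,
    addr: Option<Vec<u8>>,
}

// Fields are laid out in the order their tags were introduced, so the first
// unknown tag means "everything from here on is newer than this client".
fn populate(mut buf: &[u8], error: bool) -> Result<RecordBuilder, ProgramError> {
    let mut builder = RecordBuilder::default();
    while let Some((&tag, rest)) = buf.split_first() {
        if rest.len() < 2 {
            break; // truncated input
        }
        let len = u16::from_le_bytes([rest[0], rest[1]]) as usize;
        if rest.len() < 2 + len {
            break;
        }
        let field = rest[2..2 + len].to_vec();
        buf = &rest[2 + len..];
        match tag {
            IDENTITY_TAG => builder.identity = Some(field),
            ADDR_TAG => builder.addr = Some(field),
            _ if error => return Err(ProgramError::InvalidAccountData),
            _ => break, // older client: stop at the first unknown tag
        }
    }
    Ok(builder)
}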

Contributor

We could implement it this way, but notice that we call RecordBuilder::populate(buffer, true) everywhere, hence we throw an error

Contributor

Basically, by the time you encounter a field you can't deserialize, you already have all the fields you know of and can proceed with your older logic.

Here I would refer to this part of my review:

The issue here is that an older client, once it encounters the (to it) unknown tag Origin::hood, can't advance by the proper number of bytes, since it doesn't know the size/length of hood. And if it doesn't, then Person::name will end up containing the hood value, since their tags and types are the same.

Collaborator Author

Well, that's a good point, but I kinda doubt we will be using composite types here.

tags::ADDR_TAG => {
    builder.addr = deserializer.read_field();
}
_ if error => return Err(ProgramError::InvalidAccountData),

Contributor

@taco-paco taco-paco Mar 12, 2025

Since the deserialize functions use error = true, an older client, once it sees a new (to it) unknown tag, will return an error, while it should just ignore it.

Collaborator Author

Older clients can also just not use error = true; it's up to them whether they want to error or not. I'll add clarifying doc comments.

Contributor

@taco-paco taco-paco Mar 12, 2025

Well, the issue here is that they would have to copy-paste the implementation to change the flag, since this is hardcoded for now on the contract side. Also, this would be quite painful to version and maintain.

    builder.addr = deserializer.read_field();
}
_ if error => return Err(ProgramError::InvalidAccountData),
_ => break,

Contributor

Suggested change
- _ => break,
+ _ => continue,

Contributor

In tag-based deserialization such fields are usually ignored.

Collaborator Author

In order to ignore them we would have to, as you pointed out, use tag-length encoding, which adds extra complexity and space requirements. Instead we lay out the fields in the strict chronological order of their introduction; thus once you encounter an unknown tag, you are done: there's nothing beyond it that you know of, and most likely (considering you are using our SDK) you already have all the fields your code is aware of.

Contributor

But this doesn't allow for complex types at all. Assume:

struct ValidatorInfo {
    // ... some fields

    pub region: Region,
    pub block_time_ms: u32,
}

struct Region {
    pub continent: Continent,
}

enum Continent {
    // ...
}

But then we decide to match validators based on country:

struct Region {
    pub continent: Continent,
    pub country: String,
}

The above will break everything.

@taco-paco
Contributor

We don't even need to introduce .proto files, since we could just use prost syntax to describe our types

It also looks like some Solana programs took a similar turn.

@bmuddha
Collaborator Author

bmuddha commented Mar 12, 2025

I'd like to avoid reliance on proto integration at all costs. First, we only have one type that we serialize; unless we have 10 more, the use of schema-based serde is unjustified. Second, we are talking about smart contracts, and we should keep them as nimble as possible, avoiding any extra code bloat and runtime cost.

@taco-paco
Contributor

I'd like to avoid reliance on proto integration at all costs

That's why I proposed that we could just use prost; we wouldn't use protoc at all, since we don't need compilation to anything other than Rust.

First, we only have one type that we serialize; unless we have 10 more.

But the idea of the PR implies that our types and Record fields will grow, hence we need to ensure that de/serialization keeps working correctly.

the use of schema-based serde is unjustified

Not sure what you mean here. The proposed approach uses the same tags as protobuf does.

Second, we are talking about smart contracts, and we should keep them as nimble as possible, avoiding any extra code bloat and runtime cost.

Both borsh and prost are reasonably small; we can test here once the implementation is polished, but I'd say the binary sizes would be pretty similar.

@taco-paco
Contributor

Anyway, I think we could try playing with prost here, at least for the sake of comparing binary sizes, which maybe I can try to do. But it would also be good to hear other opinions on this matter @GabrielePicco @thlorenz

@thlorenz

Interesting related post.

IMHO we should stick with serializers commonly used in Solana programs: either borsh or a more efficient one that we already use in the delegation program.

We should try to use one or two of those options across all of our programs so developers onboarding don't have to learn a different serializer each time.

Afaik prost is just a library to generate Rust code for protobuf types, and if we don't want to use protobuf we shouldn't use prost either. I have never seen protobuf used inside a Solana program, and even though the benchmark linked above includes a flatbuffers case, I have never seen real-world use of that in a program either.

@thlorenz

thlorenz commented Mar 12, 2025

Also, to accommodate types growing or changing, we can nest them in enums: for instance, have a V1 with specific fields, a V2 with added fields, a V3, and so on; then the discriminator of that enum will automatically figure out how to deserialize a specific account.

This is how it is commonly done in Solana; I've seen this, for instance, at Metaplex.

The nice thing about this is that deserialization will automatically figure out which version to use, and in the rest of the code you just match on the enum to get the correct fields out of whichever version of the account you are using.

Essentially the enum discriminator functions as a tag in that case.
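
A minimal sketch of that pattern with borsh; the type and field names are illustrative, not an existing definition:

use borsh::{BorshDeserialize, BorshSerialize};

#[derive(BorshSerialize, BorshDeserialize)]
pub struct ValidatorInfoV1 {
    pub identity: [u8; 32],
    pub addr: String,
}

#[derive(BorshSerialize, BorshDeserialize)]
pub struct ValidatorInfoV2 {
    pub identity: [u8; 32],
    pub addr: String,
    pub block_time_ms: u32, // field added in V2
}

// The enum discriminant (a 1-byte prefix in borsh) acts as the version tag,
// so deserialization picks the right layout automatically.
#[derive(BorshSerialize, BorshDeserialize)]
pub enum ValidatorInfo {
    V1(ValidatorInfoV1),
    V2(ValidatorInfoV2),
}

Readers then match on the variant to pull the fields out of whichever version they got.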

@bmuddha
Collaborator Author

bmuddha commented Mar 12, 2025

The nice thing about this is that deserialization will automatically figure out which version to use, and in the rest of the code you just match on the enum to get the correct fields out of whichever version of the account you are using.

It's not backwards compatible: if you introduce V3, for example, while client code is still stuck on V2, then there's no way it can deserialize the new version of your enum.

@thlorenz

The idea is that you update the client as well whenever a new version of the account is added, or a new version of the program is released in general.
Client SDKs are usually generated from IDLs or similar, so the upgrade is very fast.
Then you just publish a minor version bump, and all clients that have their version range set to automatically pick up minor upgrades will get it.

That is very common, it's how it is normally done, and it's usually not a problem.
Especially with programs that are not user facing, I don't see this as an issue at all.

It is a much simpler and more robust solution than the tagging approach with serde, in my opinion.

@bmuddha
Collaborator Author

bmuddha commented Mar 13, 2025

But that implies you'll be forcing the client to upgrade, and if they don't (because they weren't following the newsletter, for example), then their code will suddenly stop working, which is not ideal.

@bmuddha bmuddha closed this Mar 27, 2025