Skip to content

Conversation

ibraheemdev
Copy link
Member

@ibraheemdev ibraheemdev commented Aug 8, 2025

The overall design is laid out in this document. All the new functionality is gated under the persistence feature. However, this PR has a couple of limitations:

  • All transitive dependencies are forced to be serializable.
  • There is no easy way to update inputs after they are deserialized.
  • Query hashing is not implemented.
  • Accumulators and supertypes are not serializable.

Copy link

netlify bot commented Aug 8, 2025

Deploy Preview for salsa-rs canceled.

Name Link
🔨 Latest commit 4103798
🔍 Latest deploy log https://app.netlify.com/projects/salsa-rs/deploys/6896b522cda9d500081f1913

@ibraheemdev ibraheemdev force-pushed the ibraheem/persistent-caching branch from 8931c07 to caf7925 Compare August 8, 2025 03:43
deserialize_fn: Some(deserialize_fn),
}) => Some(quote! { (#serialize_fn, #deserialize_fn) }),
// Fallback to the `{Serialize, Deserialize}` implementations tuples if only one of the
// two is provided.
Copy link
Member Author

@ibraheemdev ibraheemdev Aug 8, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a little bit of weirdness here in that all of the following are accepted:

#[salsa::tracked(serialize)]
#[salsa::tracked(deserialize)]
#[salsa::tracked(serialize, deserialize)]
#[salsa::tracked(serialize = ...)]
#[salsa::tracked(deserialize = ...)]
#[salsa::tracked(serialize = ..., deserialize = ...)]
struct ...;

But I'm not sure disallowing any of those is better. With the derive-macro API, we should be able to get rid of this anyways and just use serde::{Serialize, Deserialize} directly.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I moved this into a persist attribute to make things more consistent:

#[salsa::tracked(persist)]
#[salsa::tracked(persist(serialize = ...))]
#[salsa::tracked(persist(deserialize = ...))]
#[salsa::tracked(persist(serialize = ..., deserialize = ...))]
struct ...;

We don't accept the serialize and deserialize arguments on tracked functions because both the input and output types need to implement serde::{Serialize, Deserialize}, and the arguments would be ambiguous.

Copy link

codspeed-hq bot commented Aug 8, 2025

CodSpeed Performance Report

Merging #967 will improve performances by 8.88%

Comparing ibraheemdev:ibraheem/persistent-caching (4103798) with master (940f9c0)

Summary

⚡ 3 improvements
✅ 9 untouched benchmarks

Benchmarks breakdown

Benchmark BASE HEAD Change
amortized[Input] 2.3 µs 2.1 µs +7.39%
amortized[InternedInput] 2.3 µs 2.1 µs +8.88%
new[InternedInput] 4.2 µs 4 µs +5.43%

@ibraheemdev ibraheemdev force-pushed the ibraheem/persistent-caching branch 5 times, most recently from 77b1f21 to 15a3110 Compare August 8, 2025 04:08
src/database.rs Outdated
let DeserializeIngredients(zalsa) = self;

while let Some(name) = access.next_key::<&str>()? {
// TODO: Serialize the ingredient index directly.
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Technically the debug name is not a unique identifier, but I found it more helpful for debugging than the ingredient index. It could also just be a field on each ingredient.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A separate (maybe even optinal) field would make sense to me.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I removed the debug name for now, it's unfortunately very annoying to add extra fields because serde's macros aren't compatible with DeserializeSeed (which we need for some fields, because we deserialize into the table directly).

Copy link
Contributor

@MichaReiser MichaReiser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is great. I didn't read line by line and instead focused on the overall design.

What's your plan with this PR? Do you want to land it behind a persistence feature or do you want to keep iterating in a separate branch?

src/database.rs Outdated
Comment on lines 205 to 210
// Ensure structs are serialized before tracked functions, as deserializing a
// memo requires its input struct to have been deserialized.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's fine deferring this to later but that will mean that we always serialize all tracked structs even if some query instances are never serialized (that created those tracked structs)

src/database.rs Outdated
let DeserializeIngredients(zalsa) = self;

while let Some(name) = access.next_key::<&str>()? {
// TODO: Serialize the ingredient index directly.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A separate (maybe even optinal) field would make sense to me.

Comment on lines 324 to 327
<dyn salsa::Database>::deserialize(
&mut db,
&mut serde_json::Deserializer::from_str(&serialized),
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does deserialize bump the revision to 2 to force a validation of all queries?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm no, should it?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there are a few things to consider here

  • memos and other ingredients that weren't verified in the last revision (current when deserializing) still need to be dirty. The easiest here might be to only serialize ingredients that are verified in the current revision and we revisit this later
  • we need to bump the revision if any input changes during deserialization. Now, we don't support this yet so it's fine to ignore for now.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

memos and other ingredients that weren't verified in the last revision (current when deserializing) still need to be dirty. The easiest here might be to only serialize ingredients that are verified in the current revision and we revisit this later

We serialize and deserialize the current revision of the database, so I don't think this is a problem. Any ingredients that weren't verified in the last revision will have to be verified after deserialization.

@ibraheemdev ibraheemdev force-pushed the ibraheem/persistent-caching branch from d30ee9d to 5e1a984 Compare August 8, 2025 21:06
@ibraheemdev ibraheemdev force-pushed the ibraheem/persistent-caching branch from 7c469e0 to 03377ef Compare August 8, 2025 21:15
@ibraheemdev
Copy link
Member Author

I gated everything under a persistence feature. We should rework CI to use something like cargo-hack to test all the new feature permutations, but I'll do that in a separate PR.

@ibraheemdev ibraheemdev force-pushed the ibraheem/persistent-caching branch from 206d738 to 76a794b Compare August 8, 2025 22:35
@ibraheemdev ibraheemdev force-pushed the ibraheem/persistent-caching branch from 76a794b to b7e5658 Compare August 8, 2025 22:36
@ibraheemdev ibraheemdev force-pushed the ibraheem/persistent-caching branch from fecc37d to 4103798 Compare August 9, 2025 02:40
@ibraheemdev ibraheemdev added this pull request to the merge queue Aug 11, 2025
Merged via the queue into salsa-rs:master with commit 1ffb32f Aug 11, 2025
12 checks passed
@ibraheemdev ibraheemdev deleted the ibraheem/persistent-caching branch August 11, 2025 19:12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants