Conversation

hanabi1224 (Contributor)

Benchmarks

➜  hamt git:(master) cargo bench
...
HAMT bulk insert (no flush)
                        time:   [7.0920 µs 7.1164 µs 7.1423 µs]
Found 7 outliers among 100 measurements (7.00%)
  1 (1.00%) low mild
  6 (6.00%) high mild

HAMT bulk insert with flushing and loading
                        time:   [2.0349 ms 2.0411 ms 2.0475 ms]
Found 2 outliers among 100 measurements (2.00%)
  2 (2.00%) high mild

HAMT deleting all nodes time:   [106.66 µs 107.20 µs 107.73 µs]
Found 6 outliers among 100 measurements (6.00%)
  5 (5.00%) high mild
  1 (1.00%) high severe

HAMT for_each function  time:   [102.49 µs 102.76 µs 103.08 µs]
Found 4 outliers among 100 measurements (4.00%)
  1 (1.00%) low mild
  2 (2.00%) high mild
  1 (1.00%) high severe

➜  hamt git:(hm/hamt-cacheless-iter) cargo bench
...
HAMT bulk insert (no flush)
                        time:   [7.0839 µs 7.1202 µs 7.1733 µs]
                        change: [-0.3421% +0.2993% +1.0198%] (p = 0.42 > 0.05)
                        No change in performance detected.
Found 16 outliers among 100 measurements (16.00%)
  2 (2.00%) low severe
  7 (7.00%) low mild
  6 (6.00%) high mild
  1 (1.00%) high severe

HAMT bulk insert with flushing and loading
                        time:   [1.9895 ms 1.9947 ms 2.0004 ms]
                        change: [-2.6603% -2.2702% -1.8510%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 6 outliers among 100 measurements (6.00%)
  4 (4.00%) high mild
  2 (2.00%) high severe

HAMT deleting all nodes time:   [104.54 µs 104.86 µs 105.21 µs]
                        change: [-1.9633% -1.4104% -0.8534%] (p = 0.00 < 0.05)
                        Change within noise threshold.
Found 3 outliers among 100 measurements (3.00%)
  3 (3.00%) high severe

HAMT for_each function  time:   [100.01 µs 100.33 µs 100.69 µs]
                        change: [-2.8585% -2.3226% -1.7285%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 2 outliers among 100 measurements (2.00%)
  1 (1.00%) high mild
  1 (1.00%) high severe

codecov-commenter commented Sep 2, 2025

Codecov Report

❌ Patch coverage is 88.69565% with 13 lines in your changes missing coverage. Please review.
✅ Project coverage is 77.61%. Comparing base (fe4d5c1) to head (e5c9ea9).

| Files with missing lines | Patch % | Lines |
|---|---|---|
| ipld/hamt/src/iter.rs | 87.09% | 12 Missing ⚠️ |
| ipld/hamt/src/pointer.rs | 88.88% | 1 Missing ⚠️ |
Additional details and impacted files


@@            Coverage Diff             @@
##           master    #2215      +/-   ##
==========================================
+ Coverage   77.56%   77.61%   +0.04%     
==========================================
  Files         147      147              
  Lines       15789    15872      +83     
==========================================
+ Hits        12247    12319      +72     
- Misses       3542     3553      +11     
| Files with missing lines | Coverage Δ |
|---|---|
| ipld/hamt/src/hamt.rs | 97.16% <100.00%> (+0.01%) ⬆️ |
| ipld/hamt/src/lib.rs | 100.00% <ø> (ø) |
| ipld/hamt/src/node.rs | 91.46% <100.00%> (+0.16%) ⬆️ |
| ipld/hamt/src/pointer.rs | 84.49% <88.88%> (-0.51%) ⬇️ |
| ipld/hamt/src/iter.rs | 89.31% <87.09%> (-3.00%) ⬇️ |

hanabi1224 force-pushed the hm/hamt-cacheless-iter branch from 494060d to dad03d3 on September 2, 2025 at 03:20
@@ -1,7 +1,7 @@
 [package]
 name = "fvm_ipld_hamt"
 description = "Sharded IPLD HashMap implementation."
-version = "0.10.4"
+version = "0.11.0"
A Member commented on this diff:
do this as a separate PR

rvagg (Member) commented Sep 2, 2025

"bulk insert" - how big are we talking here? if you make it huge you ought to see a larger difference in the iteration I think, although this really does come down to memory retention and management, so measuring that would be more interesting if that were at all possible

rvagg (Member) commented Sep 2, 2025

It'd be better if we didn't go breaking the existing API. Can you not fairly easily take a similar approach to #2189 and add a new iteration method for this?

hanabi1224 (Contributor, Author) commented Sep 2, 2025

> It'd be better if we didn't go breaking the existing API. Can you not fairly easily take a similar approach to #2189 and add a new iteration method for this?

Hey @rvagg, thanks for your quick review! I will open another PR implementing fn for_each_cacheless to avoid making breaking API changes. I will keep this PR as a draft, since this approach makes Forest integration much easier, with minimal (almost drop-in) changes.

I'd like to discuss whether we plan to adopt this PR in the next minor version release and then deprecate fn for_each_cacheless (and likewise in fvm_ipld_amt) once the performance benefits are well tested and confirmed. Adopting the new fn for_each_cacheless API in Forest requires a lot of changes, and the current cache-aware fn for_each does not seem beneficial, to my understanding. cc @LesnyRumcajs
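
For readers following the thread, here is a self-contained toy sketch of the trade-off being discussed. None of this is the fvm_ipld_hamt implementation; the types and names are invented stand-ins. The point is only that a cached traversal retains every node it loads, while a cacheless one drops each subtree as soon as it has been visited:

```rust
use std::cell::RefCell;
use std::collections::HashMap;

type Cid = u64; // stand-in for a real content identifier

#[derive(Clone)]
enum Node {
    Leaf(String, i64), // key, value
    Branch(Vec<Cid>),  // links to child nodes
}

// Stand-in for a blockstore keyed by CID.
struct Store(HashMap<Cid, Node>);

impl Store {
    fn get(&self, cid: Cid) -> Node {
        self.0[&cid].clone()
    }
}

// Cacheless: each child is loaded, visited, and dropped in turn,
// so peak memory tracks tree depth, not tree size.
fn for_each_cacheless(store: &Store, root: Cid, f: &mut impl FnMut(&str, i64)) {
    match store.get(root) {
        Node::Leaf(k, v) => f(&k, v),
        Node::Branch(children) => {
            for child in children {
                for_each_cacheless(store, child, &mut *f);
            }
        }
    }
}

// Cached: loaded nodes are memoized, which helps later random access
// but retains the whole visited tree during iterate-once workloads.
fn for_each_cached(
    store: &Store,
    cache: &RefCell<HashMap<Cid, Node>>,
    root: Cid,
    f: &mut impl FnMut(&str, i64),
) {
    let node = cache
        .borrow_mut()
        .entry(root)
        .or_insert_with(|| store.get(root))
        .clone();
    match node {
        Node::Leaf(k, v) => f(&k, v),
        Node::Branch(children) => {
            for child in children {
                for_each_cached(store, cache, child, &mut *f);
            }
        }
    }
}

fn main() {
    let mut blocks = HashMap::new();
    blocks.insert(1, Node::Branch(vec![2, 3]));
    blocks.insert(2, Node::Leaf("a".to_string(), 1));
    blocks.insert(3, Node::Leaf("b".to_string(), 2));
    let store = Store(blocks);

    let mut sum = 0;
    for_each_cacheless(&store, 1, &mut |_k, v| sum += v);
    assert_eq!(sum, 3);

    let cache = RefCell::new(HashMap::new());
    let mut sum = 0;
    for_each_cached(&store, &cache, 1, &mut |_k, v| sum += v);
    assert_eq!(sum, 3);
    assert_eq!(cache.borrow().len(), 3); // entire visited tree retained
}
```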

LesnyRumcajs (Contributor)

> Adopting the new fn for_each_cacheless API in Forest requires a lot of changes, and the current cache-aware fn for_each does not seem beneficial, to my understanding

It might not be beneficial in the current Forest logic, but some users might depend on the caching for their use cases, so this is a breaking change. Refer to Hyrum's Law.

We might think about switching the backend to a cacheless one eventually, but we'd need to ensure the benefits significantly outweigh the potential risks. I'm especially concerned about potential performance degradation in builtin-actors on mainnet (though other consumers of the fvm_ipld_* crates are also important).

rvagg (Member) commented Sep 3, 2025

This is going to have to be case-by-case, which is why it would be nice to have an opt-in form of this. I did originally imagine a constructor option that would give you a HAMT that either cached or didn't, but this is much more a question of idiomatic Rust and Rust ergonomics, which I'm not necessarily the best person to give advice on.

But I do want to do this for Go too, and I know its utility is mainly in certain areas, particularly where we know we're iterating over large data and we know we're only iterating. Migrations are the big one, for Go at least: we iterate through actors, we don't need random access, and we drop the structure on the floor when we're done, so the cache just gets in the way. I'm sure there are a number of APIs we serve where it's pure, straight iteration and this would help too. But in the case where you instantiate one of these things and pass it around for general use (get, set, iterate), a cache could be quite helpful.
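
A minimal sketch of what that opt-in could look like at construction time. All of the names below are invented for illustration; the real crate's configuration surface may differ:

```rust
/// Hypothetical policy choice: keep today's node cache, or drop nodes
/// once visited (suited to iterate-once workloads such as migrations).
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum CachePolicy {
    Cached,
    Cacheless,
}

/// Hypothetical construction-time configuration.
pub struct Config {
    pub cache_policy: CachePolicy,
    pub bit_width: u32,
}

impl Default for Config {
    fn default() -> Self {
        // Defaulting to Cached keeps existing behaviour, so nothing breaks.
        Config { cache_policy: CachePolicy::Cached, bit_width: 8 }
    }
}

fn main() {
    let general_use = Config::default();
    let migration = Config { cache_policy: CachePolicy::Cacheless, ..Config::default() };
    assert_eq!(general_use.cache_policy, CachePolicy::Cached);
    assert_eq!(migration.cache_policy, CachePolicy::Cacheless);
}
```

This shape would leave the existing for_each signature untouched while letting iterate-only callers opt out of retention up front.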

hanabi1224 changed the title from "feat: cacheless hamt iteration" to "[Testing Only] cacheless hamt iteration" on Sep 4, 2025