Randomize order of inputs from `OutputSweeper` #4033

benthecarman · 2025-08-29T00:01:38Z

For a marginal increase in privacy, we can randomize the inputs from the OutputSweeper. Since we don't depend on rand to randomize the order I just put the elements into a hashset, then back into a Vec. This should give us enough randomness without having to introduce a new dep or make the OutputSweeper depend on the EntropySource.

For a marginal increase in privacy, we can randomize the inputs from the `OutputSweeper`. Since we don't depend on `rand` to randomize the order I just put the elements into a hashset, then back into a Vec. This should give us enough randomness without having to introduce a new dep or make the `OutputSweeper` depend on the `EntropySource`.

ldk-reviews-bot · 2025-08-29T00:01:40Z

I've assigned @tnull as a reviewer!
I'll wait for their review and will help manage the review process.
Once they submit their review, I'll check if a second reviewer would be helpful.

codecov · 2025-08-29T00:22:20Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 88.74%. Comparing base (a9bbb24) to head (8a84ed0).
⚠️ Report is 40 commits behind head on main.

Additional details and impacted files

@@           Coverage Diff           @@
##             main    #4033   +/-   ##
=======================================
  Coverage   88.74%   88.74%           
=======================================
  Files         176      176           
  Lines      128845   128847    +2     
  Branches   128845   128847    +2     
=======================================
+ Hits       114340   114342    +2     
+ Misses      11912    11909    -3     
- Partials     2593     2596    +3

Flag	Coverage Δ
fuzzing	`21.93% <0.00%> (-0.01%)`	⬇️
tests	`88.57% <100.00%> (+<0.01%)`	⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

vincenzopalazzo

LGMT

shaavan

I was digging into the difference between using EntropySource-based shuffling versus the current Vec → HashSet → Vec approach, and it seems like the latter might not be the best fit for what we’re aiming to do.

On Randomness:

It’s not truly random. There’s no guarantee that all permutations are equally likely, which matters if the goal is to introduce meaningful entropy (for example, for privacy).
Since the order comes from internal hashing behavior, there’s no way to seed or reproduce the order for testing.
It replaces a well-understood and tested shuffling algorithm with a non-deterministic, opaque alternative. This might behave unpredictably across platforms or versions, which could potentially introduce subtle bugs over time.

If randomness does matter here, maybe it’s worth considering introducing EntropySource into OutputSweeper as a longer-term solution.

On Deduplication:

Also wondering: is it worth introducing an additional allocation (via HashSet) just to remove duplicates? If duplicates are actually a problem, maybe we can address that more explicitly, especially considering the relatively small size of the input in most cases.

Happy to discuss further if I’m missing context. Just wanted to flag these points as they came up while reviewing.

ldk-reviews-bot · 2025-09-01T00:00:31Z

🔔 1st Reminder

Hey @tnull! This PR has been waiting for your review.
Please take a look when you have a chance. If you're unable to review, please let us know so we can find another reviewer.

tnull

I agree with previous reviewers that we should probably just take on an EntropySource argument to OutputSweeper, especially since we're gonna need it when doing 'proper' batch randomization going forward.

TheBlueMatt · 2025-09-01T19:21:12Z

I disagree here, I think using the default HashMap randomization is perfectly acceptable for randomizing order (when its not critical). It may not be perfect to pay the allocation tax, though.

To correct the original misconceptions:

It’s not truly random. There’s no guarantee that all permutations are equally likely, which matters if the goal is to introduce meaningful entropy (for example, for privacy).

It may not be guaranteed, no, but it is in practice. In practice the values are SipHash-1-3 of the values, seeded with an actual random input from getrandom. That should make everything equally likely, as far as we're concerned.

especially since we're gonna need it when doing 'proper' batch randomization going forward.

Wait, what is "proper" batch randomization?

tnull · 2025-09-02T07:39:53Z

Wait, what is "proper" batch randomization?

Currently, OutputSweeper in each broadcast round still attempts to batch all tracked outputs, which has some potential for all sweeps getting stuck if there's at least one unspendable output in there. I still want to implement randomized subset sampling, possibly tracking when which output spend was attempted last, to over time guarantee we'd eventually spend all spendable outputs, even if there are some tracked that are currently unspendable.

TheBlueMatt · 2025-09-02T13:55:21Z

I don't see why that needs a "real" RNG either? Just a counter and hash would suffice for that (given the behavior is, first, "include all", then do random inclusion sets).

tnull · 2025-09-04T11:43:21Z

I don't see why that needs a "real" RNG either? Just a counter and hash would suffice for that (given the behavior is, first, "include all", then do random inclusion sets).

Yeah, we don't necessarily need a 'real' RNG for that, that's true.

tnull

Gonna move forward with this, we can still improve things further when revisiting batch randomization.

ldk-reviews-bot requested a review from tnull August 29, 2025 00:11

vincenzopalazzo approved these changes Aug 29, 2025

View reviewed changes

shaavan reviewed Aug 30, 2025

View reviewed changes

tnull reviewed Sep 1, 2025

View reviewed changes

tnull approved these changes Sep 4, 2025

View reviewed changes

tnull merged commit ec54ea0 into lightningdevkit:main Sep 4, 2025
25 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Randomize order of inputs from `OutputSweeper` #4033

Randomize order of inputs from `OutputSweeper` #4033

benthecarman commented Aug 29, 2025

Uh oh!

ldk-reviews-bot commented Aug 29, 2025 •

edited

Loading

Uh oh!

codecov bot commented Aug 29, 2025 •

edited

Loading

Uh oh!

vincenzopalazzo left a comment

Uh oh!

shaavan left a comment

Uh oh!

ldk-reviews-bot commented Sep 1, 2025

Uh oh!

tnull left a comment •

edited

Loading

Uh oh!

TheBlueMatt commented Sep 1, 2025 •

edited

Loading

Uh oh!

tnull commented Sep 2, 2025

Uh oh!

TheBlueMatt commented Sep 2, 2025

Uh oh!

tnull commented Sep 4, 2025

Uh oh!

tnull left a comment

Uh oh!

Uh oh!

Uh oh!

Randomize order of inputs from OutputSweeper #4033

Randomize order of inputs from OutputSweeper #4033

Conversation

benthecarman commented Aug 29, 2025

Uh oh!

ldk-reviews-bot commented Aug 29, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

codecov bot commented Aug 29, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

vincenzopalazzo left a comment

Choose a reason for hiding this comment

Uh oh!

shaavan left a comment

Choose a reason for hiding this comment

Uh oh!

ldk-reviews-bot commented Sep 1, 2025

Uh oh!

tnull left a comment • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

TheBlueMatt commented Sep 1, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

tnull commented Sep 2, 2025

Uh oh!

TheBlueMatt commented Sep 2, 2025

Uh oh!

tnull commented Sep 4, 2025

Uh oh!

tnull left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Randomize order of inputs from `OutputSweeper` #4033

Randomize order of inputs from `OutputSweeper` #4033

ldk-reviews-bot commented Aug 29, 2025 •

edited

Loading

codecov bot commented Aug 29, 2025 •

edited

Loading

tnull left a comment •

edited

Loading

TheBlueMatt commented Sep 1, 2025 •

edited

Loading